OmniCoder-9B CoreGen HDLFix v2 GGUF

This repository contains the GGUF export set for the merged BF16 release of omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1.

It is the deployment-oriented companion to the Transformers-format merged checkpoint. The artifacts here were generated from the local merged BF16 model and are intended for LM Studio and llama.cpp-style runtimes.

Included Files

All main model GGUF files in this repo depend on the included mmproj file for multimodal use.

| File | Approx. size | Notes |
| --- | --- | --- |
| mmproj-omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.f16.gguf | 0.855 GB | multimodal projector; required for image-aware use |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.bf16.gguf | 16.690 GB | highest-fidelity GGUF export |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q8_0.gguf | 8.873 GB | highest standard integer quant in this release |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q6_K.gguf | 6.854 GB | good quality / size tradeoff |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q5_K_M.gguf | 6.024 GB | smaller deployment option |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.MXFP4.gguf | 4.947 GB | experimental dense MXFP4 export |

The repo also includes gguf_quantize_summary.json, which records the exact local conversion and quantization paths used for this artifact set.
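The summary file is plain JSON, so it can be inspected with standard tooling. A minimal sketch (the key names inside the file are not documented here, so this simply pretty-prints whatever it finds):

```python
import json
from pathlib import Path


def print_quantize_summary(path: str) -> dict:
    """Load gguf_quantize_summary.json and pretty-print its contents.

    The schema is not specified in this card, so no keys are assumed.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    print(json.dumps(data, indent=2, sort_keys=True))
    return data
```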

Quantization Notes

  • Q8_0, Q6_K, and Q5_K_M were exported through the standard Windows GGUF pipeline.
  • The dense MXFP4 GGUF is experimental and was produced using a patched local llama.cpp build rather than the stock release binary.
  • For this model family, llama-quantize emitted repeated position_embd and token_types formatting warnings during quantization; the process nonetheless completed and produced the dense MXFP4 artifact.
  • The dense MXFP4 file has been validated as loadable in LM Studio on this machine.
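For reference, the standard exports above correspond to invocations of llama.cpp's llama-quantize tool. A sketch, using the BF16 file from this repo as input (the binary path is an assumption about your local build; the quant type argument is the same token used in the filenames):

```shell
# Quantize the BF16 GGUF to Q6_K with llama.cpp's llama-quantize.
# Adjust the binary path to wherever your llama.cpp build lives.
./llama-quantize \
  omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.bf16.gguf \
  omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q6_K.gguf \
  Q6_K
```

The MXFP4 export is the exception: as noted above, it required a patched local llama.cpp build, so the stock binary may not reproduce it.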

Recommended Artifact Choice

  • Use Q6_K as the default starting point if you want a balanced quality / size local deployment.
  • Use Q8_0 if you want the strongest standard quantized option and can afford the memory.
  • Use Q5_K_M if you need a smaller local footprint.
  • Use MXFP4 only if you specifically want to test the experimental dense MXFP4 path and your runtime supports it.
  • Use BF16 if you want the least quantization loss and have enough memory for the full file.
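The guidance above can be condensed into a small helper that picks an artifact given a memory budget. The sizes come from the file table in this card; the selection rule (largest file that fits, optionally adding the mmproj projector) is a simple assumption for illustration, not an official recommendation:

```python
# Artifact file sizes in GB, taken from this repo's file table, largest first.
ARTIFACTS = [
    ("bf16", 16.690),
    ("Q8_0", 8.873),
    ("Q6_K", 6.854),
    ("Q5_K_M", 6.024),
    ("MXFP4", 4.947),  # experimental; requires runtime support
]

MMPROJ_GB = 0.855  # multimodal projector, needed for image-aware use


def pick_artifact(budget_gb: float, multimodal: bool = False):
    """Return the largest artifact whose file (plus the mmproj file, when
    multimodal use is wanted) fits within budget_gb, or None if none fit."""
    overhead = MMPROJ_GB if multimodal else 0.0
    for name, size_gb in ARTIFACTS:
        if size_gb + overhead <= budget_gb:
            return name
    return None
```

Note that file size is only a floor on memory use; KV cache and runtime overhead come on top, so a real budget should leave headroom.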

Runtime Notes

This is a multimodal OmniCoder-family export. For image-aware usage, pair any main model file with the included mmproj file.

In LM Studio, load the model GGUF together with the matching projector when multimodal features are needed.

In llama.cpp-style runtimes, the exact multimodal command depends on the frontend you use, but the practical rule is the same: the main GGUF and the mmproj GGUF belong together.
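As one concrete sketch, recent llama.cpp builds ship a multimodal CLI that accepts both files directly (the binary name and flags match current upstream builds; older builds used per-model CLIs, so check your version):

```shell
# Pair the main model GGUF with the mmproj projector for image-aware use.
# input.png and the prompt are placeholders.
./llama-mtmd-cli \
  -m omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q6_K.gguf \
  --mmproj mmproj-omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.f16.gguf \
  --image input.png \
  -p "Describe this image."
```

For text-only use, the main GGUF alone is sufficient and the projector can be omitted.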

Relationship to the Training Run

These GGUF files come from the merged BF16 release of a local fine-tune based on armand0e/OmniCoder-9B-Claude-Opus-High-Reasoning-Distill.

The underlying run focused on correcting earlier weaknesses in HDL behavior by:

  • training across a full effective epoch instead of a short pilot run
  • stripping visible reasoning markup from supervision
  • removing leaked HDL target text from user prompts
  • increasing HDL and bus/peripheral coverage in the training mix

The prepared dataset used for the run contained 3120 train examples and 428 eval examples, with explicit emphasis on HDL, code review, embedded programming, tool use, and math-heavy coding prompts.

Limitations

  • This GGUF release inherits the limitations of the underlying merged model.
  • It is intended for practical local inference, not as a guarantee of correctness for RTL signoff, firmware safety, or hardware interface behavior.
  • The dense MXFP4 artifact is experimental and depends on a patched local llama.cpp path rather than a stock upstream release.
  • A broad external benchmark suite for this run is still missing; the main validated numeric signal remains the internal held-out eval loss of the original training run.

Companion Release

If you want the original Transformers-format merged checkpoint for further conversion or continued experimentation, use the separate merged BF16 repo:

  • tianrui6641/omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1-merged-bf16

This GGUF repo is the deployment-oriented companion to that release.
