OmniCoder-9B CoreGen HDLFix v2 GGUF

This repository contains the GGUF export set for the merged BF16 release of omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1.

It is the deployment-oriented companion to the Transformers-format merged checkpoint. The artifacts here were generated from the local merged BF16 model and are intended for LM Studio and llama.cpp-style runtimes.

Included Files

All main model GGUF files in this repo depend on the included mmproj file for multimodal use.

| File | Approx. size | Notes |
| --- | --- | --- |
| mmproj-omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.f16.gguf | 0.855 GB | multimodal projector; required for image-aware use |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.bf16.gguf | 16.690 GB | highest-fidelity GGUF export |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q8_0.gguf | 8.873 GB | highest standard integer quant in this release |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q6_K.gguf | 6.854 GB | good quality / size tradeoff |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q5_K_M.gguf | 6.024 GB | smaller deployment option |
| omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.MXFP4.gguf | 4.947 GB | experimental dense MXFP4 export |

The repo also includes gguf_quantize_summary.json, which records the exact local conversion and quantization paths used for this artifact set.
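The summary file is plain JSON, so it can be inspected with standard tooling. A minimal sketch (the key names inside the file are not documented here, so this simply pretty-prints whatever it finds):

```python
import json
from pathlib import Path


def print_quantize_summary(path: str) -> dict:
    """Load gguf_quantize_summary.json and pretty-print its contents.

    The schema is not specified in this card, so no keys are assumed.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    print(json.dumps(data, indent=2, sort_keys=True))
    return data
```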

Quantization Notes

  • Q8_0, Q6_K, and Q5_K_M were exported through the standard Windows GGUF pipeline.
  • The dense MXFP4 GGUF is experimental and was produced using a patched local llama.cpp build rather than the stock release binary.
  • For this model family, llama-quantize emitted repeated position_embd and token_types formatting warnings during quantization; the process nonetheless completed and produced the dense MXFP4 artifact.
  • The dense MXFP4 file has been validated as loadable in LM Studio on this machine.
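For reference, the standard exports above correspond to invocations of llama.cpp's llama-quantize tool. A sketch, using the BF16 file from this repo as input (the binary path is an assumption about your local build; the quant type argument is the same token used in the filenames):

```shell
# Quantize the BF16 GGUF to Q6_K with llama.cpp's llama-quantize.
# Adjust the binary path to wherever your llama.cpp build lives.
./llama-quantize \
  omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.bf16.gguf \
  omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q6_K.gguf \
  Q6_K
```

The MXFP4 export is the exception: as noted above, it required a patched local llama.cpp build, so the stock binary may not reproduce it.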

Recommended Artifact Choice

  • Use Q6_K as the default starting point if you want a balanced quality / size local deployment.
  • Use Q8_0 if you want the strongest standard quantized option and can afford the memory.
  • Use Q5_K_M if you need a smaller local footprint.
  • Use MXFP4 only if you specifically want to test the experimental dense MXFP4 path and your runtime supports it.
  • Use BF16 if you want the least quantization loss and have enough memory for the full file.
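The guidance above can be condensed into a small helper that picks an artifact given a memory budget. The sizes come from the file table in this card; the selection rule (largest file that fits, optionally adding the mmproj projector) is a simple assumption for illustration, not an official recommendation:

```python
# Artifact file sizes in GB, taken from this repo's file table, largest first.
ARTIFACTS = [
    ("bf16", 16.690),
    ("Q8_0", 8.873),
    ("Q6_K", 6.854),
    ("Q5_K_M", 6.024),
    ("MXFP4", 4.947),  # experimental; requires runtime support
]

MMPROJ_GB = 0.855  # multimodal projector, needed for image-aware use


def pick_artifact(budget_gb: float, multimodal: bool = False):
    """Return the largest artifact whose file (plus the mmproj file, when
    multimodal use is wanted) fits within budget_gb, or None if none fit."""
    overhead = MMPROJ_GB if multimodal else 0.0
    for name, size_gb in ARTIFACTS:
        if size_gb + overhead <= budget_gb:
            return name
    return None
```

Note that file size is only a floor on memory use; KV cache and runtime overhead come on top, so a real budget should leave headroom.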

Runtime Notes

This is a multimodal OmniCoder-family export. For image-aware usage, pair any main model file with the included mmproj file.

In LM Studio, load the model GGUF together with the matching projector when multimodal features are needed.

In llama.cpp-style runtimes, the exact multimodal command depends on the frontend you use, but the practical rule is the same: the main GGUF and the mmproj GGUF belong together.
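As one concrete sketch, recent llama.cpp builds ship a multimodal CLI that accepts both files directly (the binary name and flags match current upstream builds; older builds used per-model CLIs, so check your version):

```shell
# Pair the main model GGUF with the mmproj projector for image-aware use.
# input.png and the prompt are placeholders.
./llama-mtmd-cli \
  -m omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.Q6_K.gguf \
  --mmproj mmproj-omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1_merged_bf16.f16.gguf \
  --image input.png \
  -p "Describe this image."
```

For text-only use, the main GGUF alone is sufficient and the projector can be omitted.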

Relationship to the Training Run

These GGUF files come from the merged BF16 release of a local fine-tune based on armand0e/OmniCoder-9B-Claude-Opus-High-Reasoning-Distill.

The underlying run focused on correcting earlier weaknesses in HDL behavior by:

  • training across a full effective epoch instead of a short pilot run
  • stripping visible reasoning markup from supervision
  • removing leaked HDL target text from user prompts
  • increasing HDL and bus/peripheral coverage in the training mix

The prepared dataset used for the run contained 3120 train examples and 428 eval examples, with explicit emphasis on HDL, code review, embedded programming, tool use, and math-heavy coding prompts.

Limitations

  • This GGUF release inherits the limitations of the underlying merged model.
  • It is intended for practical local inference, not as a guarantee of correctness for RTL signoff, firmware safety, or hardware interface behavior.
  • The dense MXFP4 artifact is experimental and depends on a patched local llama.cpp path rather than a stock upstream release.
  • A broad external benchmark suite for this run is still missing; the main validated numeric signal remains the internal held-out eval loss of the original training run.

Companion Release

If you want the original Transformers-format merged checkpoint for further conversion or continued experimentation, use the separate merged BF16 repo:

  • tianrui6641/omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1-merged-bf16

This GGUF repo is the deployment-oriented companion to that release.
