# OmniCoder-9B CoreGen HDLFix v2 Merged BF16
This repository contains the fully merged BF16 checkpoint for a local fine-tune of `armand0e/OmniCoder-9B-Claude-Opus-High-Reasoning-Distill`, targeted at practical coding work with stronger HDL and embedded behavior.
The run behind this release is `omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1`. It was built to fix the main issues seen in an earlier short-run adapter: too little effective dataset coverage, visible reasoning markup leaking into supervision, and weak coverage of bus and peripheral tasks such as Wishbone and SPI.
## What Changed Relative to the Earlier Core-Generalist Run
- The training run was changed from a short fixed-step pilot to a full epoch over the cleaned `coregen_hdlfix_v2` prepared set.
- Visible `<think>` and `<answer>` scaffolding was stripped out of supervision.
- HDL prompt-output leakage from `verilog-instruct`-style data was excluded from the user side of the conversation format.
- HDL coverage was expanded with focused sources such as `HDL-Instruct`, `expanded_rtlcoder`, `RTLLM`, and `verilogeval-v2-spec-to-rtl`.
- Bus and peripheral sampling was deliberately increased. In the prepared train split used for this run, the relaunch notes recorded about 19 `wishbone` mentions and about 78 HDL-side `spi` mentions.
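A minimal sketch of the kind of scaffolding cleanup described above, assuming plain `<think>`/`<answer>` tags in the raw supervision text (illustrative only; the actual pipeline code is not published with this release):

```python
import re

def strip_reasoning_scaffolding(text: str) -> str:
    """Remove visible reasoning markup from a supervision target."""
    # Drop any <think>...</think> block entirely so reasoning traces
    # are never supervised directly.
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Unwrap <answer>...</answer> so only the answer body remains.
    text = re.sub(r"<answer>(.*?)</answer>", r"\1", text, flags=re.DOTALL)
    return text.strip()

cleaned = strip_reasoning_scaffolding(
    "<think>pick a shift register width</think>"
    "<answer>module top; endmodule</answer>"
)
```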
## Intended Use
This model is intended as a practical coding assistant with emphasis on:
- RTL and HDL generation, explanation, and review
- bus and peripheral oriented Verilog tasks
- general code generation and patching
- code-review-style feedback
- embedded and firmware-adjacent tasks
- basic tool-use and math-heavy coding prompts
It is a better fit for code-centric workflows than for general open-domain chat.
## Training Data Summary
The merged model comes from a LoRA fine-tune on the prepared dataset `prepared_omnicoder_mixed_data_coregen_hdlfix_v2`.
- Train examples: 3120
- Eval examples: 428
- Sequence length: 1024
The prepared mix included these retained train and eval counts:
| Family | Train | Eval |
|---|---|---|
| HDL-Instruct including bus-focused sampling | 544 | 72 |
| expanded_rtlcoder including bus-focused sampling | 320 | 40 |
| CodeV-R1 | 320 | 40 |
| CodeV-SVA | 192 | 24 |
| verilogeval spec-to-RTL | 120 | 16 |
| RTLLM | 40 | 8 |
| codefeedback_instruction | 256 | 32 |
| code_feedback | 224 | 32 |
| commitpackft | 224 | 32 |
| github_codereview | 160 | 24 |
| github_code | 128 | 16 |
| stm32_hal | 160 | 24 |
| electronics_stackexchange | 96 | 16 |
| arduino_stackexchange | 64 | 12 |
| iot_stackexchange | 32 | 8 |
| qwen_toolcalling | 96 | 12 |
| toolscale | 48 | 8 |
| nemotron_math | 96 | 12 |
This is still a curated pilot-scale mix, not a frontier-scale training corpus. The purpose of the run was to harden the local 9B pipeline and correct the earlier HDL coverage failure, not to present a final benchmarked production release.
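As a sanity check, the per-family counts in the table above sum exactly to the reported split totals:

```python
# Per-family retained train counts from the table above.
train_counts = {
    "HDL-Instruct": 544, "expanded_rtlcoder": 320, "CodeV-R1": 320,
    "CodeV-SVA": 192, "verilogeval spec-to-RTL": 120, "RTLLM": 40,
    "codefeedback_instruction": 256, "code_feedback": 224, "commitpackft": 224,
    "github_codereview": 160, "github_code": 128, "stm32_hal": 160,
    "electronics_stackexchange": 96, "arduino_stackexchange": 64,
    "iot_stackexchange": 32, "qwen_toolcalling": 96, "toolscale": 48,
    "nemotron_math": 96,
}
# Eval counts, in the same row order as the table.
eval_counts = [72, 40, 40, 24, 16, 8, 32, 32, 32, 24, 16, 24, 16, 12, 8, 12, 8, 12]

train_total = sum(train_counts.values())  # 3120, matching "Train examples"
eval_total = sum(eval_counts)             # 428, matching "Eval examples"
```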
## Training Configuration
- Base model: `armand0e/OmniCoder-9B-Claude-Opus-High-Reasoning-Distill`
- Training engine: `hf`
- Device: `cuda`
- LoRA rank: 64
- Training dtype: `torch.bfloat16`
- Base load mode: 4-bit
- Learning rate: 7e-5
- Gradient accumulation: 2
- Train epochs: 1.0
- Optimizer update steps: 1560
- Batch size per device: 1
- Trainable parameters: 116,391,936
- Total parameters reported by the training summary: 5,841,364,208
- Final eval loss: 0.6533
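The reported optimizer step count is consistent with the other hyperparameters; a quick check, assuming drop-free sequential batching over the 3120 train examples:

```python
import math

train_examples = 3120
per_device_batch = 1
grad_accum = 2
epochs = 1.0

# One optimizer update consumes per_device_batch * grad_accum examples.
steps_per_epoch = math.ceil(train_examples / (per_device_batch * grad_accum))
total_steps = int(steps_per_epoch * epochs)  # 1560, matching the summary
```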
The merged release was exported as BF16 safetensors on GPU and written into four shards.
## Files in This Repository
- `model-00001-of-00004.safetensors` - about 4.603 GB
- `model-00002-of-00004.safetensors` - about 4.645 GB
- `model-00003-of-00004.safetensors` - about 4.615 GB
- `model-00004-of-00004.safetensors` - about 3.664 GB
- `model.safetensors.index.json`
- tokenizer, processor, config, and chat template files
- `merge_summary.json`
## How to Use
This checkpoint follows the OmniCoder / Qwen3.5 multimodal stack and should be loaded with the same Transformers classes as the base model family.
```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "tianrui6641/omnicoder_local9b_blackwell_coregen_hdlfix_v2_hf_r64_epoch1-merged-bf16"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Write a Wishbone-attached SPI controller in Verilog and explain the register map."}
        ],
    }
]

prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=prompt, return_tensors="pt")
inputs = {key: value.to(model.device) for key, value in inputs.items()}

output = model.generate(**inputs, max_new_tokens=512)
# Slice off the prompt tokens so only the newly generated text is decoded.
generated = output[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```
For local inference via GGUF or LM Studio, use the separate GGUF release.
## Limitations
- The model has not yet been validated against a full external benchmark harness for HDL, embedded development, code review, tool use, and math.
- The main numeric result available for this run is held-out eval loss on the curated internal mix, not a broad public benchmark suite.
- Although HDL behavior was explicitly improved, this is still not a replacement for verification flows, synthesis, linting, or formal checking.
- The model can still hallucinate APIs, bus semantics, timing assumptions, or reset behavior.
- Multimodal behavior is inherited from the base OmniCoder family, but this particular release was optimized primarily for coding tasks rather than general VLM use.
## Recommended Usage Pattern
- Use this merged BF16 repo if you want to continue fine-tuning, convert again, or run the original Transformers-format checkpoint.
- Use the separate GGUF repo if you want a local inference package for LM Studio or `llama.cpp`-style runtimes.
- Prefer verification-oriented workflows for non-trivial RTL work: ask for testbenches, assertions, and interface assumptions explicitly.
## Release Notes
This release is part of the local 9B hardening path for the TAIS-Coder mini project. The goal is not to mimic the closed Copilot Raptor stack exactly, but to produce an open practical coding assistant with stronger HDL, embedded, review, and tool-use behavior on local hardware.
The base model tags indicate Apache-2.0 licensing, and this release is intended to inherit the base model's applicable license terms.