open-llama-3b-opus-reasoning-sft-lora-2k

LoRA adapter for openlm-research/open_llama_3b_v2.

Training

Base: open_llama_3b_v2 (4-bit NF4)
Method: QLoRA (r=32, alpha=32)
Data: Crownelius/Opus-4.6-Reasoning-3300x + Roman1111111/claude-opus-4.6-10000x
Samples: 11,669 (filtered <= 2024 tokens)
Epochs: 1
Batch size: 8
LR: 2e-4 (constant)
Training: assistant-only (system/user turns masked)
Special tokens: Qwen3.5 chat template (<|im_start|>, <|im_end|>, <think>, </think>, etc.)
modules_to_save: embed_tokens, lm_head
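
The "assistant-only" setting above means that loss is computed only on assistant tokens, while system and user tokens are excluded. The card does not say how this was implemented; the following is a minimal sketch of one common approach, assuming labels of -100 (the default `ignore_index` of PyTorch's cross-entropy loss) on masked positions. The helper name `mask_non_assistant`, the span format, and the token ids are hypothetical and chosen only for illustration.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_non_assistant(input_ids, assistant_spans):
    """Build labels that keep assistant tokens and mask everything else.

    input_ids: list of token ids for the full conversation.
    assistant_spans: list of (start, end) half-open index ranges that cover
        assistant tokens (hypothetical input format for this sketch).
    """
    labels = [IGNORE_INDEX] * len(input_ids)
    for start, end in assistant_spans:
        # Copy the real token ids back in only where the assistant is speaking.
        labels[start:end] = input_ids[start:end]
    return labels


# Toy example: positions 0-4 are system/user tokens, 5-7 are assistant tokens.
input_ids = [11, 12, 13, 21, 22, 31, 32, 33]
labels = mask_non_assistant(input_ids, [(5, 8)])
print(labels)
```

With labels built this way, the model still attends to the system and user turns as context, but gradients come only from the assistant's reasoning and answer.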

Format

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
<think>
{reasoning}
</think>

{answer}<|im_end|>
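
As an illustration of the format above, here is a small sketch that builds a prompt string and splits a generated completion into its reasoning and answer parts. The function names `build_prompt` and `split_response` are hypothetical helpers, not part of this repository, and the sketch assumes the model emits the `<think>` block itself after the assistant header.

```python
import re

DEFAULT_SYSTEM = "You are a helpful assistant."


def build_prompt(question, system=DEFAULT_SYSTEM):
    """Render a single-turn prompt that ends at the start of the assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def split_response(text):
    """Split a completion into (reasoning, answer).

    Everything after the first <|im_end|> is dropped. If no <think> block is
    found, the reasoning is returned as an empty string.
    """
    text = text.split("<|im_end|>", 1)[0]
    match = re.search(r"<think>\n?(.*?)\n?</think>\s*(.*)", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), match.group(2).strip()


prompt = build_prompt("What is 2+2?")
reasoning, answer = split_response("<think>\nadd\n</think>\n\n4<|im_end|>")
```

When generating, stopping on `<|im_end|>` keeps the model from running past the end of the assistant turn; `split_response` also trims anything after that token as a fallback.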

Merged model

See ping98k/open-llama-3b-opus-reasoning-sft-2k-4bit for the merged 4-bit model.
