Qwen2.5-1.5B-Orca-BD3LM-SFT

Model Description

This is a Block Diffusion Language Model (BD3LM) fine-tuned on the Vietnamese Intel Orca dataset for instruction-following and question-answering tasks. The model is based on the Qwen2.5-1.5B architecture and uses the BD3LM block-diffusion approach: instead of decoding one token at a time, it generates fixed-size blocks of tokens by iterative denoising, conditioning each block on the previously generated context.
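To make the block-diffusion idea concrete, here is a toy sketch of the decoding loop: a block of masked positions is iteratively denoised, with the most confident prediction unmasked at each step, and the finished block is appended to the context before the next block starts. The function names and the confidence-based unmasking rule here are illustrative assumptions, not the dLLM internals, and the "model" is a random stand-in.

```python
# Toy illustration of block-diffusion decoding. Names and the
# confidence-based unmasking rule are illustrative, not dLLM internals.
import random

MASK = -1

def denoise_step(block, context):
    # Stand-in for the model: "predicts" a token and a confidence score
    # for every masked position. A real BD3LM runs the transformer here,
    # attending to both the context and the partially unmasked block.
    return {i: (random.randint(0, 9), random.random())
            for i, t in enumerate(block) if t == MASK}

def sample_block_diffusion(prompt, max_new_tokens=8, block_size=4, steps=4):
    out = list(prompt)
    for _ in range(max_new_tokens // block_size):
        block = [MASK] * block_size          # start from a fully masked block
        for _ in range(steps):
            preds = denoise_step(block, out)
            if not preds:
                break
            # unmask the single most confident position this step
            i, (tok, _) = max(preds.items(), key=lambda kv: kv[1][1])
            block[i] = tok
        # any still-masked slots get the model's current best guess
        for i, (tok, _) in denoise_step(block, out).items():
            block[i] = tok
        out.extend(block)                    # block becomes fixed context
    return out

print(sample_block_diffusion([1, 2, 3]))
```

This is why the `steps` and `block_size` inference parameters below matter: more denoising steps per block generally means fewer forced low-confidence commitments.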

Training Details

  • Base Model: ChaosAiVision/Qwen2.5-1.5B-ddlm-bd3lm-pretrain-5550-vi
  • Training Method: BD3LM (Block Diffusion Language Model) - SFT
  • Dataset: Vietnamese Intel Orca (5CD-AI/Vietnamese-Intel-orca_dpo_pairs-gg-translated)
  • Training Samples: 11,862 instruction-response pairs
  • Training Epochs: 3 epochs
  • Max Length: 1024 tokens
  • Block Size: 32 tokens
  • Batch Size: 2 per device × 4 gradient-accumulation steps = effective batch size of 8
  • Learning Rate: 1e-4
  • Framework: dLLM (Diffusion Language Model Library)

Model Architecture

  • Architecture: A2D-Qwen2 (Autoregressive to Diffusion) with BD3LM
  • Hidden Size: 1536
  • Num Layers: 28
  • Num Attention Heads: 12
  • Num KV Heads: 2
  • Intermediate Size: 8960
  • Vocab Size: 151,936
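A back-of-the-envelope parameter count from the numbers above helps sanity-check the model size. This sketch assumes Qwen2-style QKV biases, tied input/output embeddings, and head_dim = hidden_size / num_heads = 128; check the repository's config.json to confirm these assumptions.

```python
# Rough parameter estimate from the architecture table above (a sketch;
# assumes Qwen2-style QKV biases and tied embeddings -- verify in config.json).
hidden, layers, heads, kv_heads = 1536, 28, 12, 2
inter, vocab = 8960, 151_936
head_dim = hidden // heads                      # 128
kv_dim = kv_heads * head_dim                    # 256 (grouped-query attention)

attn = (hidden * hidden + hidden) \
     + 2 * (hidden * kv_dim + kv_dim) \
     + hidden * hidden                          # q, k, v (with bias), o
mlp = 3 * hidden * inter                        # gate, up, down projections
norms = 2 * hidden                              # two RMSNorm weights per layer

total = vocab * hidden + layers * (attn + mlp + norms) + hidden  # + final norm
print(f"{total / 1e9:.2f}B parameters")         # prints 1.54B parameters
```

The result lands around 1.5B, consistent with the Qwen2.5-1.5B base model.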

Dataset Format

The model was trained on Vietnamese instruction-following data with:

  • System prompts (system_vi): Task instructions
  • Questions (question_vi): User queries
  • Answers (chosen_vi): Expected responses
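The three fields above presumably map onto a standard chat transcript before tokenization. The sketch below shows one plausible mapping; the field names come from the list above, but the exact SFT preprocessing inside dLLM may differ.

```python
# Hedged sketch: mapping the dataset fields listed above onto chat
# messages. The exact preprocessing used during training may differ.
def to_messages(example):
    return [
        {"role": "system", "content": example["system_vi"]},
        {"role": "user", "content": example["question_vi"]},
        {"role": "assistant", "content": example["chosen_vi"]},
    ]

sample = {
    "system_vi": "Bạn là một trợ lý AI hữu ích.",       # "You are a helpful AI assistant."
    "question_vi": "Thủ đô của Việt Nam là gì?",         # "What is the capital of Vietnam?"
    "chosen_vi": "Thủ đô của Việt Nam là Hà Nội.",       # "The capital of Vietnam is Hanoi."
}
print(to_messages(sample))
```

A chat template (such as the tokenizer's `apply_chat_template` shown in Usage below) would then serialize these messages into the model's prompt format.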

Usage

import torch
import dllm
from transformers import AutoTokenizer

model_name = "ChaosAiVision/qwen2.5-1.5b-orca-bd3lm-sft-orca"

# Load model and tokenizer
model_args = type("Args", (), {"model_name_or_path": model_name})()
model = dllm.utils.get_model(model_args=model_args).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Setup BD3LM sampler
sampler = dllm.core.samplers.BD3LMSampler(model=model, tokenizer=tokenizer)
sampler_config = dllm.core.samplers.BD3LMSamplerConfig(
    steps=128,
    max_new_tokens=512,
    temperature=0.0,
    block_size=32
)

# Prepare messages
messages = [
    {"role": "system", "content": "Bạn là một trợ lý AI hữu ích."},  # "You are a helpful AI assistant."
    {"role": "user", "content": "Thủ đô của Việt Nam là gì?"},       # "What is the capital of Vietnam?"
]

# Generate
prompt_ids = tokenizer.apply_chat_template(
    messages, 
    tokenize=True, 
    add_generation_prompt=True, 
    return_tensors="pt"
).cuda()

output = sampler.sample(inputs=[prompt_ids[0]], config=sampler_config)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)

Training Process

  1. Pretraining: the model was first pretrained on Vietnamese Wikipedia (50K samples, 5,500 steps)
  2. SFT: it was then fine-tuned on the Vietnamese Intel Orca dataset (11,862 samples, 3 epochs)

Limitations

  • The model may generate repetitive text in some cases
  • Output quality depends on the inference parameters (steps, temperature, block_size)
  • Best results are typically obtained with steps >= 128 and an appropriate temperature setting

License

Apache 2.0

Citation

@misc{qwen25-bd3lm-orca-sft,
  title={Qwen2.5-1.5B BD3LM Vietnamese Orca SFT},
  author={ChaosAiVision},
  year={2026},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/ChaosAiVision/qwen2.5-1.5b-orca-bd3lm-sft-orca}}
}