fieldvalley-llm2025/main_rev1_merged_dpo05

REV1 DPO05 (Fixed Steps Version).

  • Base: REV1 DPO03
  • Method: TOML Local DPO
  • Steps: 100 (Fixed)
  • Pairs: Increased in number, with multiple types of rejected responses
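
The card does not include training code, so the following is only a minimal sketch of the standard DPO objective for a single chosen/rejected pair, not this model's actual pipeline. The function names, the example log-probabilities, and `beta=0.1` are illustrative assumptions; the card states none of them.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not stated in the card
) -> float:
    """Standard DPO loss for one preference pair.

    Inputs are summed token log-probabilities of the chosen and rejected
    responses under the policy being trained and under the frozen
    reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    return -log_sigmoid(margin)


# Hypothetical log-probabilities, for illustration only.
loss_at_start = dpo_loss(-12.0, -12.0, -12.0, -12.0)  # policy equals reference
loss_after = dpo_loss(-10.0, -15.0, -12.0, -12.0)     # policy now prefers chosen
print(loss_at_start, loss_after)
```

When the policy equals the reference model, the margin is zero and the loss is log 2. The loss falls as the policy assigns relatively more probability to the chosen response than the reference does.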

Safetensors · Model size: 4B params · Tensor type: F16

Model tree for fieldvalley-llm2025/main_rev1_merged_dpo05

  • Base model: Qwen/Qwen2.5-7B
  • Finetuned: this model