Llama-3.2-3B-Instruct aligned using DPO on the argilla/ultrafeedback-binarized-preferences dataset.
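As a brief sketch of the objective behind this alignment step, the DPO loss for a single preference pair can be computed from the policy and frozen-reference log-probabilities of the chosen and rejected completions. This is a minimal illustration of the standard DPO formula, not the actual training code used for this model; the log-probability values below are made up.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected completion under the policy or the frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)), written as softplus(-margin) for stability
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# With no preference shift relative to the reference, the loss is log(2);
# it drops below log(2) once the policy favors the chosen completion more
# strongly than the reference does.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # → 0.6931
```

In practice this objective is applied over batches of (prompt, chosen, rejected) triples such as those in ultrafeedback-binarized-preferences, typically via a library like TRL's `DPOTrainer` rather than a hand-rolled loop.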
Model tree for MInAlA/Llama-3.2-3B-DPO-merged
Base model
meta-llama/Llama-3.2-3B-Instruct