Huihui-Qwen3.5-9B-abliterated-Grimoire-SimPO

Testing Grimoire's SimPO implementation.

The learning rate was too high on this run, and the resulting model is unusable.

Training Configuration

| Parameter | Value |
|---|---|
| Training Mode | SimPO |
| Base Model | huihui-ai/Huihui-Qwen3.5-9B-abliterated |
| Learning Rate | 9e-05 |
| Epochs | 1 |
| Batch Size | 1 |
| Gradient Accumulation | 32 |
| Effective Batch Size | 32 |
| Max Sequence Length | 2048 |
| Optimizer | paged_adamw_8bit |
| LR Scheduler | cosine |
| Warmup Ratio | 0.05 |
| Weight Decay | 0.01 |
| Max Grad Norm | 0.25 |
| Seed | 42 |
| Beta | 0.1 |
| Max Prompt Length | 1024 |
| SimPO Gamma | 0.5 |
| LoRA Rank (r) | 128 |
| LoRA Alpha | 64 |
| LoRA Dropout | 0.05 |
| Target Modules | up_proj, down_proj, gate_proj, k_proj, q_proj, v_proj, o_proj |
| Quantization | 4-bit (NF4) |
| GPU | NVIDIA RTX A6000 |
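For reference, SimPO's objective length-normalizes the policy log-probabilities of the chosen and rejected responses and applies a target reward margin. A minimal sketch of the per-pair loss using the Beta (0.1) and SimPO Gamma (0.5) values from the table above; the function name and inputs are illustrative, not Grimoire's actual implementation:

```python
import math

def simpo_loss(chosen_logp: float, rejected_logp: float,
               chosen_len: int, rejected_len: int,
               beta: float = 0.1, gamma: float = 0.5) -> float:
    """SimPO loss for a single preference pair (illustrative sketch).

    chosen_logp / rejected_logp: summed token log-probabilities of the
    chosen and rejected responses under the policy model.
    SimPO divides each by its response length, scales by beta, and
    subtracts the target margin gamma before the sigmoid.
    """
    margin = (beta * chosen_logp / chosen_len
              - beta * rejected_logp / rejected_len
              - gamma)
    # -log(sigmoid(margin)), written in a numerically direct form
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Because the reward is the length-normalized log-probability itself, SimPO needs no frozen reference model, which is part of its appeal over DPO.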

Trained with Merlina

Merlina on GitHub

