Tags: Text Generation · PEFT · Safetensors · English · qlora · lora · structured-output

qwen3-4b-h100-v5-hard-ep3

Top-ranker strategy model, trained on an H100 GPU on a blend of three datasets (approx. 14k rows). The data was heavily preprocessed with a custom clean_assistant_output_v2 pipeline (chain-of-thought stripping, markdown removal, TOML comment removal).
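The clean_assistant_output_v2 script itself is not published; the sketch below is a hypothetical reimplementation that only illustrates the three steps named above. The `<think>` delimiter and the exact regexes are assumptions.

```python
import re


def clean_assistant_output_v2(text: str) -> str:
    """Hypothetical sketch of the cleaning steps described in this card."""
    # 1. CoT stripping: drop <think>...</think> reasoning blocks
    #    (the <think> delimiter is an assumption based on Qwen3's chat format).
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # 2. Markdown removal: drop code-fence lines and bold/italic markers.
    text = re.sub(r"^```[\w-]*\s*$", "", text, flags=re.MULTILINE)
    text = re.sub(r"(\*\*|\*|__|_)(.+?)\1", r"\2", text)
    # 3. TOML comment removal: drop trailing "# ..." comments on a line.
    text = re.sub(r"\s+#[^\n]*$", "", text, flags=re.MULTILINE)
    return text.strip()
```

For example, `clean_assistant_output_v2('<think>plan...</think>name = "a"  # note')` yields `name = "a"`.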

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Max sequence length: 4096
  • Epochs: 3
  • Learning rate: 2e-5
  • Effective batch size: 32 (per-device batch size 8 × gradient accumulation 4)
  • LoRA R: 128
