# qwen3-4b-h100-v5-hard-ep3
A top-ranker strategy model, trained on an H100 GPU on a blend of three datasets (approx. 14k rows). The assistant outputs were heavily preprocessed with a custom clean_assistant_output_v2 function (CoT stripping, markdown removal, TOML comment removal).
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Max sequence length: 4096
- Epochs: 3
- Learning rate: 2e-5
- Effective batch size: 32 (per-device batch size 8 × gradient accumulation 4)
- LoRA R: 128
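The hyperparameters above can be collected into a config sketch. The dictionary keys below follow the naming of `transformers.TrainingArguments` and `peft.LoraConfig` as an assumption; the author's actual training script is not shown on the card:

```python
# Hypothetical mapping of the card's hyperparameters to a
# TrainingArguments/LoraConfig-style configuration (plain dicts,
# so this sketch runs without any ML libraries installed).
training_args = {
    "model_name": "Qwen/Qwen3-4B-Instruct-2507",
    "max_seq_length": 4096,
    "num_train_epochs": 3,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,
}
lora_config = {"r": 128}

# Effective batch size = per-device batch × gradient accumulation steps
effective_batch = (
    training_args["per_device_train_batch_size"]
    * training_args["gradient_accumulation_steps"]
)
print(effective_batch)  # → 32
```

Gradient accumulation here trades memory for throughput: each optimizer step sees 32 examples while only 8 reside on the GPU at once.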