Compressed Model: MilyaShams/Qwen3-1.7B-SmoothQuant_0.5_AutoRound_ign

This model was compressed using the llmcompressor framework.

Compression Details

  • Base Model: Qwen/Qwen3-1.7B
  • Experiment Name: SmoothQuant_0.5_AutoRound_ign
  • Recipe / Modifiers Applied:
      • SmoothQuantModifier — smoothing_strength=0.5; ignore=[]; mappings: balance ['re:.*q_proj', 're:.*k_proj', 're:.*v_proj'] against smooth layer 're:.*input_layernorm', and balance ['re:.*gate_proj', 're:.*up_proj'] against 're:.*post_attention_layernorm'.
      • AutoRoundModifier — scheme='W8A8'; targets=['Linear']; ignore=['lm_head']; sequential_targets=['Qwen3DecoderLayer']; iters=200; batch_size=2; enable_torch_compile=True; device_ids='0,1'.
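SmoothQuant migrates activation outliers into the weights: for each input channel j it computes a per-channel scale s_j = max|X_j|^α / max|W_j|^(1-α) (here α = smoothing_strength = 0.5), divides activations by s_j, and multiplies the matching weight column by s_j, so the layer's output is mathematically unchanged while the activation range shrinks. A minimal plain-Python sketch of that scale computation (illustrative only, not llmcompressor's implementation):

```python
# Sketch of SmoothQuant's per-channel scale migration (not llmcompressor's code).

def smoothquant_scales(act_absmax, weight_absmax, alpha=0.5):
    """Per-input-channel scales: s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)."""
    return [
        (a ** alpha) / (w ** (1.0 - alpha))
        for a, w in zip(act_absmax, weight_absmax)
    ]

def smooth(act_absmax, weight_cols, alpha=0.5):
    """Divide activation ranges by s, multiply weight columns by s.

    Since (X / diag(s)) @ (diag(s) @ W) == X @ W, outputs are unchanged.
    """
    w_absmax = [max(abs(v) for v in col) for col in weight_cols]
    s = smoothquant_scales(act_absmax, w_absmax, alpha)
    smoothed_act = [a / sj for a, sj in zip(act_absmax, s)]
    smoothed_cols = [[v * sj for v in col] for col, sj in zip(weight_cols, s)]
    return smoothed_act, smoothed_cols

# Example: one outlier activation channel (absmax 16) with a small weight column.
act_max, cols = smooth([16.0, 2.0], [[0.25, -0.25], [1.0, -1.0]])
# With alpha=0.5, s = [sqrt(16 / 0.25), sqrt(2 / 1)] = [8.0, 1.414...];
# the outlier channel's activation range shrinks from 16 to 2,
# making it far friendlier to 8-bit activation quantization.
```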

Note: This model card was automatically generated. All structural modifiers and parameters used during compression are logged above.

Model Files

  • Format: Safetensors
  • Model size: 2B params
  • Tensor types: F16, I8
