Compressed Model: MilyaShams/Qwen3-1.7B-GPTQ_W4A16_bs128_damp0.01

This model was compressed using the llmcompressor framework.
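The W4A16 scheme in the experiment name means weights are stored as 4-bit integers while activations remain in 16-bit precision. The sketch below is a minimal NumPy illustration of symmetric group-wise 4-bit quantization, assuming a group size of 128 to match the recipe's block size; it is illustrative only and is not llmcompressor's actual packing or kernels:

```python
import numpy as np

def quantize_w4_groupwise(w, group_size=128):
    """Symmetric 4-bit group-wise quantization: one fp16 scale per group."""
    w = w.reshape(-1, group_size)
    # int4 symmetric range is [-8, 7]; use 7 so the max magnitude maps exactly
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale = np.maximum(scale, 1e-8)  # guard against all-zero groups
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float16)

def dequantize(q, scale):
    """Reconstruct fp32 weights from int4 codes and per-group scales."""
    return (q.astype(np.float32) * scale.astype(np.float32)).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, s = quantize_w4_groupwise(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()  # bounded by half a quantization step per group
```

At inference time the 4-bit codes are dequantized (here via `dequantize`) before the 16-bit matmul, which is what keeps activations at full A16 precision.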

Compression Details

  • Base Model: Qwen/Qwen3-1.7B
  • Experiment Name: GPTQ_W4A16_bs128_damp0.01
  • Recipe / Modifiers Applied:
```
config_groups=None  targets=['Linear']  ignore=[]  scheme='W4A16'
kv_cache_scheme=None  weight_observer=None  input_observer=None
output_observer=None  observer=None  bypass_divisibility_checks=False
index=None  group=None  start=None  end=None  update=None
initialized_=True  finalized_=True  started_=True  ended_=True
sequential_targets=None  block_size=128  dampening_frac=0.01
actorder=static  offload_hessians=False
```
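The `dampening_frac=0.01` above controls GPTQ's Hessian dampening: before the calibration Hessian is inverted, a fraction of its mean diagonal is added back to the diagonal to stabilize the solve, since the Hessian from a small calibration set is often singular or ill-conditioned. A hedged NumPy illustration of that step (not the actual llmcompressor implementation):

```python
import numpy as np

def damp_hessian(H, dampening_frac=0.01):
    """Add dampening_frac * mean(diag(H)) to the diagonal, as in GPTQ."""
    damp = dampening_frac * np.mean(np.diag(H))
    return H + damp * np.eye(H.shape[0])

rng = np.random.default_rng(0)
X = rng.standard_normal((16, 64))  # tiny stand-in for calibration activations
H = X.T @ X                        # rank 16 < 64, so H is singular
H_damped = damp_hessian(H)

cond_before = np.linalg.cond(H)        # astronomically large: H is not invertible
cond_after = np.linalg.cond(H_damped)  # finite and modest after dampening
```

Larger `dampening_frac` values trade a little quantization accuracy for numerical robustness; 0.01 is a common default.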

Note: This model card was automatically generated. All structural modifiers and parameters used during compression are logged above.
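As a rough sanity check on the expected savings, here is back-of-the-envelope arithmetic for the W4A16 weight footprint. Assumptions (not taken from the checkpoint itself): 1.7B weight parameters, an fp16 baseline, 4-bit weights with one fp16 scale per 128-element group; real checkpoints add embeddings, zero-points, and packing overhead, so this is only indicative:

```python
n_params = 1.7e9
group_size = 128

fp16_bytes = n_params * 2                  # 16-bit baseline: 2 bytes/param
int4_bytes = n_params * 0.5                # packed 4-bit weights: 0.5 bytes/param
scale_bytes = (n_params / group_size) * 2  # one fp16 scale per 128-weight group
w4a16_bytes = int4_bytes + scale_bytes

ratio = w4a16_bytes / fp16_bytes  # roughly a quarter of the fp16 size
```

The per-group scales add only about 1.5% overhead on top of the packed 4-bit weights, which is why group-wise schemes stay close to the ideal 4x reduction.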

Checkpoint: Safetensors, 2B params (tensor types: I64, I32, F16)
