Compressed Model: MilyaShams/Qwen3-1.7B-Pipe_GPTQ_W8A8_SparseGPT24

This model was compressed using the llmcompressor framework.

Compression Details

  • Base Model: Qwen/Qwen3-1.7B
  • Experiment Name: Pipe_GPTQ_W8A8_SparseGPT24
  • Recipe / Modifiers Applied:
[
  GPTQModifier(
    config_groups=None, targets=['Linear'], ignore=[], scheme='W8A8',
    kv_cache_scheme=None, weight_observer=None, input_observer=None,
    output_observer=None, observer=None, bypass_divisibility_checks=False,
    index=None, group=None, start=None, end=None, update=None,
    initialized_=True, finalized_=True, started_=True, ended_=True,
    sequential_targets=None, block_size=64, dampening_frac=0.05,
    actorder=static, offload_hessians=False
  ),
  SparseGPTModifier(
    index=None, group=None, start=None, end=None, update=None,
    initialized_=True, finalized_=True, started_=True, ended_=True,
    sparsity=0.5, sparsity_profile=None, mask_structure='2:4',
    owl_m=None, owl_lmbda=None, sequential_update=False,
    sequential_targets=['Qwen3DecoderLayer'], targets=['Linear'], ignore=[],
    block_size=64, dampening_frac=0.01, preserve_sparsity_mask=False,
    offload_hessians=False
  )
]
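
The recipe logs sparsity=0.5 with mask_structure='2:4', meaning that in every group of four consecutive weights at most two are nonzero. The sketch below illustrates that pattern on a plain Python list. It selects weights by magnitude only, which is a simplification and not the SparseGPT algorithm (SparseGPT uses Hessian information to pick and compensate pruned weights). The function names prune_2_of_4 and is_2_of_4_sparse are hypothetical and not part of llmcompressor.

```python
def prune_2_of_4(row):
    """Zero the two smallest-magnitude values in each group of four.

    Magnitude-based stand-in for illustration; not SparseGPT itself.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def is_2_of_4_sparse(row):
    """Return True if every full group of four holds at least two zeros."""
    return all(
        sum(1 for v in row[s:s + 4] if v == 0) >= 2
        for s in range(0, len(row) - len(row) % 4, 4)
    )


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
pruned = prune_2_of_4(weights)
sparsity = pruned.count(0.0) / len(pruned)
```

A check like is_2_of_4_sparse could in principle be run on a decompressed weight row to confirm the logged mask structure, assuming the weights are first unpacked to dense values.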

Note: This model card was automatically generated. All structural modifiers and parameters used during compression are logged above.
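
The W8A8 scheme in the logged recipe refers to 8-bit weights and 8-bit activations. As a rough illustration of what 8-bit storage implies, the sketch below applies symmetric round-to-nearest int8 quantization to one row of values. This is an assumption-laden simplification: it is not GPTQ (which adjusts remaining weights to compensate for rounding error), and the per-row granularity and the helper names quantize_int8 and dequantize are hypothetical choices for this example.

```python
def quantize_int8(values):
    """Symmetric round-to-nearest int8 quantization of one row.

    Returns the integer codes and the scale needed to restore them.
    """
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero row.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate floating-point values."""
    return [c * scale for c in codes]


row = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(row)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(row, restored))
```

Because the mapping is symmetric around zero, an exact 0.0 always quantizes to the code 0, so zeros introduced by pruning remain zeros after quantization in this sketch.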

  • Format: Safetensors
  • Model size: 1B params
  • Tensor types: I64, F16, I8, U8
