Compressed Model: MilyaShams/Qwen3-1.7B-Pipe_GPTQ_W8A8_SparseGPT24
This model was compressed using the llmcompressor framework.
Compression Details
- Base Model: Qwen/Qwen3-1.7B
- Experiment Name: Pipe_GPTQ_W8A8_SparseGPT24
- Recipe / Modifiers Applied:
  - GPTQModifier(config_groups=None, targets=['Linear'], ignore=[], scheme='W8A8', kv_cache_scheme=None, weight_observer=None, input_observer=None, output_observer=None, observer=None, bypass_divisibility_checks=False, index=None, group=None, start=None, end=None, update=None, initialized_=True, finalized_=True, started_=True, ended_=True, sequential_targets=None, block_size=64, dampening_frac=0.05, actorder=static, offload_hessians=False)
  - SparseGPTModifier(index=None, group=None, start=None, end=None, update=None, initialized_=True, finalized_=True, started_=True, ended_=True, sparsity=0.5, sparsity_profile=None, mask_structure='2:4', owl_m=None, owl_lmbda=None, sequential_update=False, sequential_targets=['Qwen3DecoderLayer'], targets=['Linear'], ignore=[], block_size=64, dampening_frac=0.01, preserve_sparsity_mask=False, offload_hessians=False)
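The SparseGPTModifier settings sparsity=0.5 and mask_structure='2:4' describe a semi-structured pattern: in every aligned group of four consecutive weights, at least two are zero. The sketch below illustrates that constraint on a toy row in plain Python. The helper names (`is_two_four_sparse`, `sparsity`) and the example values are hypothetical and are not part of llmcompressor or of this model's weights.

```python
from typing import Sequence


def is_two_four_sparse(row: Sequence[float]) -> bool:
    """Return True if every aligned group of 4 values has at least 2 zeros."""
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        zeros = sum(1 for value in group if value == 0)
        if zeros < 2:
            return False
    return True


def sparsity(row: Sequence[float]) -> float:
    """Fraction of entries that are exactly zero."""
    return sum(1 for value in row if value == 0) / len(row)


# Toy row (an assumption for illustration): two groups of four,
# each containing exactly two zeros.
example_row = [0.5, 0.0, -1.25, 0.0, 0.0, 0.75, 0.0, 2.0]

print(is_two_four_sparse(example_row))
print(sparsity(example_row))
```

The same check could be applied row by row to a Linear layer's weight matrix after converting it to nested lists, assuming the mask is laid out along the input dimension.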
Note: This model card was automatically generated. All structural modifiers and parameters used during compression are logged above.
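The scheme='W8A8' entry indicates 8-bit weights and 8-bit activations. As a rough illustration of what an 8-bit representation involves, the sketch below applies symmetric round-to-nearest int8 quantization to a toy vector. This is a simplification and an assumption: it is not the GPTQ algorithm (which compensates rounding error using second-order information), and the function names and values are hypothetical rather than taken from llmcompressor.

```python
from typing import List, Sequence, Tuple


def quantize_int8(values: Sequence[float]) -> Tuple[List[int], float]:
    """Symmetric round-to-nearest quantization of one vector to int8 codes."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole vector; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: Sequence[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate floating-point values."""
    return [c * scale for c in codes]


# Toy weights (an assumption for illustration), already 2:4 sparse.
weights = [0.5, 0.0, -1.27, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

print(codes, scale, restored)
```

With a symmetric scheme like this one, exact zeros map to the code 0, so a sparsity pattern in the input survives the round trip.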