Compressed Model: MilyaShams/Qwen3-1.7B-GPTQ_W4A16_bs128_damp0.01

This model was compressed using the llmcompressor framework.
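The W4A16 scheme in the experiment name means weights are stored as 4-bit integers while activations remain in 16-bit precision. The sketch below is a minimal NumPy illustration of symmetric group-wise 4-bit quantization, assuming a group size of 128 to match the recipe's block size; it is illustrative only and is not llmcompressor's actual packing or kernels:

```python
import numpy as np

def quantize_w4_groupwise(w, group_size=128):
    """Symmetric 4-bit group-wise quantization: one fp16 scale per group."""
    w = w.reshape(-1, group_size)
    # int4 symmetric range is [-8, 7]; use 7 so the max magnitude maps exactly
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale = np.maximum(scale, 1e-8)  # guard against all-zero groups
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float16)

def dequantize(q, scale):
    """Reconstruct fp32 weights from int4 codes and per-group scales."""
    return (q.astype(np.float32) * scale.astype(np.float32)).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, s = quantize_w4_groupwise(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()  # bounded by half a quantization step per group
```

At inference time the 4-bit codes are dequantized (here via `dequantize`) before the 16-bit matmul, which is what keeps activations at full A16 precision.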

Compression Details

  • Base Model: Qwen/Qwen3-1.7B
  • Experiment Name: GPTQ_W4A16_bs128_damp0.01
  • Recipe / Modifiers Applied:
```
config_groups=None  targets=['Linear']  ignore=[]  scheme='W4A16'
kv_cache_scheme=None  weight_observer=None  input_observer=None
output_observer=None  observer=None  bypass_divisibility_checks=False
index=None  group=None  start=None  end=None  update=None
initialized_=True  finalized_=True  started_=True  ended_=True
sequential_targets=None  block_size=128  dampening_frac=0.01
actorder=static  offload_hessians=False
```
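The `dampening_frac=0.01` above controls GPTQ's Hessian dampening: before the calibration Hessian is inverted, a fraction of its mean diagonal is added back to the diagonal to stabilize the solve, since the Hessian from a small calibration set is often singular or ill-conditioned. A hedged NumPy illustration of that step (not the actual llmcompressor implementation):

```python
import numpy as np

def damp_hessian(H, dampening_frac=0.01):
    """Add dampening_frac * mean(diag(H)) to the diagonal, as in GPTQ."""
    damp = dampening_frac * np.mean(np.diag(H))
    return H + damp * np.eye(H.shape[0])

rng = np.random.default_rng(0)
X = rng.standard_normal((16, 64))  # tiny stand-in for calibration activations
H = X.T @ X                        # rank 16 < 64, so H is singular
H_damped = damp_hessian(H)

cond_before = np.linalg.cond(H)        # astronomically large: H is not invertible
cond_after = np.linalg.cond(H_damped)  # finite and modest after dampening
```

Larger `dampening_frac` values trade a little quantization accuracy for numerical robustness; 0.01 is a common default.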

Note: This model card was automatically generated. All structural modifiers and parameters used during compression are logged above.
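As a rough sanity check on the expected savings, here is back-of-the-envelope arithmetic for the W4A16 weight footprint. Assumptions (not taken from the checkpoint itself): 1.7B weight parameters, an fp16 baseline, 4-bit weights with one fp16 scale per 128-element group; real checkpoints add embeddings, zero-points, and packing overhead, so this is only indicative:

```python
n_params = 1.7e9
group_size = 128

fp16_bytes = n_params * 2                  # 16-bit baseline: 2 bytes/param
int4_bytes = n_params * 0.5                # packed 4-bit weights: 0.5 bytes/param
scale_bytes = (n_params / group_size) * 2  # one fp16 scale per 128-weight group
w4a16_bytes = int4_bytes + scale_bytes

ratio = w4a16_bytes / fp16_bytes  # roughly a quarter of the fp16 size
```

The per-group scales add only about 1.5% overhead on top of the packed 4-bit weights, which is why group-wise schemes stay close to the ideal 4x reduction.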

Checkpoint: Safetensors, 2B params (tensor types: I64, I32, F16)
