dbw6/Qwen3-30B-A3B-Instruct-2507-AQLM-2Bit-2x8-hf

This repository contains a Hugging Face export of Qwen3-30B-A3B-Instruct-2507 quantized with AQLM using the 2-bit 2x8 scheme.

Base model

  • Qwen/Qwen3-30B-A3B-Instruct-2507

Quantization

  • Method: AQLM (Additive Quantization of Language Models)
  • Scheme: 2x8 (two codebooks with 8-bit codes per group)
  • Effective bitwidth: ~2 bits per weight
  • Source checkpoint: /work/bduan1/quantized_models/Qwen3-30B-A3B-AQLM-2bit-2x8

Conversion

This repo was produced by converting the AQLM checkpoint with convert_to_hf.py from the AQLM project, using the --save_safetensors and --save_tokenizer flags to export safetensors weights and the tokenizer.
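A sketch of the conversion invocation. The two flags are taken from this card; the positional arguments and output path are assumptions — consult the AQLM repository's convert_to_hf.py for the actual CLI:

```shell
# Hypothetical argument order; check `python convert_to_hf.py --help` in the AQLM repo.
# Base model ID and quantized checkpoint path are from this card; the output dir is illustrative.
python convert_to_hf.py \
    /work/bduan1/quantized_models/Qwen3-30B-A3B-AQLM-2bit-2x8 \
    ./Qwen3-30B-A3B-Instruct-2507-AQLM-2Bit-2x8-hf \
    --save_safetensors \
    --save_tokenizer
```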

Usage

AQLM inference through transformers requires the aqlm package (for example, pip install aqlm[gpu]) in addition to torch and transformers.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dbw6/Qwen3-30B-A3B-Instruct-2507-AQLM-2Bit-2x8-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
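A minimal generation call, continuing from the snippet above (the prompt text and max_new_tokens value are illustrative, not from this card):

```python
# Build a chat prompt with the model's own chat template, then generate.
messages = [{"role": "user", "content": "Explain AQLM quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```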