Qwen 2.5 3B - QNN Ready

This repository contains the Qwen 2.5 3B model prepared for QNN deployment.

Model Details

  • Base Model: Qwen/Qwen2.5-3B
  • Architecture: Qwen2ForCausalLM
  • Parameters: ~3B
  • Languages: English, Chinese, and others
  • Format: PyTorch (Safetensors)
  • Tensor type: BF16
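
As a rough sizing check (back-of-envelope arithmetic, not a measured number), ~3B parameters stored in BF16 at 2 bytes each come to roughly 5.6 GiB of weights before any quantization:

```python
# Back-of-envelope checkpoint size for a ~3B-parameter BF16 model.
params = 3_000_000_000      # ~3B parameters
bytes_per_param = 2         # BF16 stores each value in 2 bytes
gib = params * bytes_per_param / 1024**3
print(f"{gib:.1f} GiB")     # roughly 5.6 GiB of raw weights
```

Actual on-disk size will differ slightly due to embeddings, metadata, and shard overhead.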

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("marcusmi4n/qwen2.5-3b-qnn-ready")
tokenizer = AutoTokenizer.from_pretrained("marcusmi4n/qwen2.5-3b-qnn-ready")

# Generate text
inputs = tokenizer("Hello, I am", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

QNN Conversion

This model can be converted to QNN format using the scripts in this repository:

# Quantize the model
python scripts/simple_quantize_abeja.py --model-path marcusmi4n/qwen2.5-3b-qnn-ready

# Convert to ONNX
python scripts/create_mock_onnx.py --model-path marcusmi4n/qwen2.5-3b-qnn-ready

# Compile for QNN
python scripts/mock_qnn_compile.py --model-path marcusmi4n/qwen2.5-3b-qnn-ready
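
The quantization step above rewrites each weight tensor as low-precision integers plus a per-tensor scale. A minimal sketch of symmetric int8 quantization, for illustration only (the actual scripts may use a different scheme, e.g. per-channel or asymmetric quantization):

```python
# Illustrative symmetric int8 quantization: map floats to [-127, 127]
# using a single scale derived from the tensor's max magnitude.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values from the int8 representation.
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02]
q, s = quantize_int8(w)
print(q, s)   # [50, -127, 2] with scale 0.01
```

Dequantizing `q` with the stored scale recovers the original values up to rounding error, which is the trade-off quantization makes for a ~4x smaller footprint than FP32.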

License

Apache 2.0

Citation

@misc{qwen25-3b-qnn-ready,
  title={Qwen 2.5 3B - QNN Ready},
  author={QNN Conversion Pipeline},
  year={2025},
  url={https://huggingface.co/marcusmi4n/qwen2.5-3b-qnn-ready}
}