# Malaysian-Qwen2.5-3B-Instruct-Q4_K_M-GGUF
This is a GGUF quantized version of mesolitica/Malaysian-Qwen2.5-3B-Instruct, converted for local inference without GPU requirements.
## Model Details
- Base Model: mesolitica/Malaysian-Qwen2.5-3B-Instruct
- Quantization: Q4_K_M
- Format: GGUF (llama.cpp compatible)
- File Size: ~2GB
- Language Focus: Malaysian/Malay language
- Use Case: Local inference, experimentation, CPU-only environments
## Files
- `Malaysian-Qwen2.5-3B-Instruct-Q4_K_M.gguf`: the quantized model file
## Usage with Ollama

1. Install Ollama: download it from [ollama.ai](https://ollama.ai).
2. Create a `Modelfile`:

   ```
   FROM ./Malaysian-Qwen2.5-3B-Instruct-Q4_K_M.gguf
   PARAMETER temperature 0.7
   PARAMETER top_p 0.9
   PARAMETER top_k 40
   ```

3. Create the model:

   ```shell
   ollama create malaysian-qwen -f Modelfile
   ```

4. Run the model:

   ```shell
   ollama run malaysian-qwen
   ```
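Beyond the interactive CLI, a running Ollama server also exposes a local REST API. The sketch below builds a request payload for its `/api/generate` endpoint, reusing the sampling parameters from the `Modelfile` above; `build_generate_payload` is a hypothetical helper for illustration, and the commented-out request assumes Ollama is listening on its default port.

```python
import json

# Hypothetical helper: builds the JSON body for Ollama's /api/generate
# endpoint. The model name matches the `ollama create` step above.
def build_generate_payload(prompt: str, model: str = "malaysian-qwen") -> dict:
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a token stream
        "options": {"temperature": 0.7, "top_p": 0.9, "top_k": 40},
    }

payload = build_generate_payload("Terangkan maksud 'gotong-royong'.")

# To actually send it (requires a running Ollama server on localhost:11434):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["response"])
```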
## Usage with llama.cpp

```shell
./main -m Malaysian-Qwen2.5-3B-Instruct-Q4_K_M.gguf -p "Your prompt here" -n 512
```
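For use from Python, the same GGUF file can be loaded with the `llama-cpp-python` bindings (`pip install llama-cpp-python`). This is a minimal sketch, assuming the model file sits in the working directory; it is guarded so it only runs when the file is present.

```python
from pathlib import Path

MODEL = Path("Malaysian-Qwen2.5-3B-Instruct-Q4_K_M.gguf")

if MODEL.exists():
    from llama_cpp import Llama

    # Load the quantized model; n_ctx sets the context window size.
    llm = Llama(model_path=str(MODEL), n_ctx=2048)
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Apakah ibu negara Malaysia?"}],
        max_tokens=512,
    )
    print(out["choices"][0]["message"]["content"])
```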
## Quantization Info
Q4_K_M offers a good trade-off between file size and output quality, and is a reasonable default for most use cases on consumer hardware.
## License
This model inherits the license from the original mesolitica/Malaysian-Qwen2.5-3B-Instruct model. Please refer to the original model card for license details.
## Credits
- Original Model: Mesolitica
- Quantization: Converted using llama.cpp
## Citation
If you use this model, please cite the original model:

```bibtex
@misc{malaysian-qwen2.5-3b-instruct,
  author = {Mesolitica},
  title = {Malaysian-Qwen2.5-3B-Instruct},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/mesolitica/Malaysian-Qwen2.5-3B-Instruct}
}
```