☕ Qehwa Pashto LLM — GGUF (Q8_0)

GGUF-quantized version of Qehwa — Pashto's first instruction-tuned LLM — optimized for macOS, LM Studio, and llama.cpp.

Original Model | junaid008/qehwa-pashto-llm
Original Author | Junaid Khan
Base Architecture | Qwen2.5-7B
Quantization | Q8_0 (~8 GB)
Format | GGUF
Compatibility | llama.cpp, LM Studio, Ollama, GPT4All

🌟 About the Original Model

Qehwa is the first fully instruction-tuned Pashto LLM, created by Junaid Khan. It was built using a two-stage training pipeline:

  1. Continued Pre-Training (CPT) — Trained on Pashto documents to learn the language deeply
  2. Supervised Fine-Tuning (SFT) — Trained on high-quality Pashto instruction-response pairs

The model targets the Peshawari/KPK dialect of Pashto and can:

  • ✅ Answer questions in Pashto
  • ✅ Respond to English and Urdu instructions in Pashto
  • ✅ Write creative prose, poetry, and stories in Pashto
  • ✅ Translate between Pashto, English, and Urdu

All credit for the model training goes to Junaid Khan. This repository only provides the GGUF conversion for local inference.


📥 Download

Direct Download (Fastest)

👉 Download from Google Drive

Using hf CLI

brew install hf
hf download hasnainayaz/qehwa-pashto-llm-gguf

Using huggingface-cli

pip install huggingface-hub
huggingface-cli download hasnainayaz/qehwa-pashto-llm-gguf --local-dir ./qehwa-model
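
The same download can also be scripted from Python. The sketch below assumes the huggingface-hub package installed above and that the file in the repository is named qehwa-q8_0.gguf, as this README states. The helper name download_qehwa is hypothetical and only used for illustration.

```python
from pathlib import Path

REPO_ID = "hasnainayaz/qehwa-pashto-llm-gguf"
FILENAME = "qehwa-q8_0.gguf"  # file name as given in this README


def download_qehwa(local_dir: str = "./qehwa-model") -> Path:
    """Download the GGUF file into local_dir and return its local path."""
    # Imported lazily so this module can be imported without the dependency.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)
    return Path(path)


# Usage (downloads roughly 8 GB):
#     model_path = download_qehwa()
#     print(model_path)
```

Fetching a single named file avoids pulling any other files that may live in the repository.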

HuggingFace Download

Click the Files tab above to download qehwa-q8_0.gguf directly.


🧩 Usage

LM Studio (Easiest)

  1. Download LM Studio
  2. Import the qehwa-q8_0.gguf file
  3. Start chatting in Pashto!

llama.cpp (CLI)

./llama-cli -m qehwa-q8_0.gguf -e -p "Below is an instruction in Pashto or English. Write a detailed response in Pashto.\n\n### Instruction:\nد پیښور تاریخ راته ووایه\n\n### Response:\n" -n 200

llama-cpp-python (Python)

from llama_cpp import Llama

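# n_gpu_layers=-1 offloads all layers to the GPU (for example Metal on macOS) when available.
llm = Llama(model_path="qehwa-q8_0.gguf", n_gpu_layers=-1)

# The prompt follows the template described under "Prompt Format".
output = llm(
    "Below is an instruction in Pashto or English. Write a detailed response in Pashto.\n\n"
    "### Instruction:\nد پیښور تاریخ راته ووایه\n\n### Response:\n",
    max_tokens=500,
    temperature=0.7,
    repeat_penalty=1.1,
    stop=["### Instruction:"],  # stop before the model starts a new instruction
)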

print(output["choices"][0]["text"])
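
For long answers it can be nicer to print text as it is generated. The sketch below assumes that calling the Llama object with stream=True yields chunks shaped like the non-streaming result, which is how llama-cpp-python behaves at the time of writing. The helper name stream_text is hypothetical.

```python
from typing import Any, Callable, Iterable, Iterator


def stream_text(
    llm: Callable[..., Iterable[dict]], prompt: str, **kwargs: Any
) -> Iterator[str]:
    """Yield text fragments one at a time as the model produces them."""
    # With stream=True the call returns an iterator of partial completions.
    for chunk in llm(prompt, stream=True, **kwargs):
        yield chunk["choices"][0]["text"]


# Usage with the `llm` object created above:
#     for piece in stream_text(llm, prompt, max_tokens=500, stop=["### Instruction:"]):
#         print(piece, end="", flush=True)
```

The helper accepts any callable with the same calling convention, so it can be exercised with a stub instead of the real model.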

💬 Prompt Format

This model uses an Alpaca-style prompt template with a Pashto-specific instruction line:

Below is an instruction in Pashto or English. Write a detailed response in Pashto.

### Instruction:
{your prompt here}

### Response:

Example Prompts

Language | Prompt
Pashto (پښتو) | د پیښور تاریخ راته ووایه
English | Tell me about Pashtunwali
Urdu (اردو) | پشاور کے بارے میں بتاؤ
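
The template and example prompts above can be combined in a few lines of Python. This is a minimal sketch: the names SYSTEM_LINE, build_prompt, and EXAMPLES are hypothetical, while the template text is copied from this README.

```python
SYSTEM_LINE = (
    "Below is an instruction in Pashto or English. "
    "Write a detailed response in Pashto."
)


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the prompt template shown above."""
    return (
        f"{SYSTEM_LINE}\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        "### Response:\n"
    )


# The three example prompts from the table above.
EXAMPLES = {
    "Pashto": "د پیښور تاریخ راته ووایه",
    "English": "Tell me about Pashtunwali",
    "Urdu": "پشاور کے بارے میں بتاؤ",
}

prompts = {language: build_prompt(text) for language, text in EXAMPLES.items()}
```

Each value in prompts can be passed directly to llama-cli with -p or to the llama-cpp-python call shown earlier.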

🖥️ Web UI

A modern chat interface is also available at: 👉 github.com/hasnainkhan8532/PashtoLLM

Built with Next.js, Shadcn/UI, and the Bahij Badiya font, with full RTL support for Pashto.


⚙️ Quantization Details

Property | Value
Source format | SafeTensors (bfloat16)
Source size | ~15.2 GB
GGUF quant type | Q8_0
GGUF size | ~8 GB
Conversion tool | llama.cpp convert_hf_to_gguf.py
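
The sizes in the table are consistent with how Q8_0 stores weights. The back-of-the-envelope sketch below assumes the standard Q8_0 layout, in which each block of 32 weights is stored as 32 int8 values plus one 16-bit scale, and it ignores tensors that the converter may keep at higher precision. The function name is hypothetical.

```python
SOURCE_SIZE_GB = 15.2          # bfloat16 SafeTensors size from the table above
BYTES_PER_BF16_WEIGHT = 2      # bfloat16 uses 2 bytes per weight
Q8_0_BLOCK_WEIGHTS = 32        # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 32 + 2      # 32 int8 values + one 16-bit scale


def estimate_q8_0_size_gb(source_size_gb: float) -> float:
    """Estimate the Q8_0 file size from the bfloat16 checkpoint size."""
    # Parameter count in billions: total bytes divided by bytes per weight.
    params_billion = source_size_gb / BYTES_PER_BF16_WEIGHT
    # Q8_0 costs 34 bytes per 32 weights, i.e. 8.5 bits per weight.
    bytes_per_weight = Q8_0_BLOCK_BYTES / Q8_0_BLOCK_WEIGHTS
    return params_billion * bytes_per_weight


estimate = estimate_q8_0_size_gb(SOURCE_SIZE_GB)
print(f"Estimated Q8_0 size: {estimate:.1f} GB")
```

Under these assumptions, about 7.6 billion weights at 8.5 bits each come to roughly 8 GB, which agrees with the GGUF size listed above.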

🙏 Credits


📜 License

This GGUF conversion follows the license of the original model. Please refer to the original repository for licensing terms.


GGUF conversion & Web UI by hasnainayaz.com
