🚨 Request: Proper Kurdish Language Support (Not Token Support)

#31
by yawner - opened

It’s frustrating to see that Kurdish continues to be overlooked or poorly supported across TTS and LLM systems, including Qwen3-TTS.

Kurdish is spoken by 30+ million people, yet current support is one of the following:

  • Non-existent
  • Low quality / inaccurate
  • Confused with Arabic, Persian, or Turkish phonetics and vocabulary

This is not just a technical gap — it’s a data and prioritization issue.

Key Problems:

  • ❌ No clear support for major dialects (Kurmanji, Sorani)
  • ❌ Incorrect pronunciation due to training overlap with neighboring languages
  • ❌ Lack of proper datasets and linguistic handling
  • ❌ No official roadmap or acknowledgment from model providers

Why This Matters:

Kurdish has historically been underrepresented and often misclassified online.
AI systems are now reinforcing that gap instead of correcting it.

What We Need:

  • ✅ Explicit Kurdish language support (separate from Arabic/Persian/Turkish)
  • ✅ Dialect-aware TTS (at minimum: Kurmanji & Sorani)
  • ✅ Proper phoneme modeling and pronunciation
  • ✅ Collaboration with native speakers for dataset creation

Opportunity (for Qwen / others):

This is a high-leverage, low-competition win.
Supporting Kurdish properly would:

  • Unlock millions of new users
  • Build strong community loyalty
  • Position your model as globally inclusive — not just mainstream-language optimized

Right now, it feels like Kurdish is invisible in AI.
It shouldn’t be.

Let’s fix that.
