--- base_model: - Qwen/Qwen3-4B tags: - distillation - distilled - sft - peft - qwen3 datasets: - ianncity/KIMI-K2.5-550000x - Jackrong/Qwen3.5-reasoning-700x - nohurry/Opus-4.6-Reasoning-3000x-filtered - TeichAI/claude-4.5-opus-high-reasoning-250x - TeichAI/gemini-3-pro-preview-high-reasoning-250x - TeichAI/claude-haiku-4.5-high-reasoning-1700x - TeichAI/gpt-5.2-high-reasoning-250x - Roman1111111/gemini-3.1-pro-hard-high-reasoning - Jackrong/glm-4.7-multiturn-CoT - bmeyer2025/glm5-reasoning-traces - TeichAI/claude-sonnet-4.5-high-reasoning-250x - TeichAI/deepseek-v3.2-speciale-openr1-math-3k - TeichAI/deepseek-v3.2-speciale-OpenCodeReasoning-3k - TeichAI/deepseek-v3.2-speciale-1000x - TeichAI/gpt-5-codex-1000x model-index: - name: hadadxyz/Qwen3-4B-Diversity results: - task: type: text-generation name: Text Generation dataset: name: Mmlu type: cais/mmlu metrics: - type: acc value: 67.8 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Humanities type: cais/mmlu metrics: - type: acc value: 57.9 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Formal Logic type: cais/mmlu metrics: - type: acc value: 58.7 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School European History type: cais/mmlu metrics: - type: acc value: 78.2 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Us History type: cais/mmlu metrics: - type: acc value: 84.8 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School World History type: cais/mmlu metrics: - type: acc value: 83.1 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu International Law type: cais/mmlu metrics: - type: acc value: 77.7 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Jurisprudence type: cais/mmlu metrics: - type: acc value: 78.7 name: 
accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Logical Fallacies type: cais/mmlu metrics: - type: acc value: 82.8 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Moral Disputes type: cais/mmlu metrics: - type: acc value: 71.1 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Moral Scenarios type: cais/mmlu metrics: - type: acc value: 28.4 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Philosophy type: cais/mmlu metrics: - type: acc value: 73.3 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Prehistory type: cais/mmlu metrics: - type: acc value: 76.2 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Professional Law type: cais/mmlu metrics: - type: acc value: 47.4 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu World Religions type: cais/mmlu metrics: - type: acc value: 78.4 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Other type: cais/mmlu metrics: - type: acc value: 72.1 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Business Ethics type: cais/mmlu metrics: - type: acc value: 73.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Clinical Knowledge type: cais/mmlu metrics: - type: acc value: 75.5 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu College Medicine type: cais/mmlu metrics: - type: acc value: 71.1 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Global Facts type: cais/mmlu metrics: - type: acc value: 41.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Human Aging type: cais/mmlu metrics: - type: acc value: 67.7 name: accuracy - task: type: text-generation name: 
Text Generation dataset: name: Mmlu Management type: cais/mmlu metrics: - type: acc value: 84.5 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Marketing type: cais/mmlu metrics: - type: acc value: 85.5 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Medical Genetics type: cais/mmlu metrics: - type: acc value: 75.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Miscellaneous type: cais/mmlu metrics: - type: acc value: 79.7 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Nutrition type: cais/mmlu metrics: - type: acc value: 74.8 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Professional Accounting type: cais/mmlu metrics: - type: acc value: 55.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Professional Medicine type: cais/mmlu metrics: - type: acc value: 71.7 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Virology type: cais/mmlu metrics: - type: acc value: 53.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Social Sciences type: cais/mmlu metrics: - type: acc value: 78.4 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Econometrics type: cais/mmlu metrics: - type: acc value: 64.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Geography type: cais/mmlu metrics: - type: acc value: 84.3 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Government And Politics type: cais/mmlu metrics: - type: acc value: 87.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Macroeconomics type: cais/mmlu metrics: - type: acc value: 74.6 name: accuracy - task: type: text-generation name: 
Text Generation dataset: name: Mmlu High School Microeconomics type: cais/mmlu metrics: - type: acc value: 80.7 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Psychology type: cais/mmlu metrics: - type: acc value: 87.2 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Human Sexuality type: cais/mmlu metrics: - type: acc value: 75.6 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Professional Psychology type: cais/mmlu metrics: - type: acc value: 71.2 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Public Relations type: cais/mmlu metrics: - type: acc value: 71.8 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Security Studies type: cais/mmlu metrics: - type: acc value: 74.3 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Sociology type: cais/mmlu metrics: - type: acc value: 84.1 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Us Foreign Policy type: cais/mmlu metrics: - type: acc value: 81.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Stem type: cais/mmlu metrics: - type: acc value: 68.1 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Abstract Algebra type: cais/mmlu metrics: - type: acc value: 45.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Anatomy type: cais/mmlu metrics: - type: acc value: 61.5 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Astronomy type: cais/mmlu metrics: - type: acc value: 78.9 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu College Biology type: cais/mmlu metrics: - type: acc value: 83.3 name: accuracy - task: type: text-generation name: Text Generation 
dataset: name: Mmlu College Chemistry type: cais/mmlu metrics: - type: acc value: 54.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu College Computer Science type: cais/mmlu metrics: - type: acc value: 69.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu College Mathematics type: cais/mmlu metrics: - type: acc value: 58.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu College Physics type: cais/mmlu metrics: - type: acc value: 53.9 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Computer Security type: cais/mmlu metrics: - type: acc value: 80.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Conceptual Physics type: cais/mmlu metrics: - type: acc value: 77.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Electrical Engineering type: cais/mmlu metrics: - type: acc value: 76.6 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Elementary Mathematics type: cais/mmlu metrics: - type: acc value: 65.6 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Biology type: cais/mmlu metrics: - type: acc value: 86.1 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Chemistry type: cais/mmlu metrics: - type: acc value: 70.4 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Computer Science type: cais/mmlu metrics: - type: acc value: 86.0 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Mathematics type: cais/mmlu metrics: - type: acc value: 42.6 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu High School Physics type: cais/mmlu metrics: - type: acc value: 62.9 name: accuracy - 
task: type: text-generation name: Text Generation dataset: name: Mmlu High School Statistics type: cais/mmlu metrics: - type: acc value: 71.3 name: accuracy - task: type: text-generation name: Text Generation dataset: name: Mmlu Machine Learning type: cais/mmlu metrics: - type: acc value: 57.1 name: accuracy pipeline_tag: text-generation library_name: transformers license: apache-2.0 license_link: https://huggingface.co/hadadxyz/Qwen3-4B-Diversity/blob/main/LICENSE ---

# Introduction

![MMLU](evaluations/mmlu.png)

Qwen3-4B-Diversity is a fine-tuned language model based on [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B), trained on a diverse collection of high-quality reasoning datasets. It combines knowledge distilled from various state-of-the-art AI systems to provide enhanced reasoning capabilities across multiple domains, including mathematics, coding, general problem-solving, and multi-turn conversations.

### Training Configuration

The model was trained with supervised fine-tuning (SFT) using parameter-efficient (PEFT) methods to optimize performance while maintaining computational efficiency. Key training parameters include:

| Parameter        | Value         |
|------------------|---------------|
| Number of Epochs | 2             |
| Context Length   | 40,960 tokens |

### Hardware and Resources

| Resource          | Specification          |
|-------------------|------------------------|
| GPU               | A100-80GB              |
| Training Duration | Approximately 17 hours |
| Estimated Cost    | $27 to $30             |

### Training Data

| Dataset | Rows Used | Source Model |
|---------|-----------|--------------|
| [ianncity/KIMI-K2.5-550000x](https://huggingface.co/datasets/ianncity/KIMI-K2.5-550000x) (General-Distillation) | 1,000 | Kimi K2.5 |
| [Jackrong/Qwen3.5-reasoning-700x](https://huggingface.co/datasets/Jackrong/Qwen3.5-reasoning-700x) | 633 | Qwen3.5 |
| [nohurry/Opus-4.6-Reasoning-3000x-filtered](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) | 2,326 | Claude Opus 4.6 |
| [TeichAI/claude-4.5-opus-high-reasoning-250x](https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x) | 250 | Claude Opus 4.5 |
| [TeichAI/gemini-3-pro-preview-high-reasoning-250x](https://huggingface.co/datasets/TeichAI/gemini-3-pro-preview-high-reasoning-250x) | 248 | Gemini 3 Pro |
| [TeichAI/claude-haiku-4.5-high-reasoning-1700x](https://huggingface.co/datasets/TeichAI/claude-haiku-4.5-high-reasoning-1700x) | 1,688 | Claude Haiku 4.5 |
| [TeichAI/gpt-5.2-high-reasoning-250x](https://huggingface.co/datasets/TeichAI/gpt-5.2-high-reasoning-250x) | 249 | GPT-5.2 |
| [Roman1111111/gemini-3.1-pro-hard-high-reasoning](https://huggingface.co/datasets/Roman1111111/gemini-3.1-pro-hard-high-reasoning) | 3,150 | Gemini 3.1 Pro |
| [Jackrong/glm-4.7-multiturn-CoT](https://huggingface.co/datasets/Jackrong/glm-4.7-multiturn-CoT) | 5,090 | GLM-4.7 |
| [bmeyer2025/glm5-reasoning-traces](https://huggingface.co/datasets/bmeyer2025/glm5-reasoning-traces) | 1,744 | GLM-5 |
| [TeichAI/claude-sonnet-4.5-high-reasoning-250x](https://huggingface.co/datasets/TeichAI/claude-sonnet-4.5-high-reasoning-250x) | 247 | Claude Sonnet 4.5 |
| [TeichAI/deepseek-v3.2-speciale-openr1-math-3k](https://huggingface.co/datasets/TeichAI/deepseek-v3.2-speciale-openr1-math-3k) | 3,317 | DeepSeek V3.2-Speciale |
| [TeichAI/deepseek-v3.2-speciale-OpenCodeReasoning-3k](https://huggingface.co/datasets/TeichAI/deepseek-v3.2-speciale-OpenCodeReasoning-3k) | 2,953 | DeepSeek V3.2-Speciale |
| [TeichAI/deepseek-v3.2-speciale-1000x](https://huggingface.co/datasets/TeichAI/deepseek-v3.2-speciale-1000x) | 991 | DeepSeek V3.2-Speciale |
| [TeichAI/gpt-5-codex-1000x](https://huggingface.co/datasets/TeichAI/gpt-5-codex-1000x) | 991 | GPT-5 Codex |
| **Total** | **24,877** | Combined diverse reasoning dataset |

## Model Capabilities

This model excels in several key areas:

1. **Advanced Reasoning**: The model can break down complex problems into steps and provide detailed reasoning processes.
2. **Mathematical Problem Solving**: Enhanced capabilities for mathematical reasoning and problem-solving through dedicated math-focused datasets.
3. **Code Generation and Understanding**: Improved coding abilities from multiple code-reasoning datasets, including DeepSeek and GPT-5 Codex data.
4. **Multi-Turn Conversations**: Better handling of extended dialogues and context-aware responses.
5. **Domain Versatility**: Exposure to reasoning patterns from various AI systems provides flexibility across different domains and task types.

## Usage

### Quick Demo

For a quick demo at no cost, you can use [Google Colab](https://colab.research.google.com/drive/1qy1n9MigDuwT0cA1Y6ImHChAIlsZPIcC).

### Ollama (Local)

```bash
# https://ollama.com/hadad/qwen3-4bd
# hadad/qwen3-4bd:Q8_0 | 4.3GB
# hadad/qwen3-4bd:BF16 | 8.1GB
# ollama pull hadad/qwen3-4bd:Q8_0
ollama run hadad/qwen3-4bd:Q8_0
```

If you are using Ollama and are interested in **tools** or **function calling**, it is recommended to use the **OpenAI-compatible API** provided by Ollama. This approach is more powerful. Refer to the [Ollama documentation](https://docs.ollama.com/api/openai-compatibility).

### Python (Local)

```bash
# pip install transformers==4.56.2
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "hadadxyz/Qwen3-4B-Diversity"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parsing thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
```

## Inference Parameters

For optimal results, we recommend the following generation parameters:

### Thinking

| Parameter   | Recommended Value | Description                       |
|-------------|-------------------|-----------------------------------|
| temperature | 0.6               | Controls randomness in generation |
| top_p       | 0.95              | Nucleus sampling threshold        |
| top_k       | 20                | Top-k sampling parameter          |
| min_p       | 0                 | Minimum probability threshold     |

### Non-Thinking

| Parameter   | Recommended Value | Description                       |
|-------------|-------------------|-----------------------------------|
| temperature | 0.7               | Controls randomness in generation |
| top_p       | 0.8               | Nucleus sampling threshold        |
| top_k       | 20                | Top-k sampling parameter          |
| min_p       | 0                 | Minimum probability threshold     |

## Citation

If you use this model in your research or applications, please cite both this model and the base model:

```bibtex
@misc{qwen3-4b-diversity,
  author = {hadadxyz},
  title  = {Qwen3-4B-Diversity},
  year   = {2026},
  url    = {https://huggingface.co/hadadxyz/Qwen3-4B-Diversity}
}
```

## Acknowledgments

This model was made possible through the combination of multiple high-quality datasets from the community. We acknowledge and thank all dataset creators and the Qwen team for providing the excellent base model.
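
## Example: Applying the Inference Parameters

The Python (Local) example calls `model.generate` without explicit sampling settings. As a minimal sketch, the values from the Thinking and Non-Thinking tables in the Inference Parameters section could be collected into keyword arguments like this. The helper name `sampling_kwargs` and the two constants are hypothetical and not part of this repository; `do_sample=True` is added on the assumption that sampling should be enabled so the values take effect.

```python
from typing import Any, Dict

# Values copied from the "Thinking" and "Non-Thinking" tables in this card.
THINKING_SAMPLING: Dict[str, Any] = {
    "temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0.0,
}
NON_THINKING_SAMPLING: Dict[str, Any] = {
    "temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0.0,
}


def sampling_kwargs(enable_thinking: bool) -> Dict[str, Any]:
    """Hypothetical helper: build sampling kwargs for `model.generate`."""
    params = THINKING_SAMPLING if enable_thinking else NON_THINKING_SAMPLING
    # Assumption: sampling must be switched on for temperature, top_p,
    # top_k and min_p to influence generation.
    return {"do_sample": True, **params}
```

With the `model` and `model_inputs` objects from the Python (Local) example, this might be used as `model.generate(**model_inputs, max_new_tokens=32768, **sampling_kwargs(True))`, keeping the flag consistent with the `enable_thinking` value passed to `apply_chat_template`.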