---
language:
- ar
- en
tags:
- medical
- moe
- reasoning
- bilingual
license: apache-2.0
library_name: transformers
metrics:
- accuracy
---
# Bilingual Medical Reasoning MoE

A specialized **1.5B Mixture-of-Experts (MoE)** Transformer model optimized for Arabic-English clinical reasoning and medical decision support.
### Model Architecture
- **Parameters:** 1.5B (Total), ~68M (Active per token).
- **Structure:** 6 layers, 8 heads per layer, Grouped-Query Attention (GQA).
- **MoE System:** 4 experts per FFN layer with Top-2 active routing.
- **Reasoning:** Native support for Chain-of-Thought (CoT) using `<|think|>` tags.
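
To make the "4 experts, Top-2 routing" item above concrete, here is a minimal pure-Python sketch of how a top-2 router can pick and weight experts for a single token. It is an illustration only, not this model's implementation: the function names (`top2_route`, `moe_ffn`), the toy scaling "experts", and the choice to renormalize the two selected weights so they sum to 1 are all assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top2_route(router_logits):
    """Return [(expert_index, weight), ...] for the two highest-scoring experts.

    Assumption: the two selected probabilities are renormalized to sum to 1.
    """
    probs = softmax(router_logits)
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    return [(i, probs[i] / norm) for i in top2]

def moe_ffn(x, router_logits, experts):
    """Weighted sum of the outputs of the two selected experts."""
    out = [0.0] * len(x)
    for idx, weight in top2_route(router_logits):
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out

# Four toy "experts" that just scale their input (stand-ins for real FFNs).
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical router scores for one token; experts 1 and 3 score highest.
routes = top2_route([0.1, 2.0, -1.0, 2.0])
mixed = moe_ffn([1.0, 2.0], [0.1, 2.0, -1.0, 2.0], experts)
```

Because only two of the four experts run for each token, the number of parameters active per token is smaller than the total parameter count.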
### Usage
This model is designed to be used with the custom `DeepThinkingModel` architecture defined in this repository, so the import below only resolves when the repository's `model` module is on your Python path.

```python
import torch

# DeepThinkingModel is the custom architecture defined in this repository
from model import DeepThinkingModel

model = DeepThinkingModel.from_pretrained("gijl/Bilingual-Medical-Reasoning-MoE")
```
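
Since `DeepThinkingModel` is imported from a `model` module in the repository rather than from an installed package, one way to make that import resolve is to download the repository files and put them on `sys.path`. The sketch below assumes that `huggingface_hub` is installed and that `model.py` sits at the repository root. The helper names `add_repo_to_path` and `load_model` are hypothetical.

```python
import sys
from pathlib import Path

REPO_ID = "gijl/Bilingual-Medical-Reasoning-MoE"

def add_repo_to_path(repo_dir):
    """Put a local copy of the repo on sys.path so `from model import ...` resolves."""
    repo_dir = str(Path(repo_dir).resolve())
    if repo_dir not in sys.path:
        sys.path.insert(0, repo_dir)
    return repo_dir

def load_model(repo_id=REPO_ID):
    """Download the repo files, then load the model as shown in the snippet above."""
    # Imported lazily so add_repo_to_path works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(repo_id=repo_id)  # fetches and caches the repo
    add_repo_to_path(local_dir)

    # Assumption: model.py is at the repository root.
    from model import DeepThinkingModel

    return DeepThinkingModel.from_pretrained(repo_id)

# Usage (requires network access and the model's dependencies):
# model = load_model()
```

Cloning the repository and running your script from inside it should work the same way, since the current directory is then on the import path.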
### Training Data
- **AceGPT-Instruction** (Specialized Arabic instructions)
- **Helsinki-NLP OPUS-100** (Bilingual translation & reasoning)
- **Oasst1** (Conversational grounding)