metadata
language:
  - ar
  - en
tags:
  - medical
  - moe
  - reasoning
  - bilingual
license: apache-2.0
library_name: transformers
metrics:
  - accuracy

🩺 Bilingual Medical Reasoning MoE

A specialized 1.5B-parameter Mixture-of-Experts (MoE) Transformer optimized for Arabic-English clinical reasoning and medical decision support.

🏗️ Model Architecture

  • Parameters: 1.5B total, ~68M active per token.
  • Structure: 6 layers, 8 heads per layer, Grouped-Query Attention (GQA).
  • MoE System: 4 experts per FFN layer with Top-2 active routing.
  • Reasoning: Native support for Chain-of-Thought (CoT) using <|think|> tags.
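
The Top-2 routing over 4 experts described above can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the function names `top2_route` and `moe_forward` are hypothetical, the toy experts are simple scaling functions, and renormalizing the softmax over only the two selected experts is an assumption about how the gate weights are computed.

```python
import math

def top2_route(router_logits):
    """Return [(expert_index, weight), (expert_index, weight)] for one token."""
    # Indices of the two highest router logits (ties keep the lower index first).
    top2 = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:2]
    # Assumption: softmax is taken over the selected logits only,
    # so the two gate weights sum to 1.
    m = max(router_logits[i] for i in top2)
    exps = [math.exp(router_logits[i] - m) for i in top2]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top2, exps)]

def moe_forward(x, experts, router_logits):
    """Weighted sum of the two selected experts' outputs for token vector x."""
    out = [0.0] * len(x)
    for idx, weight in top2_route(router_logits):
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + weight * v for o, v in zip(out, y)]
    return out

# Four toy "experts": each scales the input by a different constant.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]

routes = top2_route(logits)
output = moe_forward([1.0, 1.0], experts, logits)
```

Because only two of the four experts run for each token, the compute per token is lower than the total parameter count would suggest, which is the motivation for quoting total and active parameters separately.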

🚀 Usage

This model is designed to be used with the custom DeepThinkingModel architecture defined in this repository.

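# model.py is the custom architecture file shipped in this repository;
# it must be importable (for example, run from the repository root).
from model import DeepThinkingModel
import torch

# Load the pretrained weights from the Hugging Face Hub.
model = DeepThinkingModel.from_pretrained("gijl/Bilingual-Medical-Reasoning-MoE")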
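
Once text has been generated, the Chain-of-Thought portion wrapped in <|think|> tags can be separated from the final answer. The helper below is a minimal sketch that works on plain strings only: `split_reasoning` is a hypothetical name, and the closing tag `<|/think|>` is an assumed spelling that should be checked against the repository's tokenizer.

```python
def split_reasoning(text, open_tag="<|think|>", close_tag="<|/think|>"):
    """Split generated text into (reasoning, answer).

    close_tag is an assumed spelling; adjust it to match the tokenizer.
    """
    start = text.find(open_tag)
    if start == -1:
        # No reasoning block: the whole text is the answer.
        return "", text.strip()
    body_start = start + len(open_tag)
    end = text.find(close_tag, body_start)
    if end == -1:
        # Unterminated block: treat everything after the tag as reasoning.
        return text[body_start:].strip(), text[:start].strip()
    reasoning = text[body_start:end].strip()
    answer = (text[:start] + text[end + len(close_tag):]).strip()
    return reasoning, answer

# Placeholder string standing in for model output.
sample = "<|think|>step 1; step 2<|/think|>final answer"
reasoning, answer = split_reasoning(sample)
```

Keeping the tags as parameters makes it easy to swap in the actual delimiters if they differ from the assumed ones.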

📊 Training Data

  • AceGPT-Instruction (Specialized Arabic instructions)
  • Helsinki-NLP OPUS-100 (Bilingual translation & reasoning)
  • OASST1 (Conversational grounding)