language:
  - ar
  - en
tags:
  - medical
  - moe
  - reasoning
  - bilingual
license: apache-2.0
library_name: transformers
metrics:
  - accuracy

🩺 Bilingual Medical Reasoning MoE

A specialized 1.5B Mixture-of-Experts (MoE) Transformer model optimized for Arabic-English clinical reasoning and medical decision support.

Parameters: 1.5B total, ~68M active per token.
Structure: 6 Transformer layers, 8 attention heads per layer, Grouped-Query Attention (GQA).
MoE System: 4 experts per FFN layer with top-2 routing (2 experts active per token).
Reasoning: Native support for Chain-of-Thought (CoT) reasoning using <|think|> tags.
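
To make the top-2 routing over 4 experts concrete, here is a minimal pure-Python sketch of how a router might select and weight experts for a single token. It is an illustration under stated assumptions, not the code in this repository: the function names `top2_route` and `moe_ffn`, the toy experts, and the choice to renormalize the two kept weights so they sum to 1 are all hypothetical.

```python
import math


def top2_route(router_logits):
    """Pick the two highest-scoring experts and renormalize their weights.

    router_logits: one score per expert (hypothetical router output).
    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    # Softmax over all experts (subtract the max for numerical stability).
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the two most probable experts (ties keep the lower index first).
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]

    # Assumption: renormalize so the two kept weights sum to 1.
    kept = sum(probs[i] for i in top2)
    return [(i, probs[i] / kept) for i in top2]


def moe_ffn(token, experts, router_logits):
    """Weighted sum of the two selected experts' outputs for one token."""
    out = [0.0] * len(token)
    for idx, weight in top2_route(router_logits):
        expert_out = experts[idx](token)  # only 2 of the 4 experts run
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out


# Toy experts: each one scales the token vector by a different constant.
experts = [lambda v, s=s: [s * x for x in v] for s in (1.0, 2.0, 3.0, 4.0)]
```

Because only the two selected experts are evaluated for each token, the number of parameters touched per token is much smaller than the total parameter count, which is the general reason an MoE model reports separate "total" and "active" figures.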

This model is designed to be used with the custom DeepThinkingModel architecture defined in this repository.

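import torch
from model import DeepThinkingModel  # custom architecture defined in this repository

# Load the pretrained weights from the Hugging Face Hub
model = DeepThinkingModel.from_pretrained("gijl/Bilingual-Medical-Reasoning-MoE")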
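
Since the model marks its chain of thought with <|think|> tags, a caller will often want to separate the reasoning from the final answer after decoding. The sketch below shows one way to do that on a plain string. It assumes the reasoning is closed by a `<|/think|>` tag, which this README does not specify, so check the tokenizer's special tokens and adjust `CLOSE_TAG` accordingly. The helper name `split_reasoning` and the sample string are hypothetical and are not real model output.

```python
import re

OPEN_TAG = "<|think|>"
CLOSE_TAG = "<|/think|>"  # assumption: the actual closing tag may differ


def split_reasoning(text, open_tag=OPEN_TAG, close_tag=CLOSE_TAG):
    """Return (reasoning, answer) extracted from decoded model text.

    If no complete tag pair is found, reasoning is "" and the whole
    text is returned as the answer.
    """
    pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first tagged span is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Hand-written placeholder string, not generated by the model.
sample = "<|think|>Step 1. Step 2.<|/think|>Final answer."
reasoning, answer = split_reasoning(sample)
```

Keeping the tags as parameters means the same helper still works if the repository turns out to use a different delimiter pair.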