
Bangla MixLoRA — Phase B (Instruction Tuned)

Model Description

The first MoE-style instruction-tuned LLM for Bangla, adapted with MixLoRA parameter-efficient fine-tuning.

Training Details

  • Base model: mistralai/Mistral-7B-v0.1
  • Phase A: MixLoRA continued pre-training (CPT) on 175K Bangla samples (~82M tokens)
  • Phase B: supervised fine-tuning (SFT) on 80K Bangla instruction pairs
  • Framework: MoE-PEFT v2.0.2 (an illustrative adapter config is sketched after this list)
  • Hardware: AMD MI300X VF (205.8 GB VRAM)
  • Precision: bf16
  • Final loss: 0.1721
  • Date: 2026-03-22
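
For orientation, a MixLoRA adapter configuration for a run like this might look roughly like the sketch below. Only the expert count, top-2 routing, and rank 16 come from this card (see Architecture); the key names follow common MixLoRA conventions and are assumptions, not the exact MoE-PEFT v2.0.2 schema.

# Illustrative adapter config; key names are assumptions and may differ
# from the exact MoE-PEFT v2.0.2 schema.
mixlora_config = {
    "routing_strategy": "mixlora",  # MoE-style routing over LoRA experts
    "num_experts": 4,               # experts in the middle layers
    "top_k": 2,                     # top-2 routing
    "r": 16,                        # LoRA rank, also used in front/back layers
    "lora_alpha": 32,               # illustrative scaling value
    "lora_dropout": 0.05,           # illustrative
}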

Architecture

  • Base: Mistral-7B frozen
  • Front layers (1-8): Single LoRA (rank 16)
  • Middle layers (9-24): MixLoRA — 4 experts, top-2 routing (see the toy sketch after this list)
  • Back layers (25-32): Single LoRA (rank 16)
  • Trainable params: 120,586,240 (1.66%)
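
The middle-layer routing can be illustrated with a small self-contained PyTorch sketch. This is a toy re-implementation for intuition, not the MoE-PEFT code path: a frozen linear layer plus four rank-16 LoRA experts, with a softmax router that keeps the top-2 experts per token. The class name and dimensions are invented for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MixLoraFFN(nn.Module):
    # Toy MixLoRA layer: a frozen linear plus LoRA experts chosen per token
    # by a learned router. Real implementations attach this to the FFN and
    # dispatch only the routed tokens; this dense version is for clarity.
    def __init__(self, d_model=64, rank=16, num_experts=4, top_k=2):
        super().__init__()
        self.base = nn.Linear(d_model, d_model, bias=False)
        self.base.weight.requires_grad_(False)        # base weights stay frozen
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.lora_a = nn.ParameterList(
            [nn.Parameter(torch.randn(rank, d_model) * 0.01) for _ in range(num_experts)]
        )
        self.lora_b = nn.ParameterList(
            [nn.Parameter(torch.zeros(d_model, rank)) for _ in range(num_experts)]
        )
        self.top_k = top_k

    def forward(self, x):                             # x: (num_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)      # (num_tokens, num_experts)
        weight, index = gate.topk(self.top_k, dim=-1)
        weight = weight / weight.sum(-1, keepdim=True)  # renormalize top-2 weights
        out = self.base(x)
        for e in range(len(self.lora_a)):
            # Rank-16 update from expert e; the gate is zero for tokens whose
            # top-2 does not include e, so those tokens are unaffected.
            delta = x @ self.lora_a[e].T @ self.lora_b[e].T
            w_e = (weight * (index == e)).sum(-1, keepdim=True)
            out = out + w_e * delta
        return out

layer = MixLoraFFN()
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])

The reported trainable fraction is consistent with Mistral-7B's roughly 7.24B parameters: 120,586,240 / 7.24e9 ≈ 1.66%.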

Benchmark Results

BanglaMMLU

  • accuracy: 0.2969
  • correct: 38
  • total: 128
  • skipped: 72

Indic-Squad-QA

  • exact_match: 0.302
  • correct: 29
  • total: 96
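
Both scores follow from the raw counts above. Note that the 72 skipped BanglaMMLU items stay in the denominator (38/128 ≈ 0.2969, versus 38/56 ≈ 0.679 if skips were excluded), so skips count against accuracy. A quick check:

# Reproduce the reported scores from the raw counts.
print(f"BanglaMMLU accuracy: {38 / 128:.4f}")  # 0.2969 (skips count as wrong)
print(f"Indic-Squad-QA EM:   {29 / 96:.3f}")   # 0.302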

Dataset

  • Pretraining: sahilfarib/bangla-pretraining-corpus-clean (700K samples, ~2B tokens)
  • SFT: sahilfarib/bangla-mixlora-real/bangla_sft_80k.jsonl (80K pairs)
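
Assuming both dataset repos are public on the Hugging Face Hub, they can be loaded with the datasets library; the split name below is the usual default and is an assumption.

from datasets import load_dataset

# Phase A corpus (700K samples, ~2B tokens)
pretrain = load_dataset("sahilfarib/bangla-pretraining-corpus-clean", split="train")

# Phase B instruction pairs (80K), stored as a JSONL file in the dataset repo
sft = load_dataset(
    "sahilfarib/bangla-mixlora-real",
    data_files="bangla_sft_80k.jsonl",
    split="train",
)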

Usage

from mixlora import MixLoraModelForCausalLM
from transformers import AutoTokenizer
import torch

# Load the frozen Mistral-7B base together with the MixLoRA adapters.
model, config = MixLoraModelForCausalLM.from_pretrained(
    "sahilfarib/bangla-mixlora-sft",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# The adapters do not change the vocabulary, so the base tokenizer is reused.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

prompt = "বাংলাদেশের রাজধানী কোথায়?"  # "Where is the capital of Bangladesh?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
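
generate defaults to greedy decoding, which can be repetitive on open-ended prompts. A sampled variant is shown below; the temperature and top_p values are illustrative, not tuned for this model.

# Sampled decoding; hyperparameters are illustrative.
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))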

Citation

@misc{bangla-mixlora-2026,
  title={BanglaMixLoRA: Parameter-Efficient MoE Adaptation for Bangla},
  author={Sahil Farib},
  year={2026}
}