gating-bert-adaptroute

A 4-class DistilBERT sequence classifier that serves as the gating network for AdaptRoute. It assigns each input text to one of the four labels listed below.

Labels

ID  Label
0   code
1   math
2   qa
3   medical
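
To show how the label table is meant to be read, here is a minimal sketch that turns four raw classifier logits into a label and a confidence. The helper names (`softmax`, `decode`) and the sample logits are illustrative assumptions, not part of the released model.

```python
import math

# Label table from the model card: class index -> domain name.
ID2LABEL = {0: "code", 1: "math", 2: "qa", 3: "medical"}


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits, not real model output.
label, prob = decode([0.2, 3.1, -0.5, 0.4])
print(label, round(prob, 3))
```

A router built on this could fall back to a default expert when the returned probability is low; that threshold is a design choice the card does not specify.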

Architecture

  • Base: distilbert-base-uncased, fine-tuned with LoRA (adapter weights merged into the base model)
  • Training: 5 epochs, lr=0.0002
  • Model size: 67M params
  • Tensor type: F32
  • Weights format: Safetensors
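
The "LoRA merged" note on the base model can be read as the standard LoRA merge step, sketched below. The rank r, the scaling factor α, and the layers that carried adapters are not stated in this card, so the symbols are placeholders rather than the author's settings.

```latex
W_{\text{merged}} = W_0 + \frac{\alpha}{r}\, B A,
\qquad W_0 \in \mathbb{R}^{d \times k},\;
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k}
```

Because the low-rank update is folded into each adapted weight matrix, a merged checkpoint has the same tensor shapes as plain DistilBERT and needs no adapter library at inference time.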

Repository: kunjcr2/gating-bert-adaptroute
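
As a usage sketch, the function below loads the checkpoint and returns the raw class logits for one input. It assumes the repository id shown on the model page (`kunjcr2/gating-bert-adaptroute`) contains both tokenizer and model files and loads with the standard `transformers` auto classes; the card does not confirm this. The function name `gate_logits` is hypothetical.

```python
# Repository id as shown on the model page; assumed to be loadable from the Hub.
MODEL_ID = "kunjcr2/gating-bert-adaptroute"


def gate_logits(text, model_id=MODEL_ID):
    """Return the raw class logits for one input string as a list of floats."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)
    return logits[0].tolist()
```

Calling `gate_logits("Write a function that reverses a linked list.")` downloads the checkpoint on first use. The returned list should be indexed by the class IDs in the label table (0 code, 1 math, 2 qa, 3 medical). In a real router the tokenizer and model would be loaded once and reused rather than reloaded on every call.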