# 🤖 GLMMC: Generalist and Lightweight Model for Multilabel Classification 

GLMMC is a Multilabel Classification Model capable of classifying texts into various predefined entities using a bidirectional transformer encoder (BERT-like). It provides a practical alternative to Large Language Models (LLMs), which, despite their flexibility, are costly and too large for resource-constrained scenarios.

### Usage
```python

from model import BiEncoderModel

texts = ["A celebrity chef has opened a new restaurant specializing in vegan cuisine.",
         "Doctors are warning about the rise in flu cases this season.",
         "The United States has announced plans to build a wall on its border with Mexico."]
batch_labels = [

        ["Food", "Business", "Politics"],
        ["Health", "Food", "Public Health"],
        ["Immigration", "Religion", "National Security"]
    ]
# Load the model
model = BiEncoderModel("sabdou/bi-encoder-model", max_num_labels=6)
# Prediction with JSON output
predictions = model.forward_predict(texts, batch_labels)
print("Predictions:", predictions)

```


#### Expected Output

```
Predictions: [
{'text': 'A celebrity chef has opened a new restaurant specializing in vegan cuisine.', 'scores': {'Food': 1.0, 'Business': 1.0, 'Politics': 0.0}},
{'text': 'Doctors are warning about the rise in flu cases this season.', 'scores': {'Health': 1.0, 'Food': 0.0, 'Public Health': 1.0}},
{'text': 'The United States has announced plans to build a wall on its border with Mexico.', 'scores': {'Immigration': 1.0, 'Religion': 0.0, 'National Security': 1.0}
]

```

#### Data

Synthetic data generated with gpt4-mini and gemini 

## Author 🧑‍💻
- [Salim ABDOU DAOURA](https://github.com/sabdoudaoura)