# Arabic End-of-Utterance (EOU) Classifier
## Overview
This repository contains a custom PyTorch model for **End-of-Utterance (EOU) detection** in Arabic conversational text.
The model predicts whether a given text segment represents the end of a speaker’s turn.
This is a **custom architecture** (not a Hugging Face `AutoModel`) and is intended for research and development use.
---
## Task
Given an input text segment, the model outputs a binary prediction:
- `0` → The speaker is expected to continue speaking
- `1` → The speaker has finished their turn
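As a small sketch of how these two labels fall out of the model's raw logits (assuming the `{"logits": ...}` output dict used in `model.py`; the logit values below are made up for illustration):

```python
import torch

# Hypothetical logits for a batch of two segments.
logits = torch.tensor([[2.1, -0.4],    # higher score for label 0: speaker continues
                       [-1.0, 1.7]])   # higher score for label 1: end of turn

preds = logits.argmax(dim=-1)
print(preds.tolist())  # [0, 1]
```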
---
## Model Details
- Framework: PyTorch
- Architecture: Custom `EOUClassifier`
- Task: Binary classification (EOU detection)
- Language: Arabic
---
## Tokenizer
This model uses the tokenizer from:
`Omartificial-Intelligence-Space/SA-BERT-V1`
The tokenizer is **not included** in this repository and must be loaded separately.
---
## Files
- `model.py` — Model architecture (`EOUClassifier`)
- `model.pt` — Trained model weights
- `config.json` — Model configuration
- `README.md` — This file
---
## Loading the Model
```python
import torch
from transformers import AutoTokenizer
from model import EOUClassifier

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(
    "Omartificial-Intelligence-Space/SA-BERT-V1"
)

model = EOUClassifier()
model.load_state_dict(
    torch.load("model.pt", map_location="cpu")
)
model.to(device)
model.eval()

# "What I mean by this is that..." (incomplete) / "I hope you can help me" (complete)
examples = ["مقصدي من الموضوع انه", "اتمنى تقدر تساعدني"]
batch = tokenizer(examples, padding=True, truncation=True, return_tensors="pt")
batch = {k: v.to(device) for k, v in batch.items()}

with torch.no_grad():
    out = model(batch["input_ids"], batch["attention_mask"])
preds = out["logits"].argmax(dim=-1)  # 0 = continuing, 1 = end of turn
```
## Intended Use
- End-of-turn detection
- Streaming conversational agents
- Dialogue systems
- Real-time response timing control
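For the streaming and response-timing use cases, a minimal sketch of how an EOU prediction might gate when an agent takes the turn (the `classify_eou` callable is hypothetical and stands in for a call to the model above; the short-segment guard is an assumption, not part of this repo):

```python
def should_respond(segment_text, classify_eou, min_words=2):
    """Decide whether the agent should take the turn.

    classify_eou: callable returning 1 (end of turn) or 0 (still speaking).
    A short-segment guard avoids reacting to one-word fragments.
    """
    if len(segment_text.split()) < min_words:
        return False
    return classify_eou(segment_text) == 1

# Usage with a stub classifier that treats segments ending in "?" as complete.
stub = lambda text: 1 if text.endswith("?") else 0
print(should_respond("can you help me?", stub))  # True
print(should_respond("so", stub))                # False (too short)
```

In a real pipeline, `classify_eou` would tokenize the segment and run the loaded `EOUClassifier` as shown in the loading example.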
## Notes
- This model requires the architecture code (model.py) to run.
- The architecture used at inference must exactly match the one used during training.
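One generic way to surface an architecture mismatch early (a standard PyTorch pattern, not specific to this repo; the tiny modules are illustrative only):

```python
import torch
import torch.nn as nn

class TinyA(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

class TinyB(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(4, 2)  # different attribute name -> different state_dict keys

state = TinyA().state_dict()
# strict=False reports the mismatch instead of raising, which helps debugging;
# the default strict=True would raise a RuntimeError here.
missing, unexpected = TinyB().load_state_dict(state, strict=False)
print(sorted(missing))     # ['head.bias', 'head.weight']
print(sorted(unexpected))  # ['fc.bias', 'fc.weight']
```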
## License
MIT License
---
## Appendix: config.json
```json
{
  "model_name": "EOUClassifier",
  "task": "end_of_utterance_detection",
  "num_labels": 2,
  "language": "ar",
  "base_tokenizer": "Omartificial-Intelligence-Space/SA-BERT-V1",
  "framework": "pytorch"
}
```
## Appendix: model.pt
`model.pt` is stored with Git LFS (approximately 653 MB); its pointer file is:
```
version https://git-lfs.github.com/spec/v1
oid sha256:b0cc3db32f144dbe5183a5c1b071fd0f09530a42ae1c3ef5874a288c177b4488
size 652634989
```
## Appendix: model.py
```python
import torch
import torch.nn as nn
from transformers import AutoModel

MODEL_ID = "Omartificial-Intelligence-Space/SA-BERT-V1"


class EOUClassifier(nn.Module):
    def __init__(self, model_id=MODEL_ID, num_labels=2, use_class_weights=True, pooling="cls"):
        super().__init__()
        self.num_labels = num_labels
        self.pooling = pooling  # "cls" or "mean"

        # Load the pretrained Arabic BERT encoder
        self.bert = AutoModel.from_pretrained(model_id)

        # Classification head: 768 -> 384 -> num_labels
        self.dropout = nn.Dropout(0.15)
        self.layer_1 = nn.Linear(768, 384)
        self.act = nn.GELU()
        self.layer_2 = nn.Linear(384, num_labels)

        # Note: use_class_weights is accepted but currently unused.
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, labels=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)

        if self.pooling == "cls":
            pooled = outputs.last_hidden_state[:, 0]  # [CLS] token
        else:
            # Mean pooling over non-padding tokens
            hidden = outputs.last_hidden_state
            mask = attention_mask.unsqueeze(-1)
            pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

        x = self.dropout(pooled)
        x = self.layer_1(x)
        x = self.act(x)
        x = self.dropout(x)
        logits = self.layer_2(x)

        if labels is not None:
            loss = self.loss_fn(logits, labels)
            return {"loss": loss, "logits": logits}

        return {"logits": logits}
```