shaqaqio committed · verified · Commit 91a459e · 1 Parent(s): a18d900

Upload 3 files
# Arabic End-of-Utterance (EOU) Classifier

## Overview
This repository contains a custom PyTorch model for **End-of-Utterance (EOU) detection** in Arabic conversational text.
The model predicts whether a given text segment represents the end of a speaker’s turn.

This is a **custom architecture** (not a Hugging Face `AutoModel`) and is intended for research and development use.

---

## Task
Given an input text segment, the model outputs a binary prediction:

- `0` → The speaker is expected to continue speaking
- `1` → The speaker has finished their turn
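For example, raw logits from the classifier map to these labels via an argmax; the logit values below are illustrative only, not real model output:

```python
import torch

# Hypothetical logits for two segments (batch of 2, num_labels = 2);
# the values are made up for illustration, not produced by the model.
logits = torch.tensor([[2.1, -0.3],   # favors 0 → speaker continues
                       [-1.0, 1.7]])  # favors 1 → end of utterance

preds = logits.argmax(dim=-1)          # hard labels: 0 or 1
probs = torch.softmax(logits, dim=-1)  # per-class probabilities
```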

---

## Model Details
- Framework: PyTorch
- Architecture: Custom `EOUClassifier`
- Task: Binary classification (EOU detection)
- Language: Arabic

---

## Tokenizer
This model uses the tokenizer from:

`Omartificial-Intelligence-Space/SA-BERT-V1`

The tokenizer is **not included** in this repository and must be loaded separately.

---

## Files
- `model.py` — Model architecture (`EOUClassifier`)
- `model.pt` — Trained model weights
- `config.json` — Model configuration
- `README.md` — This file

---

## Loading the Model
```python
import torch
from transformers import AutoTokenizer
from model import EOUClassifier

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(
    "Omartificial-Intelligence-Space/SA-BERT-V1"
)

model = EOUClassifier()
model.load_state_dict(
    torch.load("model.pt", map_location="cpu")
)
model.to(device)
model.eval()

examples = ["مقصدي من الموضوع انه", "اتمنى تقدر تساعدني"]

batch = tokenizer(examples, padding=True, truncation=True, return_tensors="pt")
batch = batch.to(device)

with torch.no_grad():
    out = model(batch["input_ids"], batch["attention_mask"])

predictions = out["logits"].argmax(dim=-1)  # 0 = continue, 1 = end of turn
```

## Intended Use

- End-of-turn detection

- Streaming conversational agents

- Dialogue systems

- Real-time response timing control
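For real-time use, a common pattern is to threshold the EOU probability before the agent starts speaking. A minimal sketch, assuming a hypothetical `should_respond` helper and a hypothetical threshold value (neither is part of this repo):

```python
import torch

def should_respond(logits: torch.Tensor, threshold: float = 0.8) -> bool:
    """Respond only when P(EOU) is high enough to avoid interrupting.

    `threshold` is a hypothetical tuning knob, not part of this repo.
    """
    p_eou = torch.softmax(logits, dim=-1)[1].item()  # probability of class 1
    return p_eou >= threshold

# Illustrative logits for a single segment, not real model output.
confident_eou = torch.tensor([-2.0, 3.0])   # P(EOU) ≈ 0.99 → respond
likely_continuing = torch.tensor([1.5, -0.5])  # P(EOU) ≈ 0.12 → wait
```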

## Notes

- This model requires the architecture code (`model.py`) to run.

- The architecture used at inference must exactly match the one used during training.
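The strict key matching in PyTorch's `load_state_dict` is what surfaces such mismatches at load time. A minimal sketch with hypothetical toy modules (not the real classifier):

```python
import torch.nn as nn

# Two toy modules whose parameter names differ — hypothetical, for illustration.
class TwoLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(4, 2)

class Mismatched(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_renamed = nn.Linear(4, 2)

state = TwoLayer().state_dict()
try:
    Mismatched().load_state_dict(state)  # strict=True by default
    caught_mismatch = False
except RuntimeError:
    # Missing/unexpected keys raise rather than silently loading wrong weights.
    caught_mismatch = True
```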

## License

MIT License

Files changed (3)

1. config.json (+8 −0)
2. model.pt (+3 −0)
3. model.py (+46 −0)
config.json ADDED

```json
{
  "model_name": "EOUClassifier",
  "task": "end_of_utterance_detection",
  "num_labels": 2,
  "language": "ar",
  "base_tokenizer": "Omartificial-Intelligence-Space/SA-BERT-V1",
  "framework": "pytorch"
}
```
model.pt ADDED

```
version https://git-lfs.github.com/spec/v1
oid sha256:b0cc3db32f144dbe5183a5c1b071fd0f09530a42ae1c3ef5874a288c177b4488
size 652634989
```
model.py ADDED

```python
import torch
import torch.nn as nn
from transformers import AutoModel

MODEL_ID = "Omartificial-Intelligence-Space/SA-BERT-V1"

class EOUClassifier(nn.Module):
    def __init__(self, model_id=MODEL_ID, num_labels=2, use_class_weights=True, pooling="cls"):
        super().__init__()
        self.num_labels = num_labels
        self.pooling = pooling  # "cls" or "mean"

        # Load encoder
        self.bert = AutoModel.from_pretrained(model_id)

        self.dropout = nn.Dropout(0.15)
        self.layer_1 = nn.Linear(768, 384)
        self.act = nn.GELU()
        self.layer_2 = nn.Linear(384, num_labels)

        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, labels=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)

        if self.pooling == "cls":
            pooled = outputs.last_hidden_state[:, 0]  # [CLS]
        else:
            # Mean pooling
            hidden = outputs.last_hidden_state
            mask = attention_mask.unsqueeze(-1)
            pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

        x = self.dropout(pooled)
        x = self.layer_1(x)
        x = self.act(x)
        x = self.dropout(x)
        logits = self.layer_2(x)

        if labels is not None:
            loss = self.loss_fn(logits, labels)
            return {"loss": loss, "logits": logits}

        return {"logits": logits}
```
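As a sanity check, the masked mean-pooling branch above can be reproduced on toy tensors (hypothetical values, hidden size 2, sequence length 3):

```python
import torch

# Toy hidden states: (batch=1, seq=3, dim=2). Values are illustrative only.
hidden = torch.tensor([[[1.0, 2.0],
                        [3.0, 4.0],
                        [5.0, 6.0]]])
attention_mask = torch.tensor([[1, 1, 0]])  # last position is padding

# Same arithmetic as the "mean" branch in EOUClassifier.forward:
mask = attention_mask.unsqueeze(-1)                      # (1, 3, 1)
pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
# Only the two unmasked positions contribute: ([1,2] + [3,4]) / 2 = [2, 3]
```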