---
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
datasets:
- ag_news
language:
- en
tags:
- tiny
- bert
- text-classification
- ag-news
---
# TinyModel1

**TinyModel1** is a compact **encoder** model for **news topic classification**, trained from scratch on the [AG News](https://huggingface.co/datasets/fancyzhx/ag_news) dataset. It targets fast CPU/GPU inference, simple deployment behind a router or API, and use as a **baseline** before larger or domain-specific models.

## Links

- **Source code (train & export):** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
- **Live demo (Space):** [TinyModel1Space](https://huggingface.co/spaces/HyperlinksSpace/TinyModel1Space) (canonical Hub URL; avoids unreliable `*.hf.space` links)

---

## Model summary

| Field | Value |
|:--|:--|
| **Task** | Text classification (single-label, 4 classes) |
| **Labels** | World, Sports, Business, Sci/Tech |
| **Architecture** | Tiny BERT-style encoder (`BertForSequenceClassification`) |
| **Parameters** | 1,339,268 (~1.34M) |
| **Max sequence length** | 128 tokens (training & inference) |
| **Framework** | [Transformers](https://github.com/huggingface/transformers) · Safetensors |

---

## Model overview

This release keeps a **small footprint**, so you can run batch or interactive classification without heavy GPUs. Training uses a WordPiece tokenizer fit on the training split and a shallow BERT stack suited to short news sentences.

### Core capabilities

- **Topic routing** — assign one of four coarse news categories for search, feeds, or moderation triage.
- **Low latency** — the small parameter count keeps inference suitable for edge and serverless setups.
- **Fine-tuning base** — swap labels or add data for your domain while keeping the same architecture.
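The topic-routing capability above can be sketched as a small dispatch table over the four label strings. The queue names and the `route` helper below are hypothetical illustrations, not part of the model or Transformers:

```python
# Hypothetical dispatch table: map each predicted AG News label to a
# downstream queue name (the queue names here are purely illustrative).
ROUTES = {
    "World": "world-news-feed",
    "Sports": "sports-feed",
    "Business": "finance-review",
    "Sci/Tech": "tech-triage",
}

def route(predicted_label: str) -> str:
    """Return the destination for a label; unknown labels go to manual review."""
    return ROUTES.get(predicted_label, "manual-review")

print(route("Business"))  # finance-review
```

A fallback destination like `"manual-review"` keeps the router robust if the label set ever changes.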
---

## Training

| Setting | Value |
|:--|:--|
| **Train samples** | 3000 |
| **Eval samples** | 600 |
| **Epochs** | 2 |
| **Batch size** | 16 |
| **Learning rate** | 0.0001 |
| **Optimizer** | AdamW |

---

## Evaluation

| Metric | Value |
|:--|:--|
| **Eval accuracy** | 0.5583 |
| **Final train loss** | 1.1562 |

Metrics are computed on the held-out eval split configured above; treat them as a **sanity-check baseline**, not a production SLA.

---

## Getting started

### Inference with `transformers`

```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="TinyModel1",       # or a local path after saving
    tokenizer="TinyModel1",
    top_k=None,
)

text = "Markets rose after the central bank held rates steady."
print(clf(text))
```

Use `top_k=None` (or your Transformers version’s equivalent) to obtain scores for **all** labels. Replace `"TinyModel1"` with your Hugging Face model id (for example `HyperlinksSpace/TinyModel1`) when loading from the Hub.

---

## Training data

- **Dataset:** [fancyzhx/ag_news](https://huggingface.co/datasets/fancyzhx/ag_news) (4-class news topics).
- **Preprocessing:** tokenizer trained on the training texts; sequences truncated to 128 tokens.

---

## Intended use

- Prototyping **routing**, **tagging**, and **dashboard** features over English news-style text.
- Teaching and benchmarking small text-classification setups.
- Starting point for **domain adaptation** (finance, sports, etc.) with your own labels.

---

## Limitations

- **Accuracy** is modest by design; do not rely on it for high-stakes decisions without validating on your data.
- **English-oriented** news wording; other languages or social-media-style text may degrade performance.
- **Four fixed classes**; not suitable as a general-purpose language model.

---

## License

This model is released under the **Apache 2.0** license (see the repository `LICENSE` where applicable).
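With `top_k=None`, the pipeline in the Getting started section returns one `{"label", "score"}` dict per class. The helpers below are a minimal sketch (not part of Transformers) for flattening that output into a label-to-score map and picking the argmax; the `fake` scores are fabricated solely to show the expected shape:

```python
def scores_to_map(results):
    """Flatten a list of {'label': ..., 'score': ...} dicts into one mapping."""
    return {r["label"]: r["score"] for r in results}

def top_label(results):
    """Return the highest-scoring label from pipeline-style output."""
    return max(results, key=lambda r: r["score"])["label"]

# Example input mimicking the per-class output shape for a single text:
fake = [
    {"label": "World", "score": 0.10},
    {"label": "Sports", "score": 0.05},
    {"label": "Business", "score": 0.70},
    {"label": "Sci/Tech", "score": 0.15},
]
print(top_label(fake))  # Business
```

Exact nesting of the pipeline output can vary between Transformers versions (single input vs. batch), so check the return value before wiring these helpers in.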