# TinyModel1
TinyModel1 is a compact encoder model for news topic classification, trained from scratch on the AG News dataset. It targets fast CPU/GPU inference, simple deployment behind a router or API, and use as a baseline before larger or domain-specific models.
## Links
- Source code (train & export): https://github.com/HyperlinksSpace/TinyModel
- Live demo (Space): TinyModel1Space (use the canonical Hub URL; it avoids unreliable `*.hf.space` links)
## Model summary

| Field | Value |
|---|---|
| Task | Text classification (single-label, 4 classes) |
| Labels | World, Sports, Business, Sci/Tech |
| Architecture | Tiny BERT-style encoder (`BertForSequenceClassification`) |
| Parameters | 1,339,268 (~1.34M) |
| Max sequence length | 128 tokens (training & inference) |
| Framework | Transformers · Safetensors |
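For intuition, a model of this shape can be instantiated from scratch with `transformers`. The dimensions below are illustrative assumptions, not the released config, so the parameter count will not match the table exactly:

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Illustrative tiny dims (assumptions; the released checkpoint may differ).
config = BertConfig(
    vocab_size=8000,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
    max_position_embeddings=128,
    num_labels=4,
    id2label={0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"},
    label2id={"World": 0, "Sports": 1, "Business": 2, "Sci/Tech": 3},
)
model = BertForSequenceClassification(config)

# Randomly initialized here; training from scratch fills in the weights.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")
```

Most of the budget in a model this small sits in the embedding table, which is why the WordPiece vocabulary size matters as much as the encoder depth.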
## Model overview

This release keeps a small footprint so you can run batch or interactive classification without a heavy GPU. Training uses a WordPiece tokenizer fit on the training split and a shallow BERT stack suited to short news sentences.
## Core capabilities
- Topic routing — assign one of four coarse news categories for search, feeds, or moderation triage.
- Low latency — small parameter count keeps inference suitable for edge and serverless setups.
- Fine-tuning base — swap labels or add data for your domain while keeping the same architecture.
## Training
| Setting | Value |
|---|---|
| Train samples | 3000 |
| Eval samples | 600 |
| Epochs | 2 |
| Batch size | 16 |
| Learning rate | 0.0001 |
| Optimizer | AdamW |
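The loop below sketches how these hyperparameters fit together. A stand-in `nn.Linear` classifier and random tensors replace the real encoder and tokenized AG News batches, so only the loop shape (AdamW, lr 1e-4, batch size 16, 2 epochs) reflects the table:

```python
import torch
from torch import nn
from torch.optim import AdamW

torch.manual_seed(0)

# Stand-in for the tiny BERT classifier; 4 news classes.
model = nn.Linear(32, 4)
optimizer = AdamW(model.parameters(), lr=1e-4)  # lr = 0.0001
loss_fn = nn.CrossEntropyLoss()

# Toy features/labels standing in for 3000 tokenized train samples.
X = torch.randn(3000, 32)
y = torch.randint(0, 4, (3000,))

for epoch in range(2):                 # epochs = 2
    for i in range(0, len(X), 16):     # batch size = 16
        logits = model(X[i : i + 16])
        loss = loss_fn(logits, y[i : i + 16])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

With random features the loss hovers near chance (ln 4 ≈ 1.39), which is roughly where the reported final train loss of 1.16 sits after only two epochs.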
## Evaluation
| Metric | Value |
|---|---|
| Eval accuracy | 0.5150 |
| Final train loss | 1.1568 |
Metrics are computed on the held-out eval split configured above; treat them as a sanity-check baseline, not a production SLA.
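Eval accuracy here is the usual argmax-match rate over the eval split; a minimal sketch with toy logits (not real model output):

```python
import torch

def accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of rows where the argmax class equals the gold label."""
    preds = logits.argmax(dim=-1)
    return (preds == labels).float().mean().item()

# Four toy examples over the four news classes.
logits = torch.tensor([
    [2.0, 0.1, 0.0, 0.0],
    [0.0, 1.5, 0.2, 0.0],
    [0.3, 0.0, 0.1, 2.2],
    [1.0, 0.0, 0.0, 0.9],
])
labels = torch.tensor([0, 1, 3, 2])
print(accuracy(logits, labels))  # 0.75: three of four predictions match
```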
## Getting started

### Inference with `transformers`
```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="TinyModel1",       # or local path after save
    tokenizer="TinyModel1",
    top_k=None,
)

text = "Markets rose after the central bank held rates steady."
print(clf(text))
```
Use `top_k=None` (or your Transformers version's equivalent) to obtain scores for all labels. Replace `"TinyModel1"` with your Hugging Face model id (for example `HyperlinksSpace/TinyModel1`) when loading from the Hub.
## Training data

- Dataset: `fancyzhx/ag_news` (4-class news topics).
- Preprocessing: tokenizer trained on the training texts; sequences truncated to 128 tokens.
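A minimal sketch of this preprocessing using the `tokenizers` library: fit a WordPiece vocabulary on training texts, then cap sequences at 128 tokens. The toy sentences and vocabulary size are placeholders for the AG News train split:

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# WordPiece tokenizer fit from scratch (vocab size is an assumption).
tokenizer = Tokenizer(models.WordPiece(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.WordPieceTrainer(
    vocab_size=8000,
    special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]"],
)

# Toy texts standing in for the AG News training split.
texts = [
    "Markets rose after the central bank held rates steady.",
    "The home team clinched the series in overtime.",
]
tokenizer.train_from_iterator(texts, trainer)

# Match the model's 128-token limit at encode time.
tokenizer.enable_truncation(max_length=128)
ids = tokenizer.encode("Markets rose sharply today.").ids
```

Because the vocabulary is fit only on the training split, out-of-domain words fall back to `[UNK]` or sub-word pieces, which is one source of the modest accuracy on unfamiliar text.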
## Intended use

- Prototyping routing, tagging, and dashboard features over English news-style text.
- Teaching and benchmarking small classification setups.
- A starting point for domain adaptation (finance, sports, etc.) with your own labels.
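For domain adaptation, the checkpoint can be reloaded with a new label set while keeping the encoder. The sketch below saves and reloads a local stand-in model so it runs offline; the config dims and label names are illustrative, and with a Hub checkpoint you would pass the model id instead of the local path:

```python
from transformers import BertConfig, BertForSequenceClassification

# Local stand-in for the TinyModel1 checkpoint (illustrative dims).
base = BertForSequenceClassification(BertConfig(
    vocab_size=8000, hidden_size=64, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=128, num_labels=4,
))
base.save_pretrained("tinymodel1-local")

# Reload with a new label set; ignore_mismatched_sizes lets the
# classifier head be re-initialized while the encoder weights load.
labels = ["Finance", "Sports", "Politics"]
model = BertForSequenceClassification.from_pretrained(
    "tinymodel1-local",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={name: i for i, name in enumerate(labels)},
    ignore_mismatched_sizes=True,
)
```

Only the final classification layer is re-initialized, so a short fine-tune on in-domain data is usually enough to adapt the head.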
## Limitations
- Accuracy is modest by design; do not rely on it for high-stakes decisions without validation on your data.
- Trained on English news-style wording; performance may degrade on other languages or social-media text.
- Four fixed classes; not suitable as a general-purpose language model.
## License

This model is released under the Apache 2.0 license (see the repository `LICENSE` file where applicable).