TinyModel1

TinyModel1 is a compact encoder model for news topic classification, trained from scratch on the AG News dataset. It targets fast CPU/GPU inference, simple deployment behind a router or API, and use as a baseline before larger or domain-specific models.

Model summary

Task: Text classification (single-label, 4 classes)
Labels: World, Sports, Business, Sci/Tech
Architecture: Tiny BERT-style encoder (BertForSequenceClassification)
Parameters: 1,339,268 (~1.34M)
Max sequence length: 128 tokens (training & inference)
Framework: Transformers · Safetensors

Model overview

This release fits a small footprint so you can run batch or interactive classification without heavy GPUs. Training uses a WordPiece tokenizer fit on the training split and a shallow BERT stack suited to short news sentences.
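As a rough illustration of what "a shallow BERT stack" means here, the sketch below builds a comparably tiny encoder from scratch. The specific hidden size, layer count, and vocabulary size are illustrative assumptions, not the released configuration.

```python
from transformers import BertConfig, BertForSequenceClassification

# Illustrative tiny configuration -- the released checkpoint's exact
# hyperparameters may differ (these values are assumptions).
config = BertConfig(
    vocab_size=8000,               # small WordPiece vocab fit on the training split
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
    max_position_embeddings=128,   # matches the 128-token limit
    num_labels=4,                  # World, Sports, Business, Sci/Tech
)
model = BertForSequenceClassification(config)
print(sum(p.numel() for p in model.parameters()))  # rough parameter count
```

Instantiating from a config (rather than `from_pretrained`) yields randomly initialized weights, which is what "trained from scratch" implies.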

Core capabilities

  • Topic routing — assign one of four coarse news categories for search, feeds, or moderation triage.
  • Low latency — small parameter count keeps inference suitable for edge and serverless setups.
  • Fine-tuning base — swap labels or add data for your domain while keeping the same architecture.

Training

Train samples: 3000
Eval samples: 600
Epochs: 2
Batch size: 16
Learning rate: 0.0001
Optimizer: AdamW
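A torch-level sketch of the training setup in the table above; the model here is a stand-in layer, not the released encoder, and the loop body is elided.

```python
import torch

# Stand-in for the tiny BERT classifier (an assumption for illustration).
model = torch.nn.Linear(64, 4)

# Optimizer and hyperparameters as listed in the table.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # AdamW, lr 0.0001
EPOCHS = 2
BATCH_SIZE = 16

for epoch in range(EPOCHS):
    # A real run would iterate DataLoader batches of size BATCH_SIZE here,
    # computing cross-entropy loss and stepping the optimizer.
    pass
```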

Evaluation

Eval accuracy: 0.5150
Final train loss: 1.1568

Metrics are computed on the held-out eval split configured above; treat them as a sanity-check baseline, not a production SLA.
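For reference, eval accuracy is simply the fraction of eval examples whose predicted label id matches the gold label id; the ids below are made-up examples.

```python
# Made-up predicted and gold label ids for five eval examples.
preds = [2, 0, 1, 3, 2]
golds = [2, 0, 3, 3, 1]

accuracy = sum(p == g for p, g in zip(preds, golds)) / len(golds)
print(accuracy)  # 0.6
```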


Getting started

Inference with transformers

from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="TinyModel1",  # or local path after save
    tokenizer="TinyModel1",
    top_k=None,
)
text = "Markets rose after the central bank held rates steady."
print(clf(text))

Use top_k=None (or your Transformers version’s equivalent) to obtain scores for all labels. Replace "TinyModel1" with your Hugging Face model id (for example HyperlinksSpace/TinyModel1) when loading from the Hub.
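The scores the pipeline returns are a softmax over the model's raw logits. The snippet below shows that step in isolation; the logit values and the label-id order are assumptions for illustration.

```python
import torch

# Assumed label order; check the model's config.id2label for the real mapping.
id2label = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

logits = torch.tensor([[0.2, -1.1, 2.3, 0.5]])  # example logits, 4 classes
probs = torch.softmax(logits, dim=-1)           # scores sum to 1 across labels
pred = int(probs.argmax(dim=-1))
print(id2label[pred], float(probs[0, pred]))
```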


Training data

  • Dataset: fancyzhx/ag_news (4-class news topics).
  • Preprocessing: tokenizer trained on training texts; sequences truncated to 128 tokens.
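The preprocessing above can be sketched with the `tokenizers` library: fit a WordPiece vocabulary on training texts, then truncate encodings to 128 tokens. The two-sentence corpus and the vocabulary size here are tiny stand-ins, not the actual training setup.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Stand-in corpus (assumption); the real tokenizer was fit on the
# AG News training split.
texts = [
    "Markets rose after the central bank held rates steady.",
    "The home team clinched the series in overtime.",
]

tokenizer = Tokenizer(models.WordPiece(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.WordPieceTrainer(vocab_size=200, special_tokens=["[UNK]", "[PAD]"])
tokenizer.train_from_iterator(texts, trainer)

tokenizer.enable_truncation(max_length=128)  # matches the 128-token limit
print(tokenizer.encode(texts[0]).tokens)
```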

Intended use

  • Prototyping routing, tagging, and dashboard features over English news-style text.
  • Teaching and benchmarking small-model classification setups.
  • Starting point for domain adaptation (finance, sports, etc.) with your own labels.
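Swapping in your own label set amounts to building the same architecture with a different classification head. The sketch below uses hypothetical finance labels and the same illustrative config values as earlier; none of these are the released settings.

```python
from transformers import BertConfig, BertForSequenceClassification

# Hypothetical domain labels (an assumption for illustration).
labels = ["earnings", "markets", "regulation"]

config = BertConfig(
    vocab_size=8000, hidden_size=64, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=128,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={l: i for i, l in enumerate(labels)},
)
model = BertForSequenceClassification(config)  # fresh head sized to 3 labels
```

From here, fine-tune on your labeled data as usual; only the label mapping and output dimension change.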

Limitations

  • Accuracy is modest by design; do not rely on it for high-stakes decisions without validation on your data.
  • English-oriented news wording; other languages or social-style text may degrade.
  • Four fixed classes; not suitable as a general-purpose language model.

License

This model is released under the Apache 2.0 license (see repository LICENSE where applicable).

Safetensors: 1.34M params · F32 tensors