---
pipeline_tag: text-classification
tags:
- sentiment-analysis
- transformer
- custom
- pytorch
- trained-from-scratch
datasets:
- stanfordnlp/imdb
- stanfordnlp/sentiment140
- SetFit/sst5
- financial_phrasebank
- tweet_eval
language:
- en
license: mit
---

# Sentiment Transformer — tango

A small (≈13M-parameter) transformer encoder trained **entirely from scratch** for 3-class sentiment analysis (negative / neutral / positive).

## Architecture

Pre-layer-norm transformer encoder with `[CLS]` pooling and a linear classification head. Built with pure `torch.nn`, with no pretrained weights.

| Parameter | Value |
|---|---|
| Hidden dim | 256 |
| FFN dim | 1024 |
| Layers | 6 |
| Heads | 8 |
| Max seq len | 256 |
| Vocab size | 16000 |
| Labels | NEGATIVE, NEUTRAL, POSITIVE |
| Precision | bf16 mixed-precision |

## Training Data

Trained on a combined corpus of:

- **IMDB** (50k movie reviews)
- **Sentiment140** (1M tweets)
- **Yelp** (1M reviews)
- **SST-5** (fine-grained labels collapsed to 3 classes)
- **Financial PhraseBank** (finance headlines)
- **TweetEval** (SemEval-2017 tweets)

## Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# trust_remote_code is required because the model uses a custom architecture
model = AutoModelForSequenceClassification.from_pretrained(
    "Impulse2000/sentiment-transformer",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Impulse2000/sentiment-transformer")

pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(pipe("This movie was absolutely fantastic!"))
```
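## Architecture sketch

The table above can be realized in a few lines of pure `torch.nn`. This is a minimal sketch under the stated hyperparameters, not the repo's actual module layout — the class name, argument names, and learned positional embeddings are assumptions.

```python
import torch
import torch.nn as nn

class SentimentTransformer(nn.Module):
    """Hypothetical reconstruction of the described encoder (names assumed)."""

    def __init__(self, vocab_size=16000, d_model=256, n_heads=8,
                 d_ffn=1024, n_layers=6, max_len=256, n_labels=3):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)  # assumed: learned positions
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=d_ffn,
            batch_first=True, norm_first=True)  # norm_first=True -> pre-layer-norm
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.norm = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, n_labels)  # linear classification head

    def forward(self, input_ids):
        pos = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.tok_emb(input_ids) + self.pos_emb(pos)
        x = self.encoder(x)
        # [CLS] pooling: classify from the first token's final hidden state
        return self.head(self.norm(x[:, 0]))
```

A forward pass on a `(batch, seq_len)` tensor of token IDs yields `(batch, 3)` logits over NEGATIVE / NEUTRAL / POSITIVE.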
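## Label collapse (SST-5)

The training data includes SST-5, whose five fine-grained labels were collapsed to this model's three classes. The card does not specify the mapping; a common choice, shown here as an assumption, merges the two negative and two positive grades:

```python
# Assumed SST-5 (0..4) -> 3-class mapping; the actual collapse used in
# training is not documented in this card.
SST5_TO_3CLASS = {
    0: "NEGATIVE",  # very negative
    1: "NEGATIVE",  # negative
    2: "NEUTRAL",   # neutral
    3: "POSITIVE",  # positive
    4: "POSITIVE",  # very positive
}

def collapse_sst5(label: int) -> str:
    """Map a fine-grained SST-5 label to one of the model's three classes."""
    return SST5_TO_3CLASS[label]
```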