# FIN-RoBERTa-Custom: Financial Sentiment Analysis
A fine-tuned RoBERTa model for financial sentiment analysis that classifies text into negative, neutral, or positive sentiment.
## Model Description

This model is a fine-tuned version of `roberta-base` trained specifically for financial text sentiment classification. It achieves 99.6% accuracy and 99.5% macro F1 on the Financial PhraseBank (100% annotator agreement) benchmark.
## Labels
| Label | ID | Description |
|---|---|---|
| negative | 0 | Negative financial sentiment |
| neutral | 1 | Neutral financial sentiment |
| positive | 2 | Positive financial sentiment |
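The table above fixes how the model's three output logits are read. A minimal, dependency-free sketch of the post-processing step (softmax, then argmax, then label lookup); the `ID2LABEL` dict simply restates the table and should match the checkpoint's `config.id2label`:

```python
import math

# Restates the label table above; verify against the checkpoint's config.id2label.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

def logits_to_label(logits):
    """Softmax over raw logits, then map the argmax index to a label.
    Pure-Python sketch of the post-processing; in practice the logits
    come from the model's classification head."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    pred = probs.index(max(probs))
    return ID2LABEL[pred], probs

label, probs = logits_to_label([-1.2, 0.3, 2.1])
print(label)  # positive
```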
## Usage

```python
from transformers import RobertaForSequenceClassification, RobertaTokenizer
import torch

# Load model and tokenizer
model_name = "alasteirho/FIN-RoBERTa-Custom"
tokenizer = RobertaTokenizer.from_pretrained(model_name)
model = RobertaForSequenceClassification.from_pretrained(model_name)
model.eval()

LABELS = ["negative", "neutral", "positive"]

# Predict sentiment
def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    pred = torch.argmax(probs, dim=-1).item()
    return LABELS[pred], probs[0].tolist()

# Example
text = "The company reported strong quarterly earnings, beating analyst expectations."
sentiment, probs = predict_sentiment(text)
print(f"Sentiment: {sentiment}")
print(f"Class probabilities: {probs}")
```
### Using Pipeline (Simple)

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="alasteirho/FIN-RoBERTa-Custom")
result = classifier("Revenue declined 15% year-over-year due to weak demand.")
print(result)
```
## Training Data
The model was trained on a combined dataset from five financial sentiment sources:
| Dataset | Samples | Description |
|---|---|---|
| Financial PhraseBank (50% agreement) | 4,846 | Financial news sentences annotated by 16 domain experts |
| Twitter Financial News Sentiment | 9,543 | Financial tweets with sentiment labels |
| FiQA 2018 | 961 | Financial question-answer pairs with continuous sentiment scores |
| SemEval 2017 Task 5 Subtask 2 | 1,142 | Financial news headlines with continuous sentiment scores |
| SemEval 2017 Task 5 Subtask 1 | 1,700 | Financial microblogs (tweets and StockTwits) with continuous sentiment scores |
Datasets with continuous sentiment scores (FiQA, SemEval Subtask 1, and SemEval Subtask 2) were discretised into three classes using fixed thresholds (below -0.2 = negative, above +0.2 = positive, otherwise neutral).
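The thresholding described above can be sketched as follows (the ±0.2 cutoffs are from this card; the function name is illustrative):

```python
def discretise(score: float) -> str:
    """Map a continuous sentiment score to one of the three classes
    using the fixed thresholds stated above: below -0.2 is negative,
    above +0.2 is positive, and everything in between is neutral."""
    if score < -0.2:
        return "negative"
    if score > 0.2:
        return "positive"
    return "neutral"

print([discretise(s) for s in (-0.5, 0.0, 0.75)])
# ['negative', 'neutral', 'positive']
```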
- Total training samples: 16,127 (after deduplication and cleaning)
- Validation samples: 3,774
## Evaluation Results

### Financial PhraseBank (100% Agreement) - Held-out Test Set (2,264 samples)
| Metric | Score |
|---|---|
| Accuracy | 0.9956 |
| F1 (Macro) | 0.9951 |
### Per-Class Performance (FPB All Agree)
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Negative | 1.000 | 0.990 | 0.995 | 303 |
| Neutral | 0.995 | 0.998 | 0.996 | 1,391 |
| Positive | 0.995 | 0.993 | 0.994 | 570 |
### Combined Validation Set (3,774 samples)
| Metric | Score |
|---|---|
| Accuracy | 0.8755 |
| F1 (Macro) | 0.8629 |
| F1 (Weighted) | 0.8746 |
### Per-Class Performance (Combined Validation)
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Negative | 0.86 | 0.83 | 0.84 | 588 |
| Neutral | 0.88 | 0.81 | 0.84 | 1,171 |
| Positive | 0.88 | 0.93 | 0.90 | 2,015 |
### Comparison with FinBERT (Combined Validation Set)
| Metric | FIN-RoBERTa-Custom | FinBERT | Delta |
|---|---|---|---|
| Accuracy | 0.8755 | 0.4290 | +0.4465 |
| F1 (Macro) | 0.8631 | 0.4672 | +0.3959 |
| F1 (Weighted) | 0.8746 | 0.4097 | +0.4648 |
## Training Procedure

### Hyperparameters
| Parameter | Value |
|---|---|
| Base Model | roberta-base |
| Max Epochs | 30 |
| Early Stopping Patience | 8 (on F1 macro) |
| Best Checkpoint | Epoch 12 |
| Training stopped at | Epoch 20 |
| Batch Size (per device) | 16 |
| Gradient Accumulation Steps | 2 |
| Effective Batch Size | 32 |
| Learning Rate | 2e-5 |
| LR Scheduler | Cosine with warmup |
| Warmup Ratio | 6% |
| Weight Decay | 0.01 |
| Max Sequence Length | 128 |
| Precision | FP16 (mixed precision) |
| Seed | 42 |
Training was performed on an NVIDIA GeForce RTX 5080 GPU (16 GB) using the Hugging Face Transformers library with PyTorch. The training run completed in approximately 14 minutes, consuming 3.35 GB of peak GPU memory.
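The hyperparameter table can be mapped onto Hugging Face `TrainingArguments` roughly as follows. This is a hedged configuration sketch, not the author's actual training script; the output directory and the `f1_macro` metric name are assumptions.

```python
from transformers import TrainingArguments

# Sketch only: values copied from the hyperparameter table above.
# output_dir and metric_for_best_model are assumed, not stated in the card.
training_args = TrainingArguments(
    output_dir="fin-roberta-custom",
    num_train_epochs=30,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,   # effective batch size 32
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,
    weight_decay=0.01,
    fp16=True,
    seed=42,
    load_best_model_at_end=True,
    metric_for_best_model="f1_macro",
    greater_is_better=True,
)
```

The early-stopping patience of 8 would be supplied separately, e.g. by passing `EarlyStoppingCallback(early_stopping_patience=8)` to the `Trainer`.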
## Intended Use
This model is designed for:
- Financial news sentiment analysis
- Stock/market sentiment classification
- Financial social media sentiment analysis
- Earnings call/report sentiment extraction
## Limitations
- Trained primarily on English financial text
- May not generalise well to non-financial domains
- Inputs are truncated to 128 tokens, so performance may degrade on long documents
- Cryptocurrency and emerging market terminology may be underrepresented
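For documents longer than the 128-token limit, one common workaround is to score overlapping windows and average the per-window probabilities. A minimal, model-free sketch of the windowing step (the function name, window size, and stride are illustrative, not part of this model):

```python
def sliding_windows(token_ids, window=128, stride=64):
    """Split a token-id sequence into overlapping chunks so each one
    fits the model's 128-token limit; per-chunk class probabilities
    can then be averaged into a document-level prediction."""
    if len(token_ids) <= window:
        return [token_ids]
    chunks = []
    for start in range(0, len(token_ids) - stride, stride):
        chunks.append(token_ids[start:start + window])
    return chunks

chunks = sliding_windows(list(range(300)))
print([len(c) for c in chunks])  # [128, 128, 128, 108]
```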
## Citation

If you use this model, please cite:

```bibtex
@misc{alasteir_ho_2026,
  author    = {Alasteir Ho},
  title     = {FIN-RoBERTa-Custom (Revision 1fb31e2)},
  year      = 2026,
  url       = {https://huggingface.co/alasteirho/FIN-RoBERTa-Custom},
  doi       = {10.57967/hf/8367},
  publisher = {Hugging Face}
}
```
## License
MIT License