An NLI-Based Approach to Asset-Specific Stance Detection in Cryptocurrency Tweets

This model classifies the stance of tweets toward Bitcoin (BTC) and Ethereum (ETH) as Bullish, Bearish, or Neutral using a Natural Language Inference (NLI) approach.

It was fine-tuned from FacebookAI/roberta-large-mnli as part of a master's thesis on NLI-based cryptocurrency stance detection.

How it works

Instead of standard 3-class classification, this model frames stance detection as an entailment task. For each tweet, three hypotheses are constructed (one per stance). The tweet serves as the premise, and the model scores how strongly it entails each hypothesis:

Stance	Hypothesis
Bullish	"This tweet expresses a bullish perspective on the potential of {target}."
Bearish	"This tweet expresses a bearish perspective on the potential of {target}."
Neutral	"This tweet expresses a neutral perspective on the potential of {target}."

The predicted stance is the one with the highest entailment score.
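
The decision rule can be written compactly. The notation below is introduced here for illustration and is not taken from the thesis: $x$ is the tweet, $t$ the target asset, and $h_s(t)$ the hypothesis for stance $s$ with the target filled in.

```latex
\hat{s}(x, t) = \arg\max_{s \in \{\text{bullish},\, \text{bearish},\, \text{neutral}\}}
  P\bigl(\text{entailment} \mid \text{premise} = x,\ \text{hypothesis} = h_s(t)\bigr)
```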

Usage

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="syahrezapratama/roberta-crypto-stance")

tweet = "Bitcoin is going to the moon! $100k is just the beginning 🚀"

result = classifier(
    tweet,
    candidate_labels=["bullish", "bearish", "neutral"],
    hypothesis_template="This tweet expresses a {} perspective on the potential of BTC.",
)

print(result["labels"][0])  # top-ranked label; expected "bullish" for this tweet
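
To score tweets about ETH instead of BTC, only the target inside the hypothesis template needs to change. The following is a minimal sketch, assuming a hypothetical helper named `template_for` (it is not part of this model or of the transformers library) that fills in the target and leaves the `{}` slot for the pipeline to fill with each candidate label:

```python
# Hypothetical helper (not part of the model repo): build an asset-specific template.
def template_for(target: str) -> str:
    # Double braces keep a literal "{}" for the pipeline to fill with each label.
    return f"This tweet expresses a {{}} perspective on the potential of {target}."


btc_template = template_for("BTC")
eth_template = template_for("ETH")
print(eth_template)
```

The returned string can be passed as `hypothesis_template=template_for("ETH")` in the classifier call above.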

Manual inference (more control)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "syahrezapratama/roberta-crypto-stance"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

tweet = "I'm not sure where ETH is headed, could go either way"
stances = ["bullish", "bearish", "neutral"]
template = "This tweet expresses a {} perspective on the potential of ETH."

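# Look up the entailment index from the model config instead of hardcoding it;
# fall back to 2, the entailment index for roberta-large-mnli.
label2id = {name.lower(): idx for name, idx in model.config.label2id.items()}
entailment_idx = label2id.get("entailment", 2)

scores = []
for stance in stances:
    hypothesis = template.format(stance)
    inputs = tokenizer(tweet, hypothesis, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Probability of entailment among the three NLI classes
    entailment_score = torch.softmax(logits, dim=-1)[0, entailment_idx].item()
    scores.append(entailment_score)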

predicted = stances[scores.index(max(scores))]
print(f"Predicted stance: {predicted}")

Performance

Evaluated on a held-out test set of 450 tweets (70/15/15 train/val/test split, seed=42).

Overall metrics

Metric	Value
Accuracy	78.22%
Macro F1	0.7376
Weighted F1	0.7850

Per-class metrics

Class	Precision	Recall	F1	Support
Bearish	0.6522	0.6383	0.6452	47
Neutral	0.8780	0.7941	0.8340	272
Bullish	0.6709	0.8092	0.7336	131
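
As a consistency check, the overall metrics can be recomputed from the per-class rows. The sketch below uses only the numbers in the table above. Because the per-class values are rounded to four decimals, the recomputed figures should agree with the reported ones only up to rounding.

```python
# Per-class (precision, recall, F1, support) copied from the per-class table.
per_class = {
    "Bearish": (0.6522, 0.6383, 0.6452, 47),
    "Neutral": (0.8780, 0.7941, 0.8340, 272),
    "Bullish": (0.6709, 0.8092, 0.7336, 131),
}

total = sum(n for _p, _r, _f1, n in per_class.values())

# Correct predictions per class = recall * support, rounded to whole tweets.
correct = sum(round(r * n) for _p, r, _f1, n in per_class.values())
accuracy = correct / total

# Macro F1 is the unweighted mean; weighted F1 weights each class by its support.
macro_f1 = sum(f1 for _p, _r, f1, _n in per_class.values()) / len(per_class)
weighted_f1 = sum(f1 * n for _p, _r, f1, n in per_class.values()) / total

print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
```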

Comparison with baselines

Model	Paradigm	Accuracy	Macro F1
RoBERTa-MNLI	Zero-Shot (Baseline)	46.67%	0.4467
RoBERTa-MNLI	Zero-Shot (OPRO)	55.11%	0.5009
RoBERTa-NLI (this model)	Fine-Tuned	78.22%	0.7376
GPT-4o	Zero-Shot	76.67%	0.7275

Training details

Dataset

Source: 3,000 cryptocurrency tweets about BTC and ETH
Labels: Bullish (873), Neutral (1,814), Bearish (313)
Split: 2,099 train / 451 val / 450 test (seed=42)

NLI training approach

Each tweet is expanded into 3 NLI premise-hypothesis pairs:

Correct stance → entailment
Incorrect stances → contradiction

This results in 6,297 training pairs from 2,099 tweets.
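
The expansion described above can be sketched as follows. This is an illustrative reconstruction, not the thesis code: the function name `expand_to_nli_pairs`, the dictionary keys, and the example tweet are assumptions, and labels are kept as strings to avoid assuming a particular label-to-index mapping.

```python
# Sketch of the tweet -> NLI pair expansion; names and fields are illustrative assumptions.
STANCES = ["bullish", "bearish", "neutral"]
TEMPLATE = "This tweet expresses a {stance} perspective on the potential of {target}."


def expand_to_nli_pairs(tweet: str, target: str, gold_stance: str) -> list:
    """Return one premise-hypothesis pair per stance for a single labeled tweet."""
    pairs = []
    for stance in STANCES:
        pairs.append({
            "premise": tweet,
            "hypothesis": TEMPLATE.format(stance=stance, target=target),
            # The gold stance is entailed; the other two are contradicted.
            "label": "entailment" if stance == gold_stance else "contradiction",
        })
    return pairs


pairs = expand_to_nli_pairs("ETH looks weak, expecting more downside", "ETH", "bearish")
for pair in pairs:
    print(pair["label"], "|", pair["hypothesis"])

# 2,099 training tweets x 3 hypotheses per tweet
n_training_pairs = 2099 * len(STANCES)
print(n_training_pairs)
```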

Hyperparameters

Parameter	Value
Base model	FacebookAI/roberta-large-mnli
Learning rate	2e-5
Batch size	4 (physical) × 4 (accumulation) = 16 (effective)
Max epochs	5 (early stopping patience = 3)
Max sequence length	128
LR schedule	Linear warmup over the first 10% of steps, then linear decay
Weight decay	0.01
Class weights	Bearish=3.19, Neutral=0.55, Bullish=1.15
Gradient checkpointing	Enabled
Optimizer	AdamW
Checkpoint selection	Best validation macro F1 (with early stopping)
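
The class weights in the table are consistent with the common "balanced" heuristic, weight = n_samples / (n_classes × class count), applied to the label counts in the Dataset section. The model card does not state how the weights were derived, so the sketch below is an assumption rather than the documented procedure.

```python
# Label counts from the Dataset section.
counts = {"Bearish": 313, "Neutral": 1814, "Bullish": 873}

n_samples = sum(counts.values())
n_classes = len(counts)

# "Balanced" heuristic: weight_c = n_samples / (n_classes * count_c).
# Assumption: the model card does not say how its weights were computed.
weights = {label: n_samples / (n_classes * c) for label, c in counts.items()}

for label, w in weights.items():
    print(f"{label}: {w:.2f}")
```

Rarer classes receive larger weights, so errors on Bearish tweets contribute more to the loss than errors on Neutral tweets.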

Hypothesis template (OPRO-optimized)

The hypothesis template was optimized using OPRO (Optimization by PROmpting) with GPT-4o-mini:

"This tweet expresses a {stance} perspective on the potential of {target}."

Limitations

Domain-specific: Trained only on cryptocurrency tweets (BTC and ETH). May not generalize to other financial assets or domains.
Class imbalance: Bearish tweets are underrepresented (10.4% of the data), and the Bearish class has the lowest recall (0.64) and F1 (0.65) of the three classes despite class weighting.
Language: English only.
Temporal: Trained on tweets from a specific time period. Cryptocurrency language and sentiment patterns evolve rapidly.

Citation

If you use this model, please cite:

@mastersthesis{pratama2026cryptostancenli,
  title={An NLI-Based Approach to Asset-Specific Stance Detection in Cryptocurrency Tweets},
  author={Pratama, Syahreza},
  year={2026},
}

License

MIT
