ModernBERT-large: Consumer Finance Complaint Classification (11 labels)

Fine-tuned ModernBERT-large (~395M params) for multi-class classification of US consumer finance complaints into 11 consolidated categories.

Context

This model was developed by William Derue as part of Project 12 ("Compare AI Algorithms: Machine Learning vs. LLM") of the OpenClassrooms AI Developer certification.

The project scenario involves ZenAssist, a customer support platform serving 200+ companies. The goal is to label incoming consumer complaints automatically so they can be routed to the correct support department, and to compare traditional ML approaches with LLM-based inference on that task.

This fine-tuned encoder model represents the supervised ML approach: a single forward pass through ModernBERT produces classification logits, which makes it fast, cheap to run, and suitable for high-throughput production deployment.

Labels

The original dataset contains 18 raw product tags, consolidated into 11 categories to reduce semantic overlap and improve model performance:

| # | Label | Description |
|---|---|---|
| 0 | Bank account | Bank account or service, Checking or savings account |
| 1 | Consumer Loan | Consumer Loan |
| 2 | Credit card | Credit card, Credit card or prepaid card, Prepaid card |
| 3 | Credit reporting | Credit reporting, credit repair services, or other personal consumer reports |
| 4 | Debt collection | Debt collection |
| 5 | Money transfer | Money transfer, virtual currency, or money service |
| 6 | Mortgage | Mortgage |
| 7 | Other financial service | Other financial service, Virtual currency |
| 8 | Payday loan | Payday loan, title loan, or personal loan |
| 9 | Student loan | Student loan |
| 10 | Vehicle loan or lease | Vehicle loan or lease |
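
The numeric ids in the first column are the positions of the classification logits. As a minimal sketch of how an id-to-label mapping built from this table turns logits into a prediction, the snippet below applies a softmax and picks the largest entry. The `example_logits` values are hypothetical and are not output from this model.

```python
import math

# Label ids and names copied from the label table.
ID2LABEL = {
    0: "Bank account",
    1: "Consumer Loan",
    2: "Credit card",
    3: "Credit reporting",
    4: "Debt collection",
    5: "Money transfer",
    6: "Mortgage",
    7: "Other financial service",
    8: "Payday loan",
    9: "Student loan",
    10: "Vehicle loan or lease",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for one complaint; index 6 (Mortgage) is the largest.
example_logits = [0.1, -1.2, 0.3, 0.0, -0.5, -2.0, 4.2, -3.0, -1.0, 0.2, -0.8]
label, prob = decode(example_logits)
print(label, round(prob, 3))
```

The pipeline call in the Usage section performs the same decoding internally and returns the label name rather than the id.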

Usage

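from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
# truncation/max_length are forwarded to the tokenizer so that inputs are
# cut to the 1024-token length used during training.
classifier = pipeline(
    "text-classification",
    model="WillisBack/modernbert-large-consumer-finance-11cls",
    truncation=True,
    max_length=1024,
)

result = classifier("I was charged an overdraft fee on my checking account without prior notice.")
print(result)
# Example output (the exact score will vary):
# [{'label': 'Bank account', 'score': 0.95}]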

Training

| Parameter | Value |
|---|---|
| Base model | unsloth/ModernBERT-large |
| Architecture | ModernBertForSequenceClassification |
| Parameters | ~395M (full fine-tuning, no LoRA) |
| Train samples | 293,698 |
| Eval samples | 5,000 |
| Labels | 11 |
| Max epochs | 2 |
| Effective epochs | 1.6 (early stopping, patience=3 on F1 macro) |
| Batch size (effective) | 64 (16 × 4 gradient accumulation steps) |
| Learning rate | 2e-5 |
| Warmup steps | 200 |
| Scheduler | Linear decay |
| Optimizer | AdamW |
| Weight decay | 0.01 |
| Max sequence length | 1024 |
| Precision | bf16 |
| Loss | CrossEntropyLoss (class-weighted) |
| torch.compile | Enabled |
| GPU | NVIDIA GeForce RTX 5080 (16 GB GDDR7) |
| Training time | 380 min (6.3h) |
| Peak VRAM | 4.58 GB (29.6%) |
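
To make the batch and schedule settings concrete, here is a rough sketch of the step arithmetic and of a linear warmup plus linear decay schedule. It assumes the schedule was planned over the full 2 epochs, that it decays to zero, and that partial batches are dropped from the step count. None of these details are confirmed by the card, and the function name is illustrative.

```python
TRAIN_SAMPLES = 293_698
PER_DEVICE_BATCH = 16
GRAD_ACCUM = 4
MAX_EPOCHS = 2
PEAK_LR = 2e-5
WARMUP_STEPS = 200

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM        # 16 * 4 = 64
steps_per_epoch = TRAIN_SAMPLES // effective_batch     # assumption: floor division
total_steps = steps_per_epoch * MAX_EPOCHS             # planned optimizer steps
stop_step = round(1.6 * steps_per_epoch)               # approximate early-stop step


def linear_warmup_decay(step):
    """Learning rate at an optimizer step (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - WARMUP_STEPS)


print(effective_batch, steps_per_epoch, total_steps, stop_step)
print(linear_warmup_decay(stop_step))
```

Under these assumptions the warmup covers only a small fraction of training, and the run stops while the learning rate is still above zero.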

Class weights

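Computed with sklearn.utils.class_weight.compute_class_weight(class_weight="balanced", ...) to compensate for class imbalance. Each weight is inversely proportional to the class frequency in the training set: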

| Label | Weight |
|---|---|
| Credit reporting | 0.303 |
| Debt collection | 0.396 |
| Mortgage | 0.630 |
| Credit card | 0.804 |
| Bank account | 1.204 |
| Student loan | 1.532 |
| Consumer Loan | 3.535 |
| Money transfer | 4.800 |
| Payday loan | 5.422 |
| Vehicle loan or lease | 5.835 |
| Other financial service | 109.425 |
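
The "balanced" heuristic in scikit-learn is `n_samples / (n_classes * class_count)`. As a quick check, the sketch below reproduces the weight for Other financial service from its 244 training samples and inverts the formula to estimate the remaining per-class counts. Those estimates are approximate because the published weights are rounded to three decimals, and the helper names are illustrative.

```python
TRAIN_SAMPLES = 293_698
NUM_CLASSES = 11

# Weights copied from the class-weight table.
WEIGHTS = {
    "Credit reporting": 0.303,
    "Debt collection": 0.396,
    "Mortgage": 0.630,
    "Credit card": 0.804,
    "Bank account": 1.204,
    "Student loan": 1.532,
    "Consumer Loan": 3.535,
    "Money transfer": 4.800,
    "Payday loan": 5.422,
    "Vehicle loan or lease": 5.835,
    "Other financial service": 109.425,
}


def balanced_weight(class_count):
    """Weight assigned to a class with `class_count` training samples."""
    return TRAIN_SAMPLES / (NUM_CLASSES * class_count)


def implied_count(weight):
    """Invert the formula: approximate sample count behind a weight."""
    return TRAIN_SAMPLES / (NUM_CLASSES * weight)


other_weight = balanced_weight(244)
print(round(other_weight, 3))  # 109.425

implied = {label: round(implied_count(w)) for label, w in WEIGHTS.items()}
for label, count in implied.items():
    print(f"{label}: ~{count} train samples")
```

The implied counts sum to within a fraction of a percent of the 293,698 training samples, which suggests the table and the formula agree.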

Results

| Metric | Value |
|---|---|
| F1 macro | 0.6126 |
| Accuracy | 78.2% |
| Weighted F1 | 0.79 |
| Train loss | 0.9178 |

Per-class performance

| Label | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| Mortgage | 0.90 | 0.91 | 0.90 | 726 |
| Student loan | 0.74 | 0.93 | 0.83 | 277 |
| Credit reporting | 0.89 | 0.76 | 0.82 | 1,509 |
| Debt collection | 0.82 | 0.77 | 0.79 | 1,119 |
| Bank account | 0.73 | 0.79 | 0.76 | 392 |
| Credit card | 0.75 | 0.78 | 0.76 | 602 |
| Money transfer | 0.65 | 0.78 | 0.71 | 95 |
| Payday loan | 0.36 | 0.62 | 0.46 | 80 |
| Vehicle loan or lease | 0.29 | 0.50 | 0.37 | 74 |
| Consumer Loan | 0.29 | 0.39 | 0.33 | 122 |
| Other financial service | 0.00 | 0.00 | 0.00 | 4 |
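
The aggregate metrics can be approximately recovered from the per-class rows. The sketch below recomputes macro F1 (unweighted mean), weighted F1 (support-weighted mean), and accuracy (support-weighted recall) from the rounded table values. Small differences from the reported figures are expected because the table is rounded to two decimals.

```python
# (precision, recall, f1, support) copied from the per-class table.
PER_CLASS = {
    "Mortgage": (0.90, 0.91, 0.90, 726),
    "Student loan": (0.74, 0.93, 0.83, 277),
    "Credit reporting": (0.89, 0.76, 0.82, 1509),
    "Debt collection": (0.82, 0.77, 0.79, 1119),
    "Bank account": (0.73, 0.79, 0.76, 392),
    "Credit card": (0.75, 0.78, 0.76, 602),
    "Money transfer": (0.65, 0.78, 0.71, 95),
    "Payday loan": (0.36, 0.62, 0.46, 80),
    "Vehicle loan or lease": (0.29, 0.50, 0.37, 74),
    "Consumer Loan": (0.29, 0.39, 0.33, 122),
    "Other financial service": (0.00, 0.00, 0.00, 4),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


rows = list(PER_CLASS.values())
total_support = sum(s for _, _, _, s in rows)

# Macro F1: every class counts equally, so tail classes pull it down.
macro_f1 = sum(f1 for _, _, f1, _ in rows) / len(rows)
# Weighted F1: classes count in proportion to their support.
weighted_f1 = sum(f1 * s for _, _, f1, s in rows) / total_support
# Accuracy equals the support-weighted mean of per-class recall.
accuracy = sum(r * s for _, r, _, s in rows) / total_support
# Share of evaluation samples in classes with F1 >= 0.70.
strong_share = sum(s for _, _, f1, s in rows if f1 >= 0.70) / total_support

print(round(macro_f1, 3), round(weighted_f1, 3), round(accuracy, 3), strong_share)
```

The gap between macro F1 and weighted F1 comes almost entirely from the four low-support classes at the bottom of the table.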

Observations

  • Strong classes (F1 ≥ 0.70): Mortgage, Student loan, Credit reporting, Debt collection, Bank account, Credit card, Money transfer. Together these cover ~94% of the evaluation set (4,720 of 5,000 samples).
  • Weak classes: Consumer Loan, Payday loan, Vehicle loan or lease suffer from semantic overlap (all are loan products) and low sample counts.
  • Other financial service (4 eval samples, 244 train samples) remains unlearnable at this scale. Consider merging it with the nearest class or removing it for production.
  • Early stopping triggered at epoch 1.6: the model converged before completing the 2 scheduled epochs.
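
For readers unfamiliar with patience-based early stopping, the following is a minimal sketch of the rule, assuming it counts consecutive evaluations without improvement in macro F1. The evaluation interval is not stated in the card, and the `history` values below are hypothetical, not the real training log.

```python
class EarlyStopper:
    """Stop when the monitored score has not improved for `patience` evaluations."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_evals = 0

    def update(self, score):
        """Record one evaluation; return True when training should stop."""
        if score > self.best:
            self.best = score
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


# Hypothetical macro-F1 values at successive evaluations (illustration only).
history = [0.52, 0.58, 0.61, 0.60, 0.61, 0.60, 0.59]

stopper = EarlyStopper(patience=3)
stopped_at = None
for index, score in enumerate(history):
    if stopper.update(score):
        stopped_at = index
        break

print(stopped_at, stopper.best)
```

With patience=3 and a stop at epoch 1.6, evaluation presumably ran several times per epoch rather than once at each epoch boundary.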

Dataset

Trained on WillisBack/dataset-financial-user-claim, a cleaned, deduplicated, and label-consolidated version of the US CFPB Consumer Complaints dataset.

Files

| File | Description |
|---|---|
| model.safetensors | Model weights (~792 MB) |
| config.json | Model architecture config with id2label/label2id |
| tokenizer.json | Tokenizer vocabulary |
| tokenizer_config.json | Tokenizer settings |
| label_config.json | Label list, id2label, label2id, model name, max_seq_length |

Limitations

  • Trained on English-language US CFPB complaints only. Performance on other languages or domains is unknown.
  • Tail classes (Other financial service, Consumer Loan, Payday loan, Vehicle loan or lease) have low F1 (0.46 or below). Predictions on these should be treated with lower confidence.
  • Max input length is 1024 tokens. Longer complaints are truncated.
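
One way to act on the lower confidence of tail-class predictions is to gate routing on the score. The sketch below takes one item of the pipeline output and sends it to manual review when the score falls below a threshold, with a stricter threshold for the low-F1 classes. The threshold values, the `manual_review` queue name, and the `route` helper are hypothetical and would need tuning on held-out data.

```python
# Classes with low F1 in the per-class results.
WEAK_LABELS = {
    "Other financial service",
    "Consumer Loan",
    "Payday loan",
    "Vehicle loan or lease",
}

DEFAULT_THRESHOLD = 0.60  # hypothetical value, not tuned
WEAK_THRESHOLD = 0.85     # hypothetical stricter value for weak classes
FALLBACK_QUEUE = "manual_review"  # hypothetical queue name


def route(prediction):
    """Map one pipeline result ({'label': str, 'score': float}) to a queue.

    Returns the predicted label as the queue name when the score clears
    the threshold for that label, otherwise the manual-review fallback.
    """
    label = prediction["label"]
    score = prediction["score"]
    threshold = WEAK_THRESHOLD if label in WEAK_LABELS else DEFAULT_THRESHOLD
    return label if score >= threshold else FALLBACK_QUEUE


# Hypothetical predictions in the same shape the pipeline returns.
print(route({"label": "Mortgage", "score": 0.95}))
print(route({"label": "Consumer Loan", "score": 0.70}))
```

Keeping the thresholds per group rather than global lets the strong classes be routed automatically while the weak ones get a human check more often.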

Citation

@misc{derue2026modernbert-finance,
  author = {Derue, William},
  title = {ModernBERT-large Fine-tuned for Consumer Finance Complaint Classification},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/WillisBack/modernbert-large-consumer-finance-11cls}
}