ModernBERT-large-madon-formalism

This model accompanies the paper Mining Legal Arguments to Study Judicial Formalism. It is part of the MADON project, which studies judicial reasoning in decisions of the Czech Supreme Courts.

Model Description

This model is based on the ModernBERT architecture and was adapted to the Czech legal domain through continued pretraining on a corpus of over 300,000 Czech court decisions. It is then fine-tuned for holistic formalism classification: given the full text of a judicial decision, it predicts whether the decision is formalistic or non-formalistic.

  • Task: Holistic formalism classification
  • Language: Czech
  • Dataset: MADON (Czech Supreme Court decisions)
  • Repository: trusthlt/madon

Usage

To classify Czech court decisions as formalistic or non-formalistic, load the model and tokenizer and wrap them in a text-classification pipeline:

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model = AutoModelForSequenceClassification.from_pretrained("TrustHLT/ModernBERT-large-madon-formalism")
tokenizer = AutoTokenizer.from_pretrained("TrustHLT/ModernBERT-large-madon-formalism")

pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)

# The expected input is the full text of a Czech court decision
text = "Dovolání se zamítá..."  # Truncated placeholder; pass the full decision text in practice

print(pipe(text))
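
When several decisions are classified at once, it can help to keep each prediction next to its document and to flag low-confidence results for manual review. The following is a minimal sketch, not part of the released code. It relies only on the fact that a text-classification pipeline returns one `{"label": ..., "score": ...}` dictionary per input; the helper `classify_documents`, the `min_score` threshold, the stand-in `fake_pipe`, and the label name `LABEL_0` are all hypothetical and chosen for illustration. Check the model's own label names before relying on them.

```python
from typing import Any, Callable, Dict, List

# One prediction, in the shape a text-classification pipeline returns:
# {"label": str, "score": float}
Prediction = Dict[str, Any]


def classify_documents(
    classify: Callable[[List[str]], List[Prediction]],
    documents: List[str],
    min_score: float = 0.5,
) -> List[Dict[str, Any]]:
    """Pair each document with its prediction and flag low-confidence results.

    `classify` is any callable that maps a list of texts to a list of
    predictions, for example a transformers pipeline.
    `min_score` is a hypothetical review threshold, not a value from the paper.
    """
    predictions = classify(documents)
    results = []
    for doc, pred in zip(documents, predictions):
        results.append(
            {
                "preview": doc[:40],  # short excerpt to identify the document
                "label": pred["label"],
                "score": pred["score"],
                "confident": pred["score"] >= min_score,
            }
        )
    return results


def fake_pipe(texts: List[str]) -> List[Prediction]:
    """Stand-in for the real pipeline so the sketch runs without a download.

    The label name and score are made up; the real model may use other labels.
    """
    return [{"label": "LABEL_0", "score": 0.91} for _ in texts]


if __name__ == "__main__":
    docs = ["Dovolání se zamítá.", "Rozsudek se zrušuje."]
    for row in classify_documents(fake_pipe, docs, min_score=0.95):
        print(row)
```

With the real model, the pipeline object from the snippet above can be passed in place of `fake_pipe`, for example `classify_documents(lambda docs: pipe(docs, truncation=True), [text])`. Passing `truncation=True` is an assumption about how over-long decisions should be handled; it cuts the input at the tokenizer's maximum length rather than failing.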

Citation

If you use this model or the MADON dataset in your research, please cite the following paper:

@article{madon2025mining,
  title={Mining Legal Arguments to Study Judicial Formalism},
  author={Kocián, Michal and Šavelka, Jaromír and Moravčík, Jakub and Gavenčiak, Tomáš and Harašta, Jakub and Štefánik, Michal},
  journal={arXiv preprint arXiv:2512.11374},
  year={2025}
}