---
license: cc-by-nc-sa-4.0
---

We release the suite of models trained as part of our work on scaling laws of decoder-only machine translation systems. This work was published at WMT24 and is available [here](https://aclanthology.org/2024.wmt-1.124/).

These models were trained on a mixture of general and financial sentences covering 11 language directions. They support 8 languages (English, French, German, Italian, Spanish, Dutch, Swedish and Portuguese) and 9 domains (general + 8 financial subdomains). They are not tailored for document-level translation.

A running demo of these models is available on [our dedicated space](https://huggingface.co/spaces/DragonLLM/FinTranslate-Demo).

## Evaluation

The table below details the performance of our models on general-domain translation.

| Model | BLEU | COMET | COMET-Kiwi |
| ------------------- | --------- | --------- | ---------- |
| FinTranslate-70M | 29.62 | 81.31 | 80.72 |
| FinTranslate-160M | 32.43 | 84.00 | 83.45 |
| FinTranslate-410M | 33.60 | 84.81 | 84.14 |
| FinTranslate-Bronze | 34.08 | 85.10 | 84.35 |
| FinTranslate-Silver | 34.42 | 85.10 | 84.33 |
| FinTranslate-Gold | **36.07** | 85.88 | 84.82 |
| | | | |
| Llama 3.1 8B | 30.43 | 84.82 | 84.47 |
| Mistral 7B | 23.26 | 80.08 | 82.29 |
| Tower 7B | 33.50 | **85.91** | **85.02** |

The table below details the performance of our models on financial translation.
| Model | BLEU | COMET | COMET-Kiwi |
| ------------------- | --------- | --------- | ---------- |
| FinTranslate-70M | 44.63 | 86.95 | 80.88 |
| FinTranslate-160M | 49.02 | 88.27 | 81.80 |
| FinTranslate-410M | 50.85 | 88.64 | 81.73 |
| FinTranslate-Bronze | 52.00 | 88.85 | 81.71 |
| FinTranslate-Silver | 53.28 | **89.98** | 81.61 |
| FinTranslate-Gold | **58.34** | 89.62 | 81.35 |
| | | | |
| Llama 3.1 8B | 34.99 | 84.42 | 81.75 |
| Mistral 7B | 38.93 | 76.52 | 76.17 |
| Tower 7B | 38.93 | 86.49 | **82.66** |

## How to use it

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

LANGUAGES = ["en", "de", "es", "fr", "it", "nl", "sv", "pt"]
DOMAINS = {
    "Asset management marketing": "am",
    "Annual report": "ar",
    "Corporate action": "corporateAction",
    "Equity research": "equi",
    "Fund fact sheet": "ffs",
    "Kiid": "kiid",
    "Life insurance": "lifeInsurance",
    "Regulatory": "regulatory",
    "General": "general",
}


def language_token(lang):
    # Language control token (verify the exact spelling against the tokenizer vocabulary)
    return f"<{lang}>"


def domain_token(dom):
    # Domain control token (verify the exact spelling against the tokenizer vocabulary)
    return f"<{dom}>"


def format_input(src, tgt_lang, src_lang, domain):
    assert tgt_lang in LANGUAGES
    tgt_lang_token = language_token(tgt_lang)
    # Please read our paper to understand why we need to suffix the input
    # with the target-language token
    base_input = f"{src}{tgt_lang_token}"
    if src_lang is None:
        return base_input
    else:
        assert src_lang in LANGUAGES
        src_lang_token = language_token(src_lang)
        base_input = f"{base_input}{src_lang_token}"
        if domain is None:
            return base_input
        else:
            domain = DOMAINS.get(domain, "general")
            dom_token = domain_token(domain)
            base_input = f"{base_input}{dom_token}"
            return base_input


model_id = "DragonLLM/FinTranslate-Silver"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

source_sentence = "Dragon LLM est une entreprise française spécialisée dans le domaine de l'IA générative."
formatted_sentence = format_input(source_sentence, "en", "fr", "General")
inputs = tokenizer(formatted_sentence, return_tensors="pt", return_token_type_ids=False)
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens, skipping the prompt
input_size = inputs["input_ids"].size(1)
translated_sentence = tokenizer.decode(
    outputs[0, input_size:], skip_special_tokens=True
)
print(translated_sentence)
# Dragon LLM is a French company specialized in the field of generative AI.
```

## Citing this work

If you use this model in your work, please cite it as:

```
@inproceedings{caillaut-etal-2024-scaling,
    title = "Scaling Laws of Decoder-Only Models on the Multilingual Machine Translation Task",
    author = {Caillaut, Ga{\"e}tan and Nakhl{\'e}, Mariam and Qader, Raheel and Liu, Jingshu and Barth{\'e}lemy, Jean-Gabriel},
    editor = "Haddow, Barry and Kocmi, Tom and Koehn, Philipp and Monz, Christof",
    booktitle = "Proceedings of the Ninth Conference on Machine Translation",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.wmt-1.124/",
    doi = "10.18653/v1/2024.wmt-1.124",
    pages = "1318--1331"
}
```
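As a quick sanity check of the input formatting, the standalone sketch below mirrors the `format_input` logic from the usage snippet and prints the strings it produces, without downloading the model. The `<xx>` spelling of the control tokens is an assumption here; verify the exact tokens against the model's tokenizer vocabulary (e.g. `tokenizer.get_vocab()`).

```python
# Standalone sketch of the input-formatting logic (no model download needed).
# Assumption: control tokens are spelled "<en>", "<fr>", "<general>", etc.
LANGUAGES = ["en", "de", "es", "fr", "it", "nl", "sv", "pt"]
DOMAINS = {"General": "general", "Annual report": "ar"}  # abridged mapping


def format_input(src, tgt_lang, src_lang=None, domain=None):
    assert tgt_lang in LANGUAGES
    out = f"{src}<{tgt_lang}>"  # the target-language token is always appended
    if src_lang is not None:
        assert src_lang in LANGUAGES
        out += f"<{src_lang}>"  # optional source-language hint
        if domain is not None:
            # unknown domains fall back to the general domain
            out += f"<{DOMAINS.get(domain, 'general')}>"
    return out


print(format_input("Hello world.", "fr"))
# Hello world.<fr>
print(format_input("Hello world.", "fr", "en", "Annual report"))
# Hello world.<fr><en><ar>
```

Note that, as in the snippet above, the domain token is only appended when a source language is also given.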