---
title: LLiMba 3B Demo
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.14.0
app_file: app.py
pinned: false
disable_embedding: true
license: apache-2.0
short_description: Chat with LLiMba, an open 3B LLM that speaks Sardinian
hardware: zero-a10g
models:
  - lballore/llimba-3b-instruct
tags:
  - sardinian
  - limba-sarda-comuna
  - lsc
  - low-resource
  - endangered-language
  - chat
  - demo
---

# 💬 LLiMba 3B Demo

A live demo of [**LLiMba-3B-Instruct**](https://huggingface.co/lballore/llimba-3b-instruct), an open 3B-parameter Sardinian-speaking language model adapted from Qwen2.5-3B-Instruct on a single consumer GPU.

LLiMba speaks fluent **Sardinian** (LSC, the standardized written form), accepts **Logudorese** and **Campidanese** input, and retains the multilingual capabilities of its Qwen2.5 base, so it also handles English, Italian, Spanish, and other Romance languages.

## Try these prompts

If you don't speak Sardinian, here are starters that work well:

**Conversation:**

- *Salude! Comente ìstas?* (Hi! How are you?)

**Translation:**

- *Translate to Sardinian: "The Mediterranean is rough today."*
- *Traduzi in italianu: «Sa Sardigna est una ìsula bella meda in su Mediterràneu.»* (Translate into Italian: "Sardinia is a very beautiful island in the Mediterranean.")

**Sardinian culture and history:**

- *Chie fiat Gigi Riva?* (Who was Gigi Riva?)
- *Ite est su «cantu a tenore» sardu?* (What is Sardinian *cantu a tenore*?)

**Open-ended:**

- *Iscrie unu paragrafu in sardu subra de sa Sardigna.* (Write a paragraph in Sardinian about Sardinia.)

## Inference settings

The sidebar exposes the standard generation parameters.
Defaults are tuned for natural Sardinian conversation; the model card recommends three preset profiles depending on the use case:

| Use case | Temperature | Top-p | Top-k | Repetition penalty |
|---|---:|---:|---:|---:|
| Translation, factual Q&A | 0.0 (greedy) | 1.0 | 1 | 1.0 |
| Conversational chat (default) | 0.3 | 0.9 | 40 | 1.05 |
| Creative or long-form | ≥ 0.5 | 0.9 | 40 | 1.1 |

Temperatures above 0.7 may cause language-boundary drift (Sardinian to Italian) and amplify morphological hallucination on long open-ended prompts. The model was trained with Romance replay data to mitigate this, but the safe upper bound for production use is around 0.7.

## About the project

LLiMba is an open project to bring Sardinian, an endangered Romance language with roughly one million speakers, into modern NLP. The full release includes:

- 🤖 **Model:** [lballore/llimba-3b-instruct](https://huggingface.co/lballore/llimba-3b-instruct)
- 🤖 **Intermediate checkpoint:** [lballore/llimba-3b-instruct-cpt](https://huggingface.co/lballore/llimba-3b-instruct-cpt) (post-CPT, pre-SFT, for researchers)
- 📚 **Pretraining corpus:** [lballore/llimba-corpus](https://huggingface.co/datasets/lballore/llimba-corpus) (~13.9M tokens)
- 📚 **SFT data:** [lballore/llimba-sft](https://huggingface.co/datasets/lballore/llimba-sft) (~14K instruction pairs)
- 📚 **Eval set:** [lballore/llimba-flores-srd-eval](https://huggingface.co/datasets/lballore/llimba-flores-srd-eval) (FLORES-200 subset)
- 💻 **Code:** [github.com/lballore/LLiMba](https://github.com/lballore/LLiMba)
- 📖 **Paper:** [LLiMba: Sardinian on a Single GPU](https://arxiv.org/abs/2605.09015)

The model was adapted via continued pretraining on Sardinian text (with Romance replay data to prevent language drift) followed by supervised fine-tuning with rsLoRA on instruction pairs. Full methodology and benchmarks are in the [model card](https://huggingface.co/lballore/llimba-3b-instruct).
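For programmatic use, the three preset profiles from the inference-settings table above can be expressed as Hugging Face `transformers` `generate()` keyword arguments. This is a sketch; the profile names (`"factual"`, `"chat"`, `"creative"`) are illustrative labels, not identifiers from the model card.

```python
# Sketch: the three preset profiles from the table above as
# transformers `generate()` kwargs. Profile names are illustrative.
PRESETS = {
    # Translation / factual Q&A: deterministic (greedy) decoding.
    "factual": {"do_sample": False, "repetition_penalty": 1.0},
    # Conversational chat (the demo default).
    "chat": {
        "do_sample": True, "temperature": 0.3, "top_p": 0.9,
        "top_k": 40, "repetition_penalty": 1.05,
    },
    # Creative / long-form; keep temperature at or below ~0.7
    # to avoid language-boundary drift.
    "creative": {
        "do_sample": True, "temperature": 0.5, "top_p": 0.9,
        "top_k": 40, "repetition_penalty": 1.1,
    },
}

# Usage, assuming `model` and tokenized `inputs` are already prepared:
# output_ids = model.generate(**inputs, max_new_tokens=512, **PRESETS["chat"])
```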
## API access

The Space's `gradio_client` API is gated behind an allowlist. Visiting this demo page in a browser works without any token; programmatic access via `gradio_client` or HTTP requires a token assigned by the project maintainer. If you'd like to integrate LLiMba into a research project or non-commercial application, [open a GitHub issue](https://github.com/lballore/LLiMba/issues) describing your use case. Approved integrators receive a token to pass with each call.

Once you have a token, the calling pattern depends on your language.

**Python** with [`gradio_client`](https://www.gradio.app/guides/getting-started-with-the-python-client):

```python
from gradio_client import Client

client = Client("lballore/llimba-demo")
result = client.predict(
    message="Hello! Please respond in Sardinian.",
    system_message="Ses unu assistente chi chistionat in sardu.",
    max_tokens=512,
    temperature=0.3,
    top_p=0.9,
    top_k=40,
    repetition_penalty=1.05,
    api_token="YOUR_TOKEN",  # your assigned token
    api_name="/chat",
)
print(result)
```

**JavaScript / TypeScript** with [`@gradio/client`](https://www.gradio.app/guides/getting-started-with-the-js-client) (Node.js or any server-side runtime; never call it from the browser, where the token would be exposed):

```typescript
import { Client } from "@gradio/client";

const client = await Client.connect("lballore/llimba-demo");
const result = await client.predict("/chat", {
  message: "Hello! Please respond in Sardinian.",
  system_message: "Ses unu assistente chi chistionat in sardu.",
  max_tokens: 512,
  temperature: 0.3,
  top_p: 0.9,
  top_k: 40,
  repetition_penalty: 1.05,
  api_token: process.env.LLIMBA_API_TOKEN,
});
console.log(result.data);
```

**Other languages** (PHP, Go, Ruby, Rust, etc.) can call the underlying HTTP API directly. The pattern is two requests: `POST /gradio_api/call/chat` with the inputs as a `data` array (returns an `event_id`), then `GET /gradio_api/call/chat/{event_id}` as Server-Sent Events to receive the response.
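The two-request flow can be sketched in Python using only the standard library. The Space hostname below follows the usual `owner-space.hf.space` convention and is an assumption; confirm the exact URL on the Space's "Use via API" tab.

```python
import json
from urllib import request

# Assumed hostname (owner-space.hf.space convention) - verify it
# on the Space's "Use via API" tab before relying on it.
SPACE = "https://lballore-llimba-demo.hf.space"

# Inputs go in a positional `data` array, in the same order as the
# named parameters in the gradio_client examples above.
data = [
    "Hello! Please respond in Sardinian.",          # message
    "Ses unu assistente chi chistionat in sardu.",  # system_message
    512,                                            # max_tokens
    0.3,                                            # temperature
    0.9,                                            # top_p
    40,                                             # top_k
    1.05,                                           # repetition_penalty
    "YOUR_TOKEN",                                   # api_token
]
payload = json.dumps({"data": data}).encode("utf-8")

# Step 1: POST the inputs; the JSON response carries an event_id.
post = request.Request(
    f"{SPACE}/gradio_api/call/chat",
    data=payload,
    headers={"Content-Type": "application/json"},
)
# event_id = json.loads(request.urlopen(post).read())["event_id"]

# Step 2: GET the result as a Server-Sent Events stream.
# with request.urlopen(f"{SPACE}/gradio_api/call/chat/{event_id}") as sse:
#     for line in sse:
#         print(line.decode("utf-8"), end="")
```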
Inputs in the `data` array must be in the same order as in the Python and JavaScript examples above.

The translation endpoint follows the same pattern with `api_name="/translate"` and source/target language arguments instead of a system message; full input schemas are visible on the Space's "Use via API" tab.

Tokens may be revoked if usage patterns suggest abuse or significantly degrade the demo's responsiveness for casual visitors.

## Limitations

This is a 3B-parameter model, and like all small LLMs it can produce confident wrong answers, especially on biographical or historical specifics. Treat factual outputs with skepticism.

On long open-ended generation, the model may occasionally produce phonotactically valid but non-attested Sardinian words. Structured prompts ("List the three main causes") work more reliably than open-ended ones ("Tell me about X").

The model targets LSC and was reviewed by a single native speaker (the project author). Logudorese and Campidanese input is handled, but speakers of those variants may find outputs skew toward the standardized form.

## License

Both the demo code and the model weights are released under [Apache 2.0](LICENSE).