---
title: LLiMba 3B Demo
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.14.0
app_file: app.py
pinned: false
disable_embedding: true
license: apache-2.0
short_description: Chat with LLiMba, an open 3B LLM that speaks Sardinian
hardware: zero-a10g
models:
  - lballore/llimba-3b-instruct
tags:
  - sardinian
  - limba-sarda-comuna
  - lsc
  - low-resource
  - endangered-language
  - chat
  - demo
---
# 💬 LLiMba 3B Demo

A live demo of [**LLiMba-3B-Instruct**](https://huggingface.co/lballore/llimba-3b-instruct), an open 3B-parameter Sardinian-speaking language model adapted from Qwen2.5-3B-Instruct on a single consumer GPU.

LLiMba speaks fluent **Sardinian** (LSC, the standardized written form), accepts **Logudorese** and **Campidanese** input, and retains the multilingual capabilities of its Qwen2.5 base, so it also handles English, Italian, Spanish, and other Romance languages.
## Try these prompts

If you don't speak Sardinian, here are starters that work well:

**Conversation:**

- *Salude! Comente ìstas?* (Hi! How are you?)

**Translation:**

- *Translate to Sardinian: "The Mediterranean is rough today."*
- *Traduzi in italianu: «Sa Sardigna est una ìsula bella meda in su Mediterràneu.»* (Translate into Italian: "Sardinia is a very beautiful island in the Mediterranean.")

**Sardinian culture and history:**

- *Chie fiat Gigi Riva?* (Who was Gigi Riva?)
- *Ite est su «cantu a tenore» sardu?* (What is Sardinian *cantu a tenore*?)

**Open-ended:**

- *Iscrie unu paragrafu in sardu subra de sa Sardigna.* (Write a paragraph in Sardinian about Sardinia.)
## Inference settings

The sidebar exposes the standard generation parameters. Defaults are tuned for natural Sardinian conversation; the model card recommends three preset profiles depending on use case:

| Use case | Temperature | Top-p | Top-k | Repetition penalty |
|---|---:|---:|---:|---:|
| Translation, factual Q&A | 0.0 (greedy) | 1.0 | 1 | 1.0 |
| Conversational chat (default) | 0.3 | 0.9 | 40 | 1.05 |
| Creative or long-form | ≥ 0.5 | 0.9 | 40 | 1.1 |

Temperatures above 0.7 may cause language-boundary drift (Sardinian to Italian) and amplify morphological hallucination on long open-ended prompts. The model was trained with Romance replay data to mitigate this, but the safe upper bound for production use is around 0.7.
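If you call the demo programmatically, the three profiles are convenient to keep as a small settings map. A minimal sketch — the profile names and helper function are illustrative, only the numeric values come from the table above:

```python
# Preset sampling profiles; the numbers mirror the recommendations above.
# The dict keys and the helper name are illustrative, not part of the demo's API.
PROFILES = {
    "translation": {"temperature": 0.0, "top_p": 1.0, "top_k": 1, "repetition_penalty": 1.0},
    "chat":        {"temperature": 0.3, "top_p": 0.9, "top_k": 40, "repetition_penalty": 1.05},
    "creative":    {"temperature": 0.5, "top_p": 0.9, "top_k": 40, "repetition_penalty": 1.1},
}

def generation_kwargs(profile: str = "chat", max_tokens: int = 512) -> dict:
    """Return keyword arguments in the shape the /chat examples below use."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return {"max_tokens": max_tokens, **PROFILES[profile]}

print(generation_kwargs("chat"))
```

The returned dict can be unpacked straight into a `client.predict(**generation_kwargs("chat"), ...)` call.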
## About the project

LLiMba is an open project to bring Sardinian, an endangered Romance language with roughly one million speakers, into modern NLP. The full release includes:

- 🤖 **Model:** [lballore/llimba-3b-instruct](https://huggingface.co/lballore/llimba-3b-instruct)
- 🤖 **Intermediate checkpoint:** [lballore/llimba-3b-instruct-cpt](https://huggingface.co/lballore/llimba-3b-instruct-cpt) (post-CPT, pre-SFT, for researchers)
- 📚 **Pretraining corpus:** [lballore/llimba-corpus](https://huggingface.co/datasets/lballore/llimba-corpus) (~13.9M tokens)
- 📚 **SFT data:** [lballore/llimba-sft](https://huggingface.co/datasets/lballore/llimba-sft) (~14K instruction pairs)
- 📚 **Eval set:** [lballore/llimba-flores-srd-eval](https://huggingface.co/datasets/lballore/llimba-flores-srd-eval) (FLORES-200 subset)
- 💻 **Code:** [github.com/lballore/LLiMba](https://github.com/lballore/LLiMba)
- 📖 **Paper:** [LLiMba: Sardinian on a Single GPU](https://arxiv.org/abs/2605.09015)

The model was adapted via continued pretraining on Sardinian text (with Romance replay data to prevent language drift), followed by supervised fine-tuning with rsLoRA on instruction pairs. Full methodology and benchmarks are in the [model card](https://huggingface.co/lballore/llimba-3b-instruct).
## API access

The Space's `gradio_client` API is gated behind an allowlist. Visiting this demo page in a browser works without any token; programmatic access via `gradio_client` or HTTP requires a token assigned by the project maintainer.

If you'd like to integrate LLiMba into a research project or non-commercial application, [open a GitHub issue](https://github.com/lballore/LLiMba/issues) describing your use case. Approved integrators receive a token to pass with each call.

Once you have a token, the calling pattern depends on your language.

**Python** with [`gradio_client`](https://www.gradio.app/guides/getting-started-with-the-python-client):
```python
from gradio_client import Client

client = Client("lballore/llimba-demo")
result = client.predict(
    message="Hello! Please respond in Sardinian.",
    system_message="Ses unu assistente chi chistionat in sardu.",
    max_tokens=512,
    temperature=0.3,
    top_p=0.9,
    top_k=40,
    repetition_penalty=1.05,
    api_token="<YOUR_ASSIGNED_TOKEN>",
    api_name="/chat",
)
print(result)
```
**JavaScript / TypeScript** with [`@gradio/client`](https://www.gradio.app/guides/getting-started-with-the-js-client) (Node.js or any server-side runtime; never call from the browser, where the token would be exposed):

```typescript
import { Client } from "@gradio/client";

const client = await Client.connect("lballore/llimba-demo");
const result = await client.predict("/chat", {
  message: "Hello! Please respond in Sardinian.",
  system_message: "Ses unu assistente chi chistionat in sardu.",
  max_tokens: 512,
  temperature: 0.3,
  top_p: 0.9,
  top_k: 40,
  repetition_penalty: 1.05,
  api_token: process.env.LLIMBA_API_TOKEN,
});
console.log(result.data);
```
**Other languages** (PHP, Go, Ruby, Rust, etc.) can call the underlying HTTP API directly. The pattern is two requests: `POST /gradio_api/call/chat` with the inputs as a `data` array (returns an `event_id`), then `GET /gradio_api/call/chat/{event_id}` as Server-Sent Events to receive the response. Inputs in the `data` array must be in the same order as the Python and JavaScript examples above.
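The two-request flow can be sketched as URL and payload construction, usable from any HTTP client. In the sketch below, the Space hostname follows the usual `<owner>-<space>.hf.space` pattern and is an assumption, as are the example input values; the endpoint paths and `data`-array shape are the ones described above:

```python
import json

# Assumed Space URL (standard <owner>-<space>.hf.space pattern; verify on the Space page).
SPACE = "https://lballore-llimba-demo.hf.space"

def chat_request(message: str, token: str) -> tuple[str, str]:
    """Step 1: endpoint and JSON body for POST /gradio_api/call/chat.

    The data array must follow the same input order as the Python/JS examples.
    """
    body = {
        "data": [
            message,                                        # message
            "Ses unu assistente chi chistionat in sardu.",  # system_message
            512,     # max_tokens
            0.3,     # temperature
            0.9,     # top_p
            40,      # top_k
            1.05,    # repetition_penalty
            token,   # api_token
        ]
    }
    return f"{SPACE}/gradio_api/call/chat", json.dumps(body)

def chat_result_url(event_id: str) -> str:
    """Step 2: endpoint whose SSE stream carries the model's response."""
    return f"{SPACE}/gradio_api/call/chat/{event_id}"
```

POST the body to the first URL, read `event_id` from the JSON response, then stream the second URL as Server-Sent Events.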
The translation endpoint follows the same pattern with `api_name="/translate"` and source/target language arguments instead of a system message; full input schemas are visible on the Space's "Use via API" tab.

Tokens are subject to revocation if usage patterns suggest abuse or significantly degrade the demo's responsiveness for casual visitors.
## Limitations

This is a 3B-parameter model and, like all small LLMs, it can produce confident wrong answers, especially on biographical or historical specifics. Treat factual outputs with skepticism.

On long open-ended generation, the model may occasionally produce phonotactically valid but non-attested Sardinian words. Structured prompts ("List the three main causes") work more reliably than open-ended ones ("Tell me about X").

The model targets LSC and was reviewed by a single native speaker (the author). Logudorese and Campidanese input is handled, but speakers of those variants may find that outputs skew toward the standardized form.
## License

Both the demo code and the model weights are released under [Apache 2.0](LICENSE).