TOBA LLM

Model description

TOBA LLM is a language model built on the TOBA (Tokenisasi Optimal Berbasis Aglutinasi, "Optimal Agglutination-Based Tokenization") tokenization scheme. The scheme is inspired by the Gasing Literacy Learning System (https://gasingacademy.org/), an educational framework that teaches Indonesian by integrating reading, writing, and pronunciation while accounting for the local characteristics of the language.

TOBA tokenization is designed for the agglutinative structure of Indonesian, in which words are formed by attaching affixes to a root. By combining principles from human literacy education with computational optimization, TOBA LLM aims for efficient, linguistically informed processing of Indonesian text. This makes it suited to tasks that depend on a detailed understanding of Indonesian, such as educational tools, natural language processing applications, and content generation.
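To give a rough intuition for what affix-aware segmentation of an agglutinative word looks like, here is a toy sketch. It is not the TOBA algorithm, which this card does not specify: the affix lists, the `segment` function, and the `min_stem` threshold are all hypothetical and chosen only for illustration.

```python
# Toy illustration only: NOT the TOBA tokenizer. The affix lists below are
# a tiny hypothetical sample, not a complete inventory of Indonesian affixes.
PREFIXES = ("mem", "ber", "di", "ke")
SUFFIXES = ("kan", "nya", "an")


def segment(word, min_stem=4):
    """Split a word into [prefix?, stem, suffix?] using the sample affixes.

    An affix is stripped only if at least `min_stem` characters remain,
    which keeps short roots such as "makan" from being split.
    """
    before, after = [], []
    stem = word

    # Strip at most one prefix from the front of the word.
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) - len(prefix) >= min_stem:
            before.append(prefix)
            stem = stem[len(prefix):]
            break

    # Strip at most one suffix from the end of what is left.
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= min_stem:
            after.append(suffix)
            stem = stem[:-len(suffix)]
            break

    return before + [stem] + after


print(segment("membacakan"))  # ['mem', 'baca', 'kan']
print(segment("dibaca"))      # ['di', 'baca']
print(segment("makan"))       # ['makan']
```

A rule list like this breaks down quickly on real text (it ignores sound changes at affix boundaries, stacked affixes, and loanwords), so it should be read as a picture of the idea rather than a usable tokenizer.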

Usage

The inference script, infer.py, supports two modes: completion and chat.

Setup

Python 3.8 or higher is required. To install the necessary dependencies:

pip install -r requirements.txt

Completion Mode

Generates a continuation of a single input prompt.

python infer.py completion

Once the script is running, enter a prompt in the terminal. The model generates a completion for it.

Chat Mode

Enables multi-turn interaction with the model in a conversational format.

python infer.py chat

The model maintains conversational context across turns. Press Ctrl+C to exit the session.
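As a minimal sketch of how a chat loop can keep context across turns and exit cleanly on Ctrl+C, consider the code below. It does not reproduce infer.py, whose internals are not shown here: `build_prompt`, `chat_turn`, `run_chat`, the `user:`/`assistant:` prompt layout, and the `generate` callable are all assumptions made for illustration.

```python
# Hypothetical sketch: infer.py's real implementation and prompt format
# are not documented here. `generate` stands in for any function that
# maps a prompt string to a reply string.

def build_prompt(history, user_message):
    """Flatten earlier turns plus the new message into a single prompt."""
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {user_message}")
    lines.append("assistant:")
    return "\n".join(lines)


def chat_turn(history, user_message, generate):
    """Run one turn and record both sides of it in `history`."""
    prompt = build_prompt(history, user_message)
    reply = generate(prompt)
    history.append(("user", user_message))
    history.append(("assistant", reply))
    return reply


def run_chat(generate):
    """Read messages from the terminal until the user presses Ctrl+C."""
    history = []
    try:
        while True:
            print(chat_turn(history, input("> "), generate))
    except KeyboardInterrupt:
        print("\nSession ended.")
```

Calling `run_chat(my_generate)` with any prompt-to-text function starts the loop. Because the whole history is re-sent on every turn, the prompt grows with the conversation, so a real implementation would likely need to truncate or summarize older turns to stay within the model's context window.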
