Instructions for using FrontiersMind/Nandi-Mini-600M-Early-Checkpoint with libraries and local apps.

Libraries

How to use FrontiersMind/Nandi-Mini-600M-Early-Checkpoint with Transformers:

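# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="FrontiersMind/Nandi-Mini-600M-Early-Checkpoint", trust_remote_code=True)
# Generate a short continuation from a plain-text prompt
print(pipe("Once upon a time,", max_new_tokens=64)[0]["generated_text"])

# Load model directly (the tokenizer is loaded alongside so prompts can be encoded)
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("FrontiersMind/Nandi-Mini-600M-Early-Checkpoint", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("FrontiersMind/Nandi-Mini-600M-Early-Checkpoint", trust_remote_code=True, dtype="auto")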

Local Apps

vLLM

How to use FrontiersMind/Nandi-Mini-600M-Early-Checkpoint with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "FrontiersMind/Nandi-Mini-600M-Early-Checkpoint"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FrontiersMind/Nandi-Mini-600M-Early-Checkpoint",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
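
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the server returns the usual OpenAI-style completions response (`choices[0].text`); the helper names `build_completion_request` and `complete` are illustrative and not part of vLLM or SGLang.

```python
import json
import urllib.request

MODEL_ID = "FrontiersMind/Nandi-Mini-600M-Early-Checkpoint"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request mirroring the curl example's JSON payload."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text.

    Assumption: the response follows the OpenAI completions shape,
    i.e. a JSON object with a "choices" list whose items carry "text".
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` sends the same request as the curl command. For a server listening on port 30000, pass `base_url="http://localhost:30000"`.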

Use Docker

docker model run hf.co/FrontiersMind/Nandi-Mini-600M-Early-Checkpoint

SGLang

How to use FrontiersMind/Nandi-Mini-600M-Early-Checkpoint with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "FrontiersMind/Nandi-Mini-600M-Early-Checkpoint" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FrontiersMind/Nandi-Mini-600M-Early-Checkpoint",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "FrontiersMind/Nandi-Mini-600M-Early-Checkpoint" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FrontiersMind/Nandi-Mini-600M-Early-Checkpoint",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use FrontiersMind/Nandi-Mini-600M-Early-Checkpoint with Docker Model Runner:
```
docker model run hf.co/FrontiersMind/Nandi-Mini-600M-Early-Checkpoint
```

vishesh-t27 committed 8 days ago

Commit ede3ae3 (verified), 1 Parent(s): 0f6bcc6

Update README.md

Files changed (1)

README.md +19 -42

@@ -49,24 +49,6 @@ Stay tuned!
 ---
-## Model Overview
-**Repository:** `FrontiersMind/Nandi-mini-500M-Early-Checkpoint`
-### Model Details
-- Type: Causal Language Model
-- Training Stage: Early Pretraining Checkpoint
-- Parameters: ~500M
-- Architecture: Transformer decoder
-- Positional Encoding: RoPE
-- Normalization: RMSNorm + QK Norm
-- Activation: SwiGLU
-- Attention: GQA + Shared KV
-- Embeddings: Tied embeddings with factorized design
-- Context Length: 2,048 tokens
-- Vocabulary Size: 131,072
 ### Architectural Highlights
@@ -112,20 +94,21 @@ This remains an active research area within the Nandi model family, and we plan
 ---
-## 🌍 Supported Languages
-The model is trained on English and multiple Indic languages, including:
-- Hindi
-- Bengali
-- Tamil
-- Telugu
-- Marathi
-- Gujarati
-- Kannada
-- Malayalam
-- Punjabi
-- Odia
 ---
@@ -133,7 +116,7 @@ The model is trained on English and multiple Indic languages, including:
 ## General Benchmarks
-| Model | Budget (T Tokens) | HellaSwag | WinoGrande | OBQA | PIQA | GPQA | ARC-e | ARC-c | MMLU | Average |
 |---|---|---|---|---|---|---|---|---|---|---|
 | MobiLlama-0.5B-Base | 1.3 | 39.65 | 53.67 | 30.60 | 70.35 | 24.33 | 52.82 | 23.63 | 24.18 | 39.90 |
 | Qwen-2-0.5B-Base | 12 | 49.01 | 57.69 | 33.20 | 68.98 | 27.23 | 54.79 | 25.42 | 44.06 | 45.05 |
@@ -142,7 +125,7 @@ The model is trained on English and multiple Indic languages, including:
 | Qwen3.5-0.8B-Base | 36 | 54.87 | 60.54 | 35.80 | 70.02 | 31.25 | 70.50 | 38.23 | 52.73 | 51.74 |
 | SmolLM-360M-Base | 0.6 | 53.33 | 57.22 | 37.60 | 70.56 | 21.20 | 70.24 | 33.27 | 24.92 | 46.04 |
 | SmolLM2-360M-Base | 4 | 56.30 | 59.19 | 37.60 | 71.81 | 25.22 | 67.88 | 36.68 | 25.55 | 47.53 |
-| **Nandi-Mini-500M-Early-Checkpoint** | **0.5** | **44.86** | **54.77** | **34.80** | **68.60** | **26.33** | **64.73** | **29.70** | **29.01** | **44.10** |
 ---
@@ -164,20 +147,14 @@ The model is trained on English and multiple Indic languages, including:
 | Telugu    | 15.40 | 13.38 | 2.09 | **1.77** |
 | Assamese  | 9.26 | 8.13 | 4.31 | **1.51** |
-### Why Fertility Matters
-Lower fertility scores indicate more efficient tokenization, meaning fewer tokens are needed to represent text in a language.
-This leads to:
-- Better context utilization
-- Lower inference cost
-- Reduced latency
-- Improved multilingual efficiency
-Nandi-Mini’s tokenizer is heavily optimized for Indic languages and demonstrates strong compression efficiency across several scripts.
----
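
For readers of the removed "Why Fertility Matters" passage above, here is a minimal sketch of how a fertility score can be computed. It assumes the common definition (average number of tokens per whitespace-delimited word), which the README does not state explicitly. `toy_tokenize` is a hypothetical stand-in tokenizer used only to keep the example self-contained; it is not the Nandi tokenizer.

```python
def fertility(texts, tokenize):
    """Average number of tokens per whitespace-delimited word over `texts`.

    Assumed definition: total tokens / total words. Lower is better,
    since fewer tokens are spent per word.
    """
    total_tokens = 0
    total_words = 0
    for text in texts:
        total_tokens += len(tokenize(text))
        total_words += len(text.split())
    if total_words == 0:
        raise ValueError("texts contain no words")
    return total_tokens / total_words


def toy_tokenize(text, chunk=3):
    """Hypothetical tokenizer: cut each word into pieces of up to `chunk` characters."""
    return [
        word[i:i + chunk]
        for word in text.split()
        for i in range(0, len(word), chunk)
    ]
```

To measure a real tokenizer, pass its tokenization function instead, for example the `tokenize` method of a tokenizer loaded with Transformers' `AutoTokenizer`.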
 # 🚀 Usage

 ---
 ### Architectural Highlights
 ---
+### Model Details
+- Type: Causal Language Model
+- Training Stage: Early Pretraining Checkpoint
+- Parameters: ~500M
+- Architecture: Transformer decoder
+- Positional Encoding: RoPE
+- Normalization: RMSNorm + QK Norm
+- Activation: SwiGLU
+- Attention: GQA + Shared KV
+- Embeddings: Tied embeddings with factorized design
+- Context Length: 2,048 tokens
+- Vocabulary Size: 131,072
 ---
 ## General Benchmarks
+| Model | Trained Tokens | HellaSwag | WinoGrande | OBQA | PIQA | GPQA | ARC-e | ARC-c | MMLU | Average |
 |---|---|---|---|---|---|---|---|---|---|---|
 | MobiLlama-0.5B-Base | 1.3 | 39.65 | 53.67 | 30.60 | 70.35 | 24.33 | 52.82 | 23.63 | 24.18 | 39.90 |
 | Qwen-2-0.5B-Base | 12 | 49.01 | 57.69 | 33.20 | 68.98 | 27.23 | 54.79 | 25.42 | 44.06 | 45.05 |
 | Qwen3.5-0.8B-Base | 36 | 54.87 | 60.54 | 35.80 | 70.02 | 31.25 | 70.50 | 38.23 | 52.73 | 51.74 |
 | SmolLM-360M-Base | 0.6 | 53.33 | 57.22 | 37.60 | 70.56 | 21.20 | 70.24 | 33.27 | 24.92 | 46.04 |
 | SmolLM2-360M-Base | 4 | 56.30 | 59.19 | 37.60 | 71.81 | 25.22 | 67.88 | 36.68 | 25.55 | 47.53 |
+| **Nandi-Mini-600M-Early-Checkpoint-Base** | **0.2** | 44.86 | 54.77 | 34.80 | 68.60 | 26.33 | 64.73 | 29.70 | 29.01 | 44.10 |
 ---
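
The Average column in the benchmark rows above is consistent with an unweighted mean of the eight benchmark scores. The check below is a small sketch that recomputes it from the values shown in the table; the `check_averages` helper and the rounding tolerance are illustrative choices.

```python
from statistics import mean

# Scores in table order: HellaSwag, WinoGrande, OBQA, PIQA, GPQA, ARC-e, ARC-c, MMLU.
# Each entry maps a model to (scores, reported average), copied from the rows above.
ROWS = {
    "MobiLlama-0.5B-Base": ([39.65, 53.67, 30.60, 70.35, 24.33, 52.82, 23.63, 24.18], 39.90),
    "Qwen-2-0.5B-Base": ([49.01, 57.69, 33.20, 68.98, 27.23, 54.79, 25.42, 44.06], 45.05),
    "Qwen3.5-0.8B-Base": ([54.87, 60.54, 35.80, 70.02, 31.25, 70.50, 38.23, 52.73], 51.74),
    "SmolLM-360M-Base": ([53.33, 57.22, 37.60, 70.56, 21.20, 70.24, 33.27, 24.92], 46.04),
    "SmolLM2-360M-Base": ([56.30, 59.19, 37.60, 71.81, 25.22, 67.88, 36.68, 25.55], 47.53),
    "Nandi-Mini-600M-Early-Checkpoint-Base": ([44.86, 54.77, 34.80, 68.60, 26.33, 64.73, 29.70, 29.01], 44.10),
}


def check_averages(rows, tolerance=0.005):
    """Recompute each model's mean score and compare it with the reported average.

    `tolerance` allows for the table rounding averages to two decimals.
    Returns {model: recomputed mean}; raises ValueError on a mismatch.
    """
    recomputed = {}
    for model, (scores, reported) in rows.items():
        value = mean(scores)
        if abs(value - reported) > tolerance + 1e-9:
            raise ValueError(f"{model}: mean {value:.4f} != reported {reported}")
        recomputed[model] = value
    return recomputed
```

Calling `check_averages(ROWS)` completes without raising for all six rows.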
 | Telugu    | 15.40 | 13.38 | 2.09 | **1.77** |
 | Assamese  | 9.26 | 8.13 | 4.31 | **1.51** |
+---
+## 🌍 Supported Languages
+The model is trained on English and a diverse set of Indic languages, including:
+- Hindi, Bengali, Tamil, Telugu, Marathi, Gujarati, Kannada, Malayalam, Punjabi, Odia
 # 🚀 Usage