Model Card for shisa-ai/shisa-v2.1c-lfm2-350m

SOTA Japanese Shaberi Benchmarks @ <0.5B and <1B!

This model was made just for fun on a Saturday for the Liquid AI Hackathon, but it also serves as an early preview of our upcoming V2.1 models...

For full code and related evals see:

Presentation here:

| Model | Average | ELYZA 100 | JA-MT | Rakuda | Tengu |
|-------|---------|-----------|-------|--------|-------|
| google/gemma-3-4b-it | 6.44 | 7.34 | 6.78 | 5.68 | 5.97 |
| 045-llama3.2-1b-v2new-dpo405b | 5.40 | 5.44 | 5.22 | 6.35 | 4.61 |
| 037-rakuten-2.0-mini-instruct-1.5b-v2new-dpo405b | 5.10 | 5.42 | 4.60 | 5.68 | 4.70 |
| augmxnt/shisa-gamma-7b-v1 | 4.80 | 5.86 | 4.07 | 4.55 | 4.72 |
| shisa-ai/shisa-v2.1c-lfm2-350m | 4.51 | 4.30 | 4.75 | 5.03 | 3.95 |
| meta-llama/Llama-3.2-3B-Instruct | 4.49 | 5.62 | 4.50 | 3.43 | 4.43 |
| Qwen/Qwen3-0.6B | 4.14 | 5.16 | 4.00 | 3.18 | 4.23 |
| augmxnt/shisa-7b-v1 | 3.95 | 4.36 | 3.75 | 3.88 | 3.83 |
| shisa-ai/shisa-v2.1c-lfm2-350m-sft3-tlonly | 3.87 | 3.78 | 3.70 | 4.50 | 3.51 |
| LiquidAI/LFM2-350M | 3.76 | 3.92 | 4.07 | 3.55 | 3.51 |
| meta-llama/Llama-3.2-1B-Instruct | 2.97 | 3.82 | 2.82 | 2.45 | 2.79 |
| google/gemma3-270m-it | 2.53 | 3.42 | 2.33 | 2.10 | 2.28 |
| LiquidAI/LFM2-350M-ENJP-MT | 1.69 | 2.98 | 1.37 | 1.00 | 1.42 |
| tiiuae/Falcon-H1-0.5B-Instruct | 1.30 | 2.32 | 1.47 | 1.00 | 0.41 |
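
The Average column appears to be the unweighted mean of the four benchmark scores; this is inferred from the numbers rather than stated in the card. A quick sanity check in Python against two rows of the table:

```python
from statistics import mean

# Values copied from the table above: (reported average, [ELYZA 100, JA-MT, Rakuda, Tengu]).
table = {
    "shisa-ai/shisa-v2.1c-lfm2-350m": (4.51, [4.30, 4.75, 5.03, 3.95]),
    "google/gemma-3-4b-it": (6.44, [7.34, 6.78, 5.68, 5.97]),
}
for model, (reported_avg, scores) in table.items():
    # The unweighted mean matches the reported average to two decimal places.
    assert abs(mean(scores) - reported_avg) < 0.005, model
```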


Framework versions

  • TRL: 0.23.0
  • Transformers: 4.56.1
  • PyTorch: 2.10.0.dev20251008+cu130
  • Datasets: 4.2.0
  • Tokenizers: 0.22.1
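
For reference, here is a minimal inference sketch, assuming the standard transformers chat-template flow with the versions listed above; the prompt and generation settings are illustrative, not from the original card:

```python
# Minimal inference sketch (illustrative; assumes the standard
# transformers chat-template flow with the versions listed above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shisa-ai/shisa-v2.1c-lfm2-350m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Example Japanese prompt (hypothetical): "What is the capital of Japan?"
messages = [{"role": "user", "content": "日本の首都はどこですか？"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```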

Compute

This model was trained on an 8xMI300X node on the AMD Developer Cloud with compute generously sponsored by AMD.
