# ⚠️ Bonsai (PrismML): Training Limitation Report

## Status: ❌ NOT SUPPORTED by Unsloth for Fine-Tuning

### What is Bonsai?

[Bonsai](https://prismml.com/) by **PrismML** is an extremely lightweight LLM family using **1-bit ternary quantization**. The 8B-parameter model compresses to approximately **1GB**, making it one of the smallest high-parameter-count models available.

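That headline figure is easy to sanity-check if you take the "1-bit" storage cost at face value (true ternary coding needs closer to 1.6 bits per weight, so treat this as a lower bound):

```python
params = 8e9         # Bonsai-8B parameter count
bits_per_param = 1   # the "1-bit" figure quoted above
print(params * bits_per_param / 8 / 1e9, "GB")  # -> 1.0 GB, matching ~1GB
```
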
- **HF Collection:** https://huggingface.co/collections/prism-ml/bonsai
- **Demo Repo:** https://github.com/PrismML-Eng/Bonsai-demo
- **Architecture:** `Qwen3ForCausalLM` (for the unpacked version)

---

## Why Bonsai Cannot Be Fine-Tuned with Unsloth (or Standard PEFT)

### 1. **1-Bit Ternary Weights Are Incompatible with LoRA**

| Property | Standard Models (Qwen, Gemma, Llama) | Bonsai |
|----------|--------------------------------------|--------|
| Weight precision | FP16/BF16/FP32 | **1-bit ternary** (-1, 0, +1) |
| Quantization | 4-bit (bnb) or 8-bit | **Custom 1-bit kernels** |
| Unsloth support | ✅ Yes | ❌ No |
| LoRA/QLoRA | ✅ Works | ❌ Requires FP16 base weights |
| bitsandbytes | ✅ Compatible | ❌ Incompatible |

**The core issue:** LoRA fine-tuning works by adding small, trainable FP16 matrices (A and B) to frozen base weights, so the frozen layer must still provide a differentiable forward pass (see the sketch after this list). Bonsai's base weights are stored in a custom 1-bit format that:

- Cannot be dequantized to FP16 in a way that supports gradient flow
- Requires PrismML's proprietary CUDA kernels for inference
- Does not have an `AutoModelForCausalLM`-compatible weight format

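To make that concrete, here is a minimal PEFT-style LoRA wrapper, an illustrative sketch in plain PyTorch rather than PrismML or Unsloth code:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal PEFT-style LoRA wrapper: y = base(x) + (alpha/r) * B(A(x))."""

    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base.requires_grad_(False)  # frozen base weights
        self.A = nn.Linear(base.in_features, r, bias=False)
        self.B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)           # adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Even though base is frozen, base(x) must be differentiable w.r.t.
        # its input so gradients can reach adapters in earlier layers.
        # Bonsai's 1-bit kernels only implement an inference-only path.
        return self.base(x) + self.scaling * self.B(self.A(x))
```
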
### 2. **No Unsloth 4-bit Conversion Exists**

We searched the Unsloth model catalog thoroughly:

- **Unsloth HF namespace:** https://huggingface.co/unsloth
- **Search terms:** "bonsai", "prism", "ternary"
- **Result:** **ZERO** Bonsai models in the Unsloth catalog

There are **no** `unsloth-bnb-4bit`- or `unsloth-gemma-4bit`-style conversions for Bonsai because the 1-bit ternary format is fundamentally different from the standard INT4/FP4 quantization that Unsloth and bitsandbytes use.

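This check is easy to re-run with `huggingface_hub` (results reflect the catalog at whatever time you run it):

```python
from huggingface_hub import HfApi

api = HfApi()
for term in ("bonsai", "prism", "ternary"):
    # list_models supports free-text search scoped to one author/namespace
    hits = list(api.list_models(author="unsloth", search=term))
    print(f"{term!r}: {len(hits)} matching model(s) under unsloth/")
```
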
### 3. **Available Bonsai Variants on HF**

| Variant | Size | Fine-Tunable? | Notes |
|---------|------|---------------|-------|
| `prism-ml/Bonsai-1B` | ~1GB | ❌ No | 1-bit weights, custom inference only |
| `prism-ml/Bonsai-8B` | ~1GB packed | ❌ No | Same 1-bit format |
| `prism-ml/Bonsai-8B-unpacked` | ~15GB | ⚠️ Maybe* | Qwen3 architecture, but weights may still be ternary |

*The "unpacked" variant lists `Qwen3ForCausalLM` in its config, but the actual weight tensors are still ternary-encoded. A standard `from_pretrained()` call will fail or produce garbage because the weight files use a custom serialization format.

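A quick way to probe the unpacked checkpoint before investing in a training run (repo id taken from the table above; expect the load itself to error out if the serialization really is custom):

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

repo = "prism-ml/Bonsai-8B-unpacked"

# The config looks like an ordinary Qwen3 model...
print(AutoConfig.from_pretrained(repo).architectures)

# ...but if loading succeeds at all, the weight values tell the real story:
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float32)
w = model.model.layers[0].self_attn.q_proj.weight
print(torch.unique(w).numel())  # a ternary layer has very few distinct values
```
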
### 4. **PrismML's Training Stack**

PrismML has not (as of May 2026) released:

- An open-source fine-tuning framework for Bonsai
- A conversion tool from 1-bit → standard FP16
- LoRA adapter support
- Integration with Hugging Face TRL, PEFT, or Unsloth

The [Bonsai-demo](https://github.com/PrismML-Eng/Bonsai-demo) repository only shows **inference** examples, not training.

---

## What ARE the Options for Extremely Lightweight Models?

If your goal is to fine-tune a very small model on a T4 with minimal VRAM, these **are** supported by Unsloth:

| Model | Params | 4-bit Size | T4 Batch Size | Unsloth Support |
|-------|--------|------------|---------------|-----------------|
| **LFM2.5-1.2B** | 1.2B | ~1GB | **8** | ✅ Excellent |
| **Qwen3.5-0.8B** | 0.8B | ~0.5GB | **8** | ✅ Excellent |
| **Qwen3.5-2B** | 2B | ~1.2GB | **4-8** | ✅ Excellent |
| **Gemma-4 E2B** | ~2B dense | ~7.6GB | **1** | ✅ Tight but works |

These models are **already** extremely small and can be fine-tuned with large batch sizes on a T4. They achieve similar or better compression-to-performance ratios than Bonsai, **with** full training support.

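For contrast with Bonsai, this is roughly what a working setup looks like for any model in the table (the 4-bit repo name below is an assumed example; check the Unsloth catalog for the actual upload names):

```python
from unsloth import FastLanguageModel

# Load a 4-bit base model (repo name is illustrative, not confirmed)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3.5-0.8B-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach standard LoRA adapters -- the step that is impossible on Bonsai
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```
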
---

## Future Possibility

If PrismML releases:

1. A **standard FP16/FP32 checkpoint** of Bonsai (even if larger)
2. Or a **Bonsai → standard-format converter** (a hypothetical sketch follows below)
3. Or adds Bonsai to the Unsloth model catalog

...then we can create a notebook. Until then, **Bonsai fine-tuning with Unsloth/TRL/PEFT is not possible**.

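To show how little item 2 would actually require, here is a hypothetical unpacker; the packing layout is invented for illustration, since PrismML's real format is not public:

```python
import torch

def unpack_ternary(packed: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Hypothetical converter: NOT PrismML's actual format.

    Assumes four ternary weights per uint8 byte, stored as 2-bit codes
    (0 -> -1, 1 -> 0, 2 -> +1), plus a per-row FP16 scale.
    """
    shifts = torch.tensor([0, 2, 4, 6], dtype=torch.uint8)
    codes = (packed.unsqueeze(-1) >> shifts) & 0b11      # (rows, cols//4, 4)
    ternary = codes.to(torch.float16) - 1.0              # map to {-1, 0, +1}
    return ternary.flatten(start_dim=-2) * scale.unsqueeze(-1)
```
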
---

## Sources

- PrismML Bonsai Collection: https://huggingface.co/collections/prism-ml/bonsai
- Bonsai Demo (inference only): https://github.com/PrismML-Eng/Bonsai-demo
- Unsloth Model Catalog: https://unsloth.ai/docs/get-started/unsloth-model-catalog
- PrismML Blog (1-bit ternary): https://byteiota.com/prismml-1-bit-bonsai-llm-14x-smaller-8x-faster/

---

*Last updated: May 2026*