Paper: Learning to Reason in 13 Parameters (2602.04118)
Llama 3.2 1B Instruct, fine-tuned with TinyLoRA to make Texas Hold'em poker decisions. The adapter is only 470 KB!
Evaluated on PokerBench test sets:
| | Preflop | Postflop |
|---|---|---|
| Base Model | 7% | 13% |
| Fine-tuned | 36% | 60% |
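
The card does not say how the percentages above are scored. As a rough sketch only, assuming they are exact-match accuracy between the model's chosen action and the reference action, a scorer could look like the following. The function names, the normalization rule, and the toy data are all hypothetical and are not taken from PokerBench or from this model's evaluation code.

```python
def normalize(action: str) -> str:
    """Lowercase and collapse whitespace so 'Raise  10' matches 'raise 10'."""
    return " ".join(action.lower().split())


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions equal to the reference action after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one example")
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical toy data, not taken from PokerBench.
preds = ["Fold", "call", "raise 10", "check"]
refs = ["fold", "call", "raise 12", "bet 5"]
score = exact_match_accuracy(preds, refs)
print(f"{score:.0%}")
```

Under this assumption a bet or raise with the wrong size counts as a miss, so a more lenient metric that only compares the action type would give higher numbers.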
Base model: meta-llama/Llama-3.2-1B-Instruct