TinyLoRA-TexasHoldEm-Llama-3.2-1B-Instruct

A Llama 3.2 1B Instruct model fine-tuned with TinyLoRA for Texas Hold'em poker decision-making. The resulting adapter is only 470 KB.

Training

  • Base model: meta-llama/Llama-3.2-1B-Instruct
  • Dataset: RZ412/PokerBench
  • Method: TinyLoRA fine-tuning with unsloth-mlx
  • LoRA config: r=256 (svd), u=1024, target_modules=[q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
  • Training data: 50k preflop + 50k postflop samples
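The hyperparameters above can be collected into a plain config mapping. This is an illustrative sketch only; the actual TinyLoRA / unsloth-mlx configuration API may use different names or structure.

```python
# Hypothetical representation of the LoRA config listed above.
# Field names mirror the bullet list, not a confirmed library API.
lora_config = {
    "r": 256,            # rank, SVD-initialized
    "u": 1024,           # TinyLoRA-specific parameter from the card
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}
```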

Performance

Evaluated on PokerBench test sets:

             Preflop   Postflop
Base Model   7%        13%
Fine-tuned   36%       60%
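At inference time, a game state is presented to the instruct model as a chat prompt and the model replies with a single action. The sketch below shows one plausible way to assemble such a prompt; the system/user wording is illustrative and is not the exact PokerBench template.

```python
def build_poker_prompt(game_state: str) -> list[dict]:
    """Build a chat-format prompt for a poker decision.

    The instruction wording here is an assumption for illustration,
    not the verbatim prompt used in training.
    """
    return [
        {
            "role": "system",
            "content": (
                "You are a poker expert. Given the game state, reply "
                "with a single action: fold, check, call, or raise "
                "with an amount."
            ),
        },
        {"role": "user", "content": game_state},
    ]


# Example game state (hypothetical, for illustration only)
messages = build_poker_prompt(
    "You are in the big blind with Ah Kd. The button raises to 3bb. "
    "Action is on you."
)
```

The resulting `messages` list can be passed to the tokenizer's chat template before generation with the base model plus the adapter.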