TexasHoldEm-Llama-3.2-1B-Instruct

Fine-tuned Llama 3.2 1B Instruct model for Texas Hold'em poker decisions.

Training

  • Base model: meta-llama/Llama-3.2-1B-Instruct
  • Dataset: RZ412/PokerBench
  • Method: LoRA fine-tuning with unsloth-mlx
  • LoRA config: r=32, alpha=64, target_modules=[q_proj, k_proj, v_proj, o_proj]
  • Training data: 60k preflop + 50k postflop samples
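Training was done with unsloth-mlx; as a point of reference, the same adapter hyperparameters expressed as a Hugging Face PEFT config would look like the sketch below (the dropout and bias settings are assumptions, since the card does not state them):

```python
# Equivalent PEFT LoRA configuration for the hyperparameters listed above.
# Note: actual training used unsloth-mlx; this is an illustrative sketch.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                       # LoRA rank
    lora_alpha=64,              # scaling factor (alpha / r = 2.0)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    lora_dropout=0.0,           # assumed; not stated in the card
    bias="none",                # assumed; not stated in the card
    task_type="CAUSAL_LM",
)
```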

Performance

Evaluated on PokerBench test sets:

              Preflop   Postflop
Base Model    7%        13%
Fine-tuned    83%       55%
Improvement   +76 pp    +42 pp
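To query the model you serialize a game state into an instruction prompt and parse the action it returns. The helper below is a minimal, runnable sketch of that serialization step; the exact PokerBench prompt template is not reproduced here, so the field names and wording are illustrative assumptions:

```python
# Hypothetical prompt builder for a Texas Hold'em decision query.
# The real PokerBench template may differ; this only illustrates the idea.
def build_prompt(position, hole_cards, board, pot_bb, to_call_bb):
    lines = [
        "You are playing Texas Hold'em. Decide the optimal action.",
        f"Position: {position}",
        f"Hole cards: {' '.join(hole_cards)}",
        f"Board: {' '.join(board) if board else '(preflop)'}",
        f"Pot: {pot_bb}bb, amount to call: {to_call_bb}bb",
        "Answer with one of: fold, check, call, bet <size>, raise <size>.",
    ]
    return "\n".join(lines)

prompt = build_prompt("BTN", ["Ah", "Kd"], [], 3, 2)
print(prompt)
```

The resulting string would be passed as the user turn of a Llama 3.2 chat conversation, and the model's reply parsed for the chosen action.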
Model repository: neopolita/TexasHoldEm-Llama-3.2-1B-Instruct (LoRA adapter for meta-llama/Llama-3.2-1B-Instruct)