PentaNet β€” Native Pentanary Quantization for LLMs

Author: Zorko Β· Independent Researcher Β· zorko.xyz

PentaNet extends extreme quantization beyond BitNet's ternary {-1, 0, +1} to pentanary {-2, -1, 0, +1, +2}, achieving a 6.4% perplexity improvement on WikiText-103 while preserving zero-multiplier inference (additions + bit-shifts only).
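The repository's actual quantizer lives in `pentanet_layer.py`; as a rough illustration of the idea, here is a minimal sketch of pentanary quantization plus a shift-add dot product. The absmean scaling is an assumption borrowed from BitNet b1.58, and the function names (`pentanary_quantize`, `shift_add_dot`) are illustrative, not PentaNet's real API:

```python
import torch

def pentanary_quantize(w: torch.Tensor, eps: float = 1e-5):
    """Quantize weights to {-2, -1, 0, +1, +2} with a per-tensor scale.

    Scaling by the mean absolute value mirrors BitNet b1.58's absmean
    quantizer; PentaNet's actual scheme may differ."""
    scale = w.abs().mean().clamp(min=eps)
    q = (w / scale).round().clamp(-2, 2)
    return q, scale

def shift_add_dot(x_int, w_pent):
    """Dot product with pentanary weights using only adds and bit-shifts.

    x_int: integer activations; w_pent: weights in {-2, -1, 0, 1, 2}.
    The factor of 2 becomes a single left shift, so no multiplier is needed."""
    acc = 0
    for x, w in zip(x_int, w_pent):
        if w == 0:
            continue  # zero weights contribute nothing
        term = x << 1 if abs(w) == 2 else x  # |w| == 2 -> x * 2 as a shift
        acc += term if w > 0 else -term
    return acc
```

This is why the ±2 levels keep inference multiplier-free: every weight application is either skipped, an add/subtract, or an add/subtract of a shifted activation.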

Key Results

| Model             | Mean PPL | Std   | Seeds         |
|-------------------|----------|-------|---------------|
| PentaNet {-2..+2} | 180.32   | ±2.09 | 42, 1337, 2026 |
| BitNet {-1..+1}   | 192.63   | ±3.52 | 42, 1337, 2026 |
  • 124M parameter GPT-2-style transformer
  • WikiText-103 (~100M tokens)
  • Trained on a single RTX 5080 (16 GB)
  • No collapse: ±2 buckets maintain ~11% occupancy through all 10k iterations
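The ±2-occupancy claim above can be verified directly from a quantized weight tensor. A small sketch (the helper name and the example tensor are illustrative, not part of the repo's scripts):

```python
import torch

def level_occupancy(q: torch.Tensor) -> dict:
    """Return the fraction of weights sitting at each pentanary level.

    q is assumed to already hold values in {-2, -1, 0, +1, +2}."""
    n = q.numel()
    return {lvl: (q == lvl).sum().item() / n for lvl in range(-2, 3)}

# Example: occupancy of a small hand-made pentanary tensor
q = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0, 0.0])
occ = level_occupancy(q)
print(occ)  # fractions per level, summing to 1.0
```

Logging `occ[-2] + occ[2]` per iteration is one simple way to confirm the outer buckets never collapse during training.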

Text Generation Example (124M params, 20 min of training)

(Prompt: "The history of the internet began with")

⏳ Generating with BitNet (Ternary {-1, 0, 1}) ...
🤖 BITNET S42: The history of the internet began with the <unk> to be a way , <unk> , which was the first recent of the <unk> , and the city and the <unk> . The French army was the first to be the first @-@ scale

⏳ Generating with PentaNet (Pentanary {-2, -1, 0, 1, 2}) ...
🤖 PENTANET S42: The history of the internet began with the original level of the other . The term of the original world was to the public court of the United States in July 2013 in February 15 , 2015 , as well as the team of $ 2 @,@ 000 . In the same year , the

Note how BitNet suffers from vocabulary collapse (frequent <unk> tokens) and repetitive stuttering, while PentaNet generates fluent, grammatical Wikipedia-style sentences (factually incorrect, as expected at this model size).

Project Structure

├── README.md
├── PentaNet_NeurIPS_Draft.md       # Full technical report (markdown)
├── train_pentagpt.py               # Core training script (PentaNet + BitNet)
├── pentanet_layer.py               # PentaLinear layer implementation
├── prepare_data.py                 # WikiText-103 data preparation
├── run_benchmark.py                # 3-seed benchmark orchestrator
├── paper/
│   ├── PentaNet_Technical_Report.pdf
│   └── figures/
├── scripts/                        # Visualization & utilities
│   ├── compile_pdf.py
│   ├── export_figures.py
│   ├── generate_dashboard.py
│   └── plot_results.py
└── models/                         # JSON logs + model checkpoints
    ├── pentanet_large_s{42,1337,2026}_results.json
    └── bitnet_large_s{42,1337,2026}_results.json

Quick Start

# 1. Setup
python -m venv .venv-gpu && source .venv-gpu/bin/activate
pip install torch transformers datasets

# 2. Prepare data
python prepare_data.py

# 3. Run full benchmark (3 seeds × 2 architectures, ~2h15 on RTX 5080)
python run_benchmark.py

# 4. Visualize results
python scripts/generate_dashboard.py   # Interactive HTML dashboard
python scripts/export_figures.py       # Publication-quality PNG/PDF
python scripts/compile_pdf.py          # Compile full paper PDF

Model Weights (HuggingFace)

Pre-trained checkpoints are available on HuggingFace:

🤗 kyworn/pentanet-124m

Citation

@techreport{zorko2026pentanet,
  title     = {PentaNet: Native Pentanary Quantization for Large Language Models},
  author    = {Zorko},
  year      = {2026},
  url       = {https://zorko.xyz}
}

License

MIT

