---
license: apache-2.0
library_name: q-tensorformer
tags:
- tensor-networks
- quantum-machine-learning
- model-compression
- transformer
- efficient-deep-learning
- nisq
- pennylane
- k2-think
- explainable-ai
pipeline_tag: text-generation
---

# Q-TensorFormer v3 — Model Card

## Model Details
Q-TensorFormer is a hybrid transformer that compresses feed-forward layers using Tensor-Train (TT) decomposition and enhances token representations via PennyLane quantum circuits, with adaptive TT-rank scheduling guided by attention entropy.
- Architecture: Quantum-Enhanced Tensor Network Transformer
- Parameters: Configurable (50K–50M range)
- Compression ratio: 1.5–3× vs. equivalent dense transformer
- Quantum overhead: <30% of tokens routed through quantum circuits (adjustable sparsity; see the sketch after this list)
- K2 Think v2 Integration: Explainable AI for every compression and routing decision
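
This card does not specify the exact circuit applied to routed tokens, so the following is only a minimal sketch of the kind of PennyLane feature map such a router could call, using the simulated `default.qubit` device noted under Limitations. The qubit count, angle embedding, and entangling-layer choice are all assumptions, not the shipped circuit.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4  # assumption: the card does not state a qubit count
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def quantum_feature_map(token_features, weights):
    # Encode a down-projected token embedding as rotation angles,
    # entangle the qubits, and read out one expectation per wire.
    qml.AngleEmbedding(token_features, wires=range(n_qubits))
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

# Example: one token's 4-dim projected embedding through the circuit.
weights = np.random.uniform(0, np.pi, size=(2, n_qubits, 3))
print(quantum_feature_map(np.array([0.1, 0.4, 0.9, 0.2]), weights))
```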
## Core Mechanism
`Attention entropy S(ρ) → normalize → RankScheduler → TT-rank r(layer)`
The attention entropy (a classical proxy for quantum entanglement) measures input complexity per token. Higher entropy → more complex patterns → higher TT-rank; lower entropy → more compressible → more aggressive TT-rank reduction.
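
As a concrete illustration, here is a minimal sketch of what such an entropy-driven scheduler could look like. The `RankScheduler` name comes from the diagram above, but the Shannon-entropy computation, the log-normalization, the rank bounds, and the linear interpolation are illustrative assumptions, not the shipped implementation.

```python
import torch

class RankScheduler:
    """Entropy -> TT-rank mapping (bounds and interpolation are assumptions)."""

    def __init__(self, r_min: int = 2, r_max: int = 16):
        self.r_min = r_min
        self.r_max = r_max

    def attention_entropy(self, attn: torch.Tensor) -> torch.Tensor:
        # attn: (batch, heads, query, key) attention probabilities.
        # Shannon entropy of each query's attention distribution,
        # averaged over heads, gives one complexity score per token.
        ent = -(attn * (attn + 1e-12).log()).sum(dim=-1)  # (batch, heads, query)
        return ent.mean(dim=1)                            # (batch, query)

    def ranks(self, attn: torch.Tensor) -> torch.Tensor:
        # Normalize entropy by its maximum (log of the key length),
        # then interpolate linearly: low entropy -> low rank -> more
        # aggressive compression; high entropy -> more capacity.
        frac = self.attention_entropy(attn) / torch.log(
            torch.tensor(float(attn.shape[-1]))
        )
        frac = frac.clamp(0.0, 1.0)
        return (self.r_min + frac * (self.r_max - self.r_min)).round().long()
```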
Budget-constrained mode: Set `max_params`, `max_latency_ms`, or `max_energy_per_query` and the model auto-adjusts TT-ranks to stay within budget.
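
The budget keys above come from the card, but how ranks are reduced is not specified, so the greedy heuristic below is an assumption. It sketches the parameter-budget case: shrinking TT-ranks until a factorized layer fits under `max_params`.

```python
def tt_param_count(dims: list[int], ranks: list[int]) -> int:
    # A TT-factorized weight with mode sizes d_1..d_n and ranks r_1..r_{n-1}
    # stores cores of shape r_{k-1} x d_k x r_k (boundary ranks fixed at 1).
    r = [1] + list(ranks) + [1]
    return sum(r[k] * dims[k] * r[k + 1] for k in range(len(dims)))

def fit_ranks_to_budget(dims: list[int], ranks: list[int], max_params: int) -> list[int]:
    # Greedily shrink the largest rank until the layer fits the budget.
    ranks = list(ranks)
    while tt_param_count(dims, ranks) > max_params and max(ranks) > 1:
        ranks[ranks.index(max(ranks))] -= 1
    return ranks

# Example: a 4096-entry FFN weight reshaped to 8x8x8x8, starting at rank 16.
print(fit_ranks_to_budget([8, 8, 8, 8], [16, 16, 16], max_params=3000))
```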
## K2 Think v2 Integration (Explainable AI)
Q-TensorFormer integrates with K2 Think v2 (MBZUAI-IFM/K2-Think-v2) to provide natural language explanations for every compression and routing decision:
| Component | What K2 Think Explains |
|---|---|
| RankScheduler | Why entropy X → rank Y ("Token 47 has high attention dispersion, needs more capacity") |
| QuantumRouter | Why token went to quantum ("This embedding is near decision boundary, quantum feature map may help") |
| Budget Tracker | How budget constraints affected model size ("Reduced rank to 4 to stay under 2M params") |
| Compression Report | Full audit trail of per-layer, per-token compression choices |
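
The exact explanation interface is not documented in this card, so the snippet below only sketches the shape a per-decision entry in the compression report's audit trail might take; every field name is a hypothetical placeholder, and the rationale string mirrors the RankScheduler example from the table above.

```python
# Hypothetical audit-trail entry (all field names are assumed, not the
# actual q-tensorformer schema).
explanation = {
    "component": "RankScheduler",
    "layer": 3,
    "token_index": 47,
    "attention_entropy": 2.31,
    "chosen_tt_rank": 12,
    "k2_think_rationale": "Token 47 has high attention dispersion, "
                          "needs more capacity",
}
```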
**Live Demo:** AlphaForge x K2 Think V2
## Intended Uses
| Use Case | Model Size | Expected Outcome |
|---|---|---|
| Edge NLP (mobile, on-device) | <5M params | PPL within 5% of dense baseline |
| Enterprise model compression | 10–50M params | 2× param reduction at equal accuracy |
| Multilingual low-resource | <10M params | Better representation per parameter |
| Research: quantum-classical hybrid | Small | Demonstrate quantum value in NLP |
| Financial NLP (with K2 Think) | Any | Explainable compression for regulated industries |
## Limitations
- NISQ-era only: Quantum circuits are simulated (PennyLane `default.qubit`); real quantum hardware is not required.
- Small to medium models: Designed for embedding dimensions ≤512; not intended for GPT-scale (100M+ parameter) models.
- Training data: Optimized for WikiText-2 and similar text corpora.
- Quantum advantage: We claim efficiency (fewer parameters for the same performance), not "quantum advantage" in the broad sense.
## Citation
```bibtex
@software{q_tensorformer2026,
  author  = {Premchan369},
  title   = {Q-TensorFormer: Quantum-Enhanced Tensor Network LLM Compression Engine},
  url     = {https://huggingface.co/Premchan369/q-tensorformer},
  version = {3.0.0},
  year    = {2026},
}
```
## References
- Tensor Networks: Cichocki et al., "Tensor Networks for Dimensionality Reduction and Large-scale Optimization" (arXiv:2007.02779)
- Quantum Transformers: Quixer (arXiv:2406.04305), QKSAN (arXiv:2308.13422)
- PennyLane: Bergholm et al., "PennyLane: Automatic differentiation of hybrid quantum-classical computations" (arXiv:1811.04968)
- K2 Think v2: MBZUAI-IFM/K2-Think-v2, Build with K2 Think V2 Challenge
## Related Projects
- AlphaForge x K2 Think V2 — Live quant trading demo with K2 Think v2 reasoning
- AlphaForge Platform — 25-module open-source quant system