Reference: *QKSAN: A Quantum Kernel Self-Attention Network* (arXiv:2308.13422)
Q-TensorFormer is a hybrid quantum-tensor model that adaptively compresses itself using entanglement entropy, achieving major efficiency gains with minimal performance loss.
Claim: 50–70% parameter reduction at comparable accuracy (at most a small drop), with fewer compute operations and lower latency.
- Tensor Compression (Efficiency)
- Quantum Feature Encoding (Expressivity)
- Entanglement-Guided Rank Adaptation (Novelty)
`r = r_min + α · S(ρ)` — tensor ranks adjust based on quantum state entropy.

Components:

- `TTFactorizedLinear`: Tensor-Train compressed linear layers
- `QuantumFeatureEncoder`: PennyLane angle encoding with TorchLayer
- `QuantumKernelAttention`: Quantum kernel self-attention (QKSAN-style)
- `SelectiveQuantumRouter`: Only "hard" tokens go to the quantum circuit
- `RankScheduler`: Entanglement-guided dynamic rank adjustment

| Metric | Baseline | Q-TensorFormer | Reduction |
|---|---|---|---|
| Parameters | 10,764,288 | 1,325,102 | 8.12x |
| Memory | ~42 MB | ~5 MB | 8.12x |
| Compression | 1.00x | 8.12x | ✓ |
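To see where tensor-train compression ratios like these come from, here is a minimal parameter-count sketch. The factor shapes, rank, and function names are illustrative assumptions, not the repository's actual `TTFactorizedLinear` internals; the whole-model ratio (8.12x above) is lower than a single layer's ratio because embeddings and other layers are not factorized.

```python
# Hypothetical sketch: parameters of a TT-factorized linear layer vs. dense.
# Core k has shape (r_{k-1}, n_k, m_k, r_k) with boundary ranks r_0 = r_d = 1.

def tt_param_count(in_factors, out_factors, rank):
    """Total parameters across the d tensor-train cores."""
    d = len(in_factors)
    ranks = [1] + [rank] * (d - 1) + [1]
    return sum(ranks[k] * in_factors[k] * out_factors[k] * ranks[k + 1]
               for k in range(d))

def dense_param_count(in_factors, out_factors):
    """Parameters of the equivalent dense weight matrix."""
    d_in = d_out = 1
    for n in in_factors:
        d_in *= n
    for m in out_factors:
        d_out *= m
    return d_in * d_out

# A 512 x 512 layer factorized as (8, 8, 8) x (8, 8, 8) with TT rank 4:
dense = dense_param_count((8, 8, 8), (8, 8, 8))    # 262,144
tt = tt_param_count((8, 8, 8), (8, 8, 8), rank=4)  # 256 + 1024 + 256 = 1,536
print(f"layer compression: {dense / tt:.1f}x")     # ~170.7x for this one layer
```

Raising `tt_rank` increases expressivity at the cost of parameters, which is exactly the dial the entanglement-guided rank rule turns.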
```python
from qtensorformer import QTensorFormer, ModelConfig

config = ModelConfig(
    vocab_size=10000,
    hidden_dim=128,
    n_layers=3,
    tt_rank=4,
    n_qubits=4,
    use_quantum_attention=True,
    use_adaptive_rank=True,
)

model = QTensorFormer(config)
logits, loss, stats = model(input_ids, labels=labels)
```
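The `SelectiveQuantumRouter` sends only "hard" tokens to the quantum circuit. Its actual routing criterion is not documented here; the sketch below assumes a predictive-entropy threshold (function name, threshold, and sizes are all illustrative), showing how a mask can keep most tokens on the cheap classical path.

```python
import numpy as np

# Hypothetical routing sketch: a token is "hard" if its predictive
# entropy (in bits) exceeds a threshold; only those tokens would be
# dispatched to the expensive quantum attention path.

def route_hard_tokens(token_logits, threshold=2.0):
    """Return a boolean mask: True = route this token to the quantum circuit."""
    z = token_logits - token_logits.max(axis=-1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    entropy = -(p * np.log2(np.clip(p, 1e-12, 1.0))).sum(axis=-1)
    return entropy > threshold

rng = np.random.default_rng(0)
logits = rng.normal(scale=3.0, size=(128, 16))  # 128 tokens, 16-way logits
mask = route_hard_tokens(logits)
print(f"classical fallback: {1.0 - mask.mean():.0%} of tokens")
```

A higher threshold trades quantum-circuit calls for routing savings, which is where a figure like the 80% savings reported below would come from.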
```bibtex
@misc{qtensorformer2025,
  title={Q-TensorFormer: Quantum-Enhanced Tensor Network LLM Compression},
  author={Q-TensorFormer Team},
  year={2025},
  note={Hybrid quantum-tensor model with entanglement-guided compression}
}
```
| Metric | Baseline (Dense) | Q-TensorFormer |
|---|---|---|
| Parameters | 1,554,570 | 793,882 |
| Compression | 1.00x | 2.0x |
| BlockTT Active | — | ✓ |
| Adaptive Rank Range | — | 2–3 (mean: 3.0) |
| Entanglement Range | — | 0.855–1.666 |
| Quantum Routing Savings | — | 80% |
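The adaptive-rank rule `r = r_min + α · S(ρ)` can be sketched numerically. The values of `r_min`, `r_max`, `α`, and the two-qubit example state below are illustrative assumptions, not the repository's settings; `S(ρ)` is the von Neumann entropy of a subsystem's reduced density matrix.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -sum_i p_i log2 p_i over the eigenvalues of rho."""
    eigvals = np.linalg.eigvalsh(rho)
    eigvals = eigvals[eigvals > 1e-12]  # drop numerical zeros
    return float(-(eigvals * np.log2(eigvals)).sum())

def adaptive_rank(rho, r_min=2, r_max=8, alpha=2.0):
    """Map subsystem entropy to an integer TT rank, clipped to [r_min, r_max]."""
    r = r_min + alpha * von_neumann_entropy(rho)
    return int(np.clip(round(r), r_min, r_max))

# Reduced density matrix of one qubit of a Bell state: maximally mixed,
# so S(rho) = 1 bit and the assigned rank rises above r_min.
rho_bell = np.eye(2) / 2
print(von_neumann_entropy(rho_bell))  # 1.0
print(adaptive_rank(rho_bell))        # 2 + 2.0 * 1.0 = 4
```

Under this rule, highly entangled states (high entropy, like the 0.855–1.666 range in the table) receive larger ranks, while near-product states are compressed more aggressively.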
The model uses K2 Think (MBZUAI-IFM/K2-Think-v2) to generate natural language explanations for every compression and routing decision, making tensor network compression transparent and auditable.
This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.