---
license: apache-2.0
library_name: q-tensorformer
tags:
  - tensor-networks
  - quantum-machine-learning
  - model-compression
  - transformer
  - efficient-deep-learning
  - nisq
  - pennylane
  - k2-think
  - explainable-ai
pipeline_tag: text-generation
---

# Q-TensorFormer v3 — Model Card

## Model Details

**Q-TensorFormer** is a hybrid transformer that compresses feed-forward layers using **Tensor-Train (TT) decomposition** and enhances token representations via **PennyLane quantum circuits**, with **adaptive TT-rank scheduling** guided by attention entropy.

- **Architecture**: Quantum-Enhanced Tensor Network Transformer
- **Parameters**: Configurable (50K–50M range)
- **Compression ratio**: 1.5–3× vs. an equivalent dense transformer
- **Quantum overhead**: <30% of tokens routed through quantum circuits (adjustable sparsity)
- **K2 Think v2 Integration**: Explainable AI for every compression and routing decision

## Core Mechanism

```
Attention entropy S(ρ) → norm → RankScheduler → TT-rank r(layer)
```

The attention entropy (a classical proxy for quantum entanglement) measures input complexity per token. Higher entropy → more complex patterns → higher tensor rank. Lower entropy → more compressible → aggressive TT rank reduction.
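
A minimal sketch of this mapping, assuming a PyTorch backend (the function names, entropy normalization, and rank range below are illustrative, not the released implementation):

```python
import torch

def attention_entropy(attn: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of the attention distribution, per query token.

    attn: (batch, heads, seq, seq) attention probabilities.
    Returns: (batch, seq) entropy averaged over heads.
    """
    ent = -(attn * (attn + 1e-12).log()).sum(dim=-1)   # (batch, heads, seq)
    return ent.mean(dim=1)

def schedule_tt_rank(entropy: torch.Tensor, r_min: int = 2, r_max: int = 16) -> int:
    """Map normalized entropy to a TT-rank: higher entropy -> higher rank."""
    seq_len = entropy.shape[-1]
    max_ent = torch.log(torch.tensor(float(seq_len)))   # entropy of a uniform distribution
    s_norm = (entropy.mean() / max_ent).clamp(0.0, 1.0)
    return int(round(r_min + float(s_norm) * (r_max - r_min)))
```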

**Budget-constrained mode**: Set `max_params`, `max_latency_ms`, or `max_energy_per_query` and the model auto-adjusts ranks to stay within budget.
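
In budget-constrained mode the scheduled rank is then clamped so the factorized layer fits the budget. A rough sketch for the `max_params` case (parameter counting simplified to a single TT layer with a uniform rank; names are illustrative):

```python
def tt_param_count(dims: list[int], rank: int) -> int:
    """Parameters of a TT factorization with cores of shape (r_{k-1}, n_k, r_k)."""
    ranks = [1] + [rank] * (len(dims) - 1) + [1]
    return sum(ranks[k] * dims[k] * ranks[k + 1] for k in range(len(dims)))

def clamp_rank_to_budget(rank: int, dims: list[int], max_params: int) -> int:
    """Reduce the scheduled rank until the TT layer fits under max_params."""
    while rank > 1 and tt_param_count(dims, rank) > max_params:
        rank -= 1
    return rank

# Example: a feed-forward weight reshaped into six modes of size 16,
# scheduled at rank 16, clamped to fit a 10K-parameter budget.
print(clamp_rank_to_budget(rank=16, dims=[16] * 6, max_params=10_000))
```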

## K2 Think v2 Integration (Explainable AI)

Q-TensorFormer integrates with **K2 Think v2** (MBZUAI-IFM/K2-Think-v2) to provide natural language explanations for every compression and routing decision:

| Component | What K2 Think Explains |
|-----------|----------------------|
| **RankScheduler** | Why entropy X → rank Y ("Token 47 has high attention dispersion, needs more capacity") |
| **QuantumRouter** | Why token went to quantum ("This embedding is near decision boundary, quantum feature map may help") |
| **Budget Tracker** | How budget constraints affected model size ("Reduced rank to 4 to stay under 2M params") |
| **Compression Report** | Full audit trail of per-layer, per-token compression choices |

**Live Demo**: [AlphaForge x K2 Think V2](https://huggingface.co/spaces/Premchan369/alphaforge-k2think)
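
The card does not pin down the integration API, but conceptually each decision record is rendered into a prompt and passed to K2 Think v2 for a natural-language rationale. A hypothetical sketch (assuming the model can be served through a standard `text-generation` pipeline; in practice a hosted endpoint may be used instead):

```python
from transformers import pipeline

# Hypothetical: render a RankScheduler decision into a prompt for K2 Think v2.
explainer = pipeline("text-generation", model="MBZUAI-IFM/K2-Think-v2")

decision = {
    "component": "RankScheduler",
    "layer": 3,
    "token_index": 47,
    "attention_entropy": 0.82,   # normalized to [0, 1]
    "chosen_tt_rank": 12,
    "rank_range": (2, 16),
}

prompt = (
    "Explain the following model-compression decision to an auditor "
    f"in one short paragraph:\n{decision}"
)
print(explainer(prompt, max_new_tokens=128)[0]["generated_text"])
```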

## Intended Uses

| Use Case | Model Size | Expected Metric |
|----------|-----------|----------------|
| Edge NLP (mobile, on-device) | <5M params | PPL within 5% of dense baseline |
| Enterprise model compression | 10–50M params | 2× param reduction at equal accuracy |
| Multilingual low-resource | <10M params | Better representation per parameter |
| Research: quantum-classical hybrid | Small | Demonstrate quantum value in NLP |
| Financial NLP (with K2 Think) | Any | Explainable compression for regulated industries |

## Limitations

- **Simulated, NISQ-scale circuits only**: All quantum circuits run on the PennyLane `default.qubit` simulator, so no quantum hardware is required (a minimal circuit sketch follows this list).
- **Small to medium models**: Designed for embedding dimensions ≤512. Not for GPT-scale (100M+) models.
- **Training data**: Optimized for WikiText-2 and similar text corpora.
- **Quantum advantage**: We claim efficiency (fewer params for same performance), not "quantum advantage" in the broad sense.
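
For illustration only, a per-token quantum feature map of the kind such a model might use could look like the sketch below (qubit count, embedding, and entangler layout are assumptions, not the exact circuit used by Q-TensorFormer):

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def token_feature_map(features, weights):
    """Encode a projected token embedding as rotation angles, entangle,
    and read out Pauli-Z expectations as the quantum-enhanced features."""
    qml.AngleEmbedding(features, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

features = np.random.uniform(0, np.pi, n_qubits)           # slice of a token embedding
weights = np.random.uniform(0, 2 * np.pi, (2, n_qubits))   # two entangling layers
print(token_feature_map(features, weights))
```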

## Citation

```bibtex
@software{q_tensorformer2026,
  author = {Premchan369},
  title = {Q-TensorFormer: Quantum-Enhanced Tensor Network LLM Compression Engine},
  url = {https://huggingface.co/Premchan369/q-tensorformer},
  version = {3.0.0},
  year = {2026},
}
```

## References

- Tensor Networks: Cichocki et al., "Tensor Networks for Dimensionality Reduction and Large-scale Optimization" (arXiv:2007.02779)
- Quantum Transformers: Quixer (arXiv:2406.04305), QKSAN (arXiv:2308.13422)
- PennyLane: Bergholm et al., "PennyLane: Automatic differentiation of hybrid quantum-classical computations" (arXiv:1811.04968)
- K2 Think v2: MBZUAI-IFM/K2-Think-v2, Build with K2 Think V2 Challenge

## Related Projects

- [AlphaForge x K2 Think V2](https://huggingface.co/spaces/Premchan369/alphaforge-k2think) — Live quant trading demo with K2 Think v2 reasoning
- [AlphaForge Platform](https://huggingface.co/Premchan369/alphaforge-quant-system) — 25-module open-source quant system