---
license: apache-2.0
library_name: q-tensorformer
tags:
- tensor-networks
- quantum-machine-learning
- model-compression
- transformer
- efficient-deep-learning
- nisq
- pennylane
pipeline_tag: text-generation
---

# Q-TensorFormer v3 — Model Card

## Model Details

**Q-TensorFormer** is a hybrid transformer that compresses feed-forward layers using **Tensor-Train (TT) decomposition** and enhances token representations via **PennyLane quantum circuits**, with **adaptive TT-rank scheduling** guided by attention entropy.

- **Architecture**: Quantum-Enhanced Tensor Network Transformer
- **Parameters**: Configurable (50K–50M range)
- **Compression ratio**: 1.5–3× vs. an equivalent dense transformer (see the parameter-count sketch after this list)
- **Quantum overhead**: <30% of tokens routed through quantum circuits (adjustable sparsity)
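
As a rough illustration of where the compression comes from, here is a minimal parameter-count sketch. It is not the library's API; the mode shapes and TT-ranks are illustrative assumptions for a 512×2048 feed-forward weight.

```python
# Parameter count of a dense FFN weight vs. a Tensor-Train (TT) factorization
# of the same weight. Shapes and ranks below are illustrative assumptions,
# not Q-TensorFormer's actual configuration.

def tt_matrix_params(in_modes, out_modes, ranks):
    """Parameters of a TT-matrix with cores of shape (r_{k-1}, m_k, n_k, r_k)."""
    assert len(in_modes) == len(out_modes) == len(ranks) - 1
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )

d_model, d_ff = 512, 2048                      # dense weight: 512 x 2048
in_modes, out_modes = (8, 8, 8), (8, 16, 16)   # 8*8*8 = 512, 8*16*16 = 2048
ranks = (1, 16, 16, 1)                         # boundary TT-ranks are always 1

dense_params = d_model * d_ff                  # 1,048,576
tt_params = tt_matrix_params(in_modes, out_modes, ranks)
print(f"dense: {dense_params:,}  TT: {tt_params:,}  ratio: {dense_params / tt_params:.1f}x")
```

The per-layer ratio printed here is far larger than the headline 1.5–3×, which is a whole-model figure: embeddings, attention projections, and other dense parts of the network remain uncompressed.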

## Core Mechanism

```
Attention entropy S(ρ) → normalize → RankScheduler → TT-rank r(layer)
```

Attention entropy (a classical proxy for quantum entanglement) measures per-token input complexity. Higher entropy signals more complex attention patterns, so the scheduler assigns a higher tensor rank; lower entropy means the layer is more compressible and the TT-rank can be reduced aggressively.
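
The sketch below shows one way such a scheduler could be written. It is a minimal illustration assuming a hypothetical `RankScheduler` class; the rank bounds and the log-T normalization are assumptions, not the package's documented behavior.

```python
import torch

class RankScheduler:
    """Hypothetical sketch: map attention entropy to a per-layer TT-rank."""

    def __init__(self, r_min: int = 2, r_max: int = 32):
        self.r_min, self.r_max = r_min, r_max

    def __call__(self, attn: torch.Tensor) -> int:
        # attn: (batch, heads, T, T) softmax attention weights.
        T = attn.shape[-1]
        # Shannon entropy of each attention row (in nats).
        entropy = -(attn * (attn + 1e-12).log()).sum(dim=-1)
        # Normalize by log T, the maximum row entropy -> score in [0, 1].
        score = (entropy / torch.log(torch.tensor(float(T)))).mean().item()
        # Low entropy -> compressible -> small rank; high entropy -> large rank.
        return round(self.r_min + score * (self.r_max - self.r_min))
```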

**Budget-constrained mode**: Set `max_params`, `max_latency_ms`, or `max_energy_per_query` and the model auto-adjusts ranks to stay within budget.
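
A usage sketch of what budget-constrained construction might look like; the `QTensorFormer` class and its keyword arguments are assumptions inferred from the option names above, not a documented API.

```python
# Hypothetical usage; QTensorFormer and these keyword arguments are
# assumptions based on the option names above, not a documented API.
from q_tensorformer import QTensorFormer

model = QTensorFormer(
    d_model=512,
    n_layers=6,
    max_params=5_000_000,   # stay under 5M parameters (edge budget)
    max_latency_ms=20,      # per-query latency budget
)
# The rank scheduler would then lower TT-ranks until every budget holds.
```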

## Intended Uses

| Use Case | Model Size | Expected Metric |
|----------|-----------|----------------|
| Edge NLP (mobile, on-device) | <5M params | PPL within 5% of dense baseline |
| Enterprise model compression | 10–50M params | 2× param reduction at equal accuracy |
| Multilingual low-resource | <10M params | Better representation per parameter |
| Research: quantum-classical hybrid | Small | Demonstrate quantum value in NLP |

## Limitations

- **NISQ-era only**: Quantum circuits are simulated with PennyLane's `default.qubit` backend; real quantum hardware is not required (a minimal circuit sketch follows this list).
- **Small to medium models**: Designed for embedding dimensions ≤512; not intended for GPT-scale (100M+ parameter) models.
- **Training data**: Optimized for WikiText-2 and similar text corpora.
- **Quantum advantage**: We claim efficiency (fewer parameters at the same performance), not "quantum advantage" in the broad sense.
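
For concreteness, this is the kind of simulated token-enhancement circuit implied above; the gate layout, qubit count, and weight shapes are illustrative assumptions, not Q-TensorFormer's actual circuit.

```python
# Hypothetical sketch of a simulated token-enhancement circuit; the gate
# layout and qubit count are assumptions, not Q-TensorFormer's actual design.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)  # simulator only, no QPU

@qml.qnode(dev)
def token_circuit(features, weights):
    # Encode a slice of the token embedding as rotation angles.
    qml.AngleEmbedding(features, wires=range(n_qubits))
    # Trainable entangling layers, learned jointly with the classical model.
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    # One expectation value per qubit becomes the enhanced feature vector.
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

weights = np.random.uniform(0, np.pi, size=(2, n_qubits, 3))  # 2 layers
print(token_circuit(np.array([0.1, 0.2, 0.3, 0.4]), weights))
```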

## Citation

```bibtex
@software{q_tensorformer2026,
  author  = {Premchan369},
  title   = {Q-TensorFormer: Quantum-Enhanced Tensor Network LLM Compression Engine},
  url     = {https://huggingface.co/Premchan369/q-tensorformer},
  version = {3.0.0},
  year    = {2026},
}
```

## References

- Tensor Networks: Cichocki et al., "Tensor Networks for Dimensionality Reduction and Large-scale Optimization" (arXiv:2007.02779)
- Quantum Transformers: Quixer (arXiv:2406.04305), QKSAN (arXiv:2308.13422)
- PennyLane: Bergholm et al., "PennyLane: Automatic differentiation of hybrid quantum-classical computations" (arXiv:1811.04968)