Update MODEL_CARD: Add K2 Think v2 integration section, explainable AI features
MODEL_CARD.md (+23, −0)
```diff
@@ -9,6 +9,8 @@ tags:
 - efficient-deep-learning
 - nisq
 - pennylane
+- k2-think
+- explainable-ai
 pipeline_tag: text-generation
 ---
 
@@ -22,6 +24,7 @@ pipeline_tag: text-generation
 - **Parameters**: Configurable (50K–50M range)
 - **Compression ratio**: 1.5–3× vs. equivalent dense transformer
 - **Quantum overhead**: <30% of tokens routed through quantum (adjustable sparsity)
+- **K2 Think v2 Integration**: Explainable AI for every compression and routing decision
 
 ## Core Mechanism
 
```
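The "<30% of tokens routed through quantum (adjustable sparsity)" line above implies a top-k routing rule: pick the tokens that most need the quantum path, capped at the sparsity budget. A minimal sketch of one way that could work, using attention entropy as the routing score (the `route_tokens` helper and its signature are hypothetical illustrations, not Q-TensorFormer's actual API):

```python
import numpy as np

def route_tokens(entropy: np.ndarray, sparsity: float = 0.3) -> np.ndarray:
    """Select at most a `sparsity` fraction of tokens (those with the
    highest attention entropy) for the quantum path; the rest stay on
    the classical path. Returns a boolean mask over the token axis."""
    n = entropy.shape[0]
    k = int(np.floor(sparsity * n))  # quantum budget: at most 30% of tokens by default
    mask = np.zeros(n, dtype=bool)
    if k > 0:
        # argpartition finds the k largest-entropy tokens in O(n)
        top = np.argpartition(entropy, -k)[-k:]
        mask[top] = True
    return mask

# Example: 10 tokens at 30% sparsity -> exactly 3 routed to quantum
rng = np.random.default_rng(0)
mask = route_tokens(rng.random(10), sparsity=0.3)
print(mask.sum())  # 3
```

Because `sparsity` is just a cap on `k`, the quantum overhead is adjustable at inference time without retraining, which matches the "adjustable sparsity" wording in the diff.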
```diff
@@ -33,6 +36,19 @@ The attention entropy (a classical proxy for quantum entanglement) measures inpu
 
 **Budget-constrained mode**: Set `max_params`, `max_latency_ms`, or `max_energy_per_query` and the model auto-adjusts ranks to stay within budget.
 
+## K2 Think v2 Integration (Explainable AI)
+
+Q-TensorFormer integrates with **K2 Think v2** (MBZUAI-IFM/K2-Think-v2) to provide natural language explanations for every compression and routing decision:
+
+| Component | What K2 Think Explains |
+|-----------|------------------------|
+| **RankScheduler** | Why entropy X → rank Y ("Token 47 has high attention dispersion, needs more capacity") |
+| **QuantumRouter** | Why a token went to quantum ("This embedding is near the decision boundary; the quantum feature map may help") |
+| **Budget Tracker** | How budget constraints affected model size ("Reduced rank to 4 to stay under 2M params") |
+| **Compression Report** | Full audit trail of per-layer, per-token compression choices |
+
+**Live Demo**: [AlphaForge x K2 Think V2](https://huggingface.co/spaces/Premchan369/alphaforge-k2think)
+
 ## Intended Uses
 
 | Use Case | Model Size | Expected Metric |
```
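The RankScheduler and budget-constrained mode above combine two steps: map attention entropy to a rank (more dispersion, more capacity), then shrink ranks until the model fits `max_params`. A sketch of that logic under assumed conventions — a fixed rank grid and a `2*d*r` parameter cost per rank-r factorization of a d×d weight — neither of which is specified in this diff:

```python
import numpy as np

RANKS = (2, 4, 8, 16)  # assumed rank grid, smallest to largest

def schedule_ranks(entropy: np.ndarray, d: int, max_params: int) -> list[int]:
    """Map each layer's attention entropy to a rank (higher entropy ->
    more capacity), then greedily shrink the largest rank until the
    total cost of the rank-r factorizations (2*d*r each) fits max_params."""
    # Normalize entropies to [0, 1) and quantize onto the rank grid.
    lo, hi = entropy.min(), entropy.max()
    norm = (entropy - lo) / (hi - lo + 1e-12)
    ranks = [RANKS[min(int(v * len(RANKS)), len(RANKS) - 1)] for v in norm]

    def total(rs):  # parameter count of all factorized layers
        return sum(2 * d * r for r in rs)

    # Budget pass: halve the largest rank until we fit (cf. "Reduced
    # rank to 4 to stay under 2M params" in the K2 Think table above).
    while total(ranks) > max_params:
        i = int(np.argmax(ranks))
        if ranks[i] <= RANKS[0]:
            break  # already at minimum rank everywhere
        ranks[i] //= 2
    return ranks

ranks = schedule_ranks(np.array([0.2, 0.9, 1.7, 3.1]), d=64, max_params=4000)
print(ranks, sum(2 * 64 * r for r in ranks))  # -> [2, 2, 8, 16] 3584
```

Each halving step in the budget pass is exactly the kind of discrete, loggable decision the K2 Think table describes, which is what makes a per-layer audit trail feasible.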
```diff
@@ -41,6 +57,7 @@ The attention entropy (a classical proxy for quantum entanglement) measures inpu
 | Enterprise model compression | 10–50M params | 2× param reduction at equal accuracy |
 | Multilingual low-resource | <10M params | Better representation per parameter |
 | Research: quantum-classical hybrid | Small | Demonstrate quantum value in NLP |
+| Financial NLP (with K2 Think) | Any | Explainable compression for regulated industries |
 
 ## Limitations
 
```
```diff
@@ -66,3 +83,9 @@ The attention entropy (a classical proxy for quantum entanglement) measures inpu
 - Tensor Networks: Cichocki et al., "Tensor Networks for Dimensionality Reduction and Large-scale Optimization" (arXiv:2007.02779)
 - Quantum Transformers: Quixer (arXiv:2406.04305), QKSAN (arXiv:2308.13422)
 - PennyLane: Bergholm et al., "PennyLane: Automatic differentiation of hybrid quantum-classical computations" (arXiv:1811.04968)
+- K2 Think v2: MBZUAI-IFM/K2-Think-v2, Build with K2 Think V2 Challenge
+
+## Related Projects
+
+- [AlphaForge x K2 Think V2](https://huggingface.co/spaces/Premchan369/alphaforge-k2think) — Live quant trading demo with K2 Think v2 reasoning
+- [AlphaForge Platform](https://huggingface.co/Premchan369/alphaforge-quant-system) — 25-module open-source quant system
```
|