---
license: apache-2.0
library_name: q-tensorformer
tags:
  - tensor-networks
  - quantum-machine-learning
  - model-compression
  - transformer
  - efficient-deep-learning
  - nisq
  - pennylane
  - k2-think
  - explainable-ai
pipeline_tag: text-generation
---

# Q-TensorFormer v3 — Model Card

## Model Details

Q-TensorFormer is a hybrid transformer that compresses feed-forward layers using Tensor-Train (TT) decomposition and enhances token representations via PennyLane quantum circuits, with adaptive TT-rank scheduling guided by attention entropy.

- **Architecture:** Quantum-Enhanced Tensor Network Transformer
- **Parameters:** configurable (50K–50M range)
- **Compression ratio:** 1.5–3× vs. an equivalent dense transformer
- **Quantum overhead:** <30% of tokens routed through quantum circuits (adjustable sparsity)
- **K2 Think v2 integration:** explainable AI for every compression and routing decision
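
As a sketch of how TT decomposition shrinks a feed-forward weight, the dense matrix can be reshaped into a higher-order tensor and factored with sequential truncated SVDs (TT-SVD). The helper names, the mode shape, and the 64×64 example below are illustrative assumptions, not the library's actual API:

```python
import numpy as np

def tt_decompose(W, modes, rank):
    """Tensor-Train decomposition via sequential truncated SVDs (TT-SVD).
    `modes` factorises W's entries; `rank` caps every internal TT-rank.
    Illustrative helper, not the q-tensorformer API."""
    T = W.reshape(modes)
    cores, r_prev = [], 1
    for k in range(len(modes) - 1):
        T = T.reshape(r_prev * modes[k], -1)
        U, S, Vt = np.linalg.svd(T, full_matrices=False)
        r = min(rank, len(S))
        cores.append(U[:, :r].reshape(r_prev, modes[k], r))
        T = S[:r, None] * Vt[:r]  # carry the truncated remainder forward
        r_prev = r
    cores.append(T.reshape(r_prev, modes[-1], 1))
    return cores

def tt_params(cores):
    """Total number of parameters stored across the TT cores."""
    return sum(c.size for c in cores)

# A 64x64 dense layer (4096 params) stored as four rank-4 TT cores
W = np.random.randn(64, 64)
cores = tt_decompose(W, (8, 8, 8, 8), rank=4)
print(tt_params(cores), "TT params vs", W.size, "dense params")
```

Lower TT-ranks trade reconstruction error for parameter count, which is exactly the trade-off the rank scheduler adjusts per layer.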

## Core Mechanism

`Attention entropy S(ρ) → norm → RankScheduler → TT-rank r(layer)`

The attention entropy (a classical proxy for quantum entanglement) measures input complexity per token. Higher entropy → more complex patterns → higher tensor rank. Lower entropy → more compressible → aggressive TT rank reduction.
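
The entropy-to-rank mapping can be sketched in a few lines; the normalisation, the rank bounds, and the function names are assumptions for illustration, not the actual RankScheduler interface:

```python
import numpy as np

def attention_entropy(attn_row):
    """Shannon entropy of one token's attention distribution."""
    p = attn_row / attn_row.sum()
    return -(p * np.log(p + 1e-12)).sum()

def schedule_rank(attn_row, r_min=2, r_max=16):
    """Map normalised entropy to a TT-rank: higher entropy (more complex
    attention pattern) -> higher rank. Illustrative sketch only."""
    s = attention_entropy(attn_row) / np.log(len(attn_row))  # normalise to [0, 1]
    return int(round(r_min + s * (r_max - r_min)))

peaked = np.array([0.97, 0.01, 0.01, 0.01])  # low entropy -> compressible
uniform = np.ones(4) / 4                     # high entropy -> needs capacity
print(schedule_rank(peaked), schedule_rank(uniform))
```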

**Budget-constrained mode:** set `max_params`, `max_latency_ms`, or `max_energy_per_query`, and the model auto-adjusts ranks to stay within budget.
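
A minimal sketch of budget-constrained mode, assuming uniform-rank TT layers and a greedy rank-reduction loop; only the `max_params` option comes from the text, the helpers and layer shapes are hypothetical:

```python
def tt_layer_params(modes, rank):
    """Parameter count of one TT layer with uniform internal rank
    (boundary ranks are 1). Illustrative cost model."""
    ranks = [1] + [rank] * (len(modes) - 1) + [1]
    return sum(ranks[i] * modes[i] * ranks[i + 1] for i in range(len(modes)))

def fit_to_budget(modes, ranks, max_params):
    """Greedily lower the rank of the most expensive layer until the
    total parameter count fits the budget. Sketch only, not the
    library's budget tracker."""
    ranks = list(ranks)
    while sum(tt_layer_params(modes, r) for r in ranks) > max_params:
        i = max(range(len(ranks)), key=lambda j: tt_layer_params(modes, ranks[j]))
        if ranks[i] == 1:
            raise ValueError("budget unreachable even at rank 1")
        ranks[i] -= 1
    return ranks

# Three layers starting at rank 8, squeezed under a 2000-parameter budget
fitted = fit_to_budget((8, 8, 8, 8), [8, 8, 8], max_params=2000)
print(fitted)
```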

## K2 Think v2 Integration (Explainable AI)

Q-TensorFormer integrates with K2 Think v2 (`MBZUAI-IFM/K2-Think-v2`) to provide natural-language explanations for every compression and routing decision:

| Component | What K2 Think Explains |
| --- | --- |
| RankScheduler | Why entropy X → rank Y ("Token 47 has high attention dispersion, needs more capacity") |
| QuantumRouter | Why a token was routed to quantum ("This embedding is near the decision boundary; the quantum feature map may help") |
| Budget Tracker | How budget constraints affected model size ("Reduced rank to 4 to stay under 2M params") |
| Compression Report | Full audit trail of per-layer, per-token compression choices |
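
The per-decision audit trail can be pictured as structured records like the one below; the field names, threshold, and phrasing are hypothetical and do not reflect K2 Think v2's actual output format:

```python
def explain_rank_decision(token_id, entropy, rank, threshold=0.5):
    """Build one audit-trail record in the spirit of the table above.
    All field names and wording are illustrative assumptions."""
    reason = ("high attention dispersion, needs more capacity"
              if entropy > threshold
              else "concentrated attention, safe to compress aggressively")
    return {
        "component": "RankScheduler",
        "token": token_id,
        "entropy": round(entropy, 3),
        "rank": rank,
        "explanation": f"Token {token_id} has {reason}",
    }

record = explain_rank_decision(token_id=47, entropy=0.82, rank=12)
print(record["explanation"])
```

Collecting one such record per token and layer yields the full compression report referenced in the table.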

**Live demo:** AlphaForge x K2 Think V2

## Intended Uses

| Use Case | Model Size | Expected Metric |
| --- | --- | --- |
| Edge NLP (mobile, on-device) | <5M params | PPL within 5% of dense baseline |
| Enterprise model compression | 10–50M params | 2× param reduction at equal accuracy |
| Multilingual low-resource | <10M params | Better representation per parameter |
| Research: quantum-classical hybrid | Small | Demonstrate quantum value in NLP |
| Financial NLP (with K2 Think) | Any | Explainable compression for regulated industries |

## Limitations

- **NISQ-era only:** quantum circuits are simulated (PennyLane `default.qubit`); no real quantum hardware is required.
- **Small to medium models:** designed for embedding dimensions ≤512, not for GPT-scale (100M+) models.
- **Training data:** optimized for WikiText-2 and similar text corpora.
- **Quantum advantage:** we claim efficiency (fewer parameters for the same performance), not "quantum advantage" in the broad sense.

## Citation

```bibtex
@software{q_tensorformer2026,
  author  = {Premchan369},
  title   = {Q-TensorFormer: Quantum-Enhanced Tensor Network LLM Compression Engine},
  url     = {https://huggingface.co/Premchan369/q-tensorformer},
  version = {3.0.0},
  year    = {2026},
}
```

## References

- **Tensor networks:** Cichocki et al., "Tensor Networks for Dimensionality Reduction and Large-scale Optimization" (arXiv:2007.02779)
- **Quantum transformers:** Quixer (arXiv:2406.04305), QKSAN (arXiv:2308.13422)
- **PennyLane:** Bergholm et al., "PennyLane: Automatic differentiation of hybrid quantum-classical computations" (arXiv:1811.04968)
- **K2 Think v2:** `MBZUAI-IFM/K2-Think-v2`, Build with K2 Think V2 Challenge

## Related Projects