deploy via scripts/deploy_to_space.py
README.md CHANGED

```diff
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
 ---
-title:
+title: QuantumScribe
 emoji: 🩺
 colorFrom: indigo
 colorTo: pink
@@ -19,7 +19,7 @@ license: mit
 short_description: OpenEnv RL env that teaches an LLM to decode quantum errors.
 ---
 
-#
+# QuantumScribe: An LLM Decoder for Quantum Error Correction
 
 An LLM (Qwen2.5-3B-Instruct) learning to outperform a 50-year-old graph-matching algorithm (PyMatching) at decoding quantum surface-code syndromes — using verifiable physics rewards, not human preferences. DeepMind's AlphaQubit (*Nature* 2024, Bausch et al.) showed a transformer can beat strong classical decoders, but it cost Google millions of dollars and a custom architecture. We ship a 3B-parameter open model on a free Colab T4, trained with SFT + GRPO against a real Stim simulator behind an OpenEnv HTTP contract.
 
@@ -31,10 +31,10 @@ An LLM (Qwen2.5-3B-Instruct) learning to outperform a 50-year-old graph-matching
 - **Trained LoRA on the Hub:** [ronitraj/quantumscribe](https://huggingface.co/ronitraj/quantumscribe)
 - **Colab notebook (actual training run):** [`notebooks/meta_final.ipynb`](notebooks/meta_final.ipynb)
 - **2-min video:** <!-- TODO: replace with submission video URL -->TBD-replace
-- **Blog:**
+- **Blog:** [`BLOG.md`](BLOG.md)
 - **W&B project:** [ronitraj/QuantumScribe-GRPO](https://wandb.ai/ronitraj/QuantumScribe-GRPO) · SFT [`yli513jl`](https://wandb.ai/ronitraj/QuantumScribe-GRPO/runs/yli513jl) · GRPO [`4p7eurnc`](https://wandb.ai/ronitraj/QuantumScribe-GRPO/runs/4p7eurnc)
 - **OpenEnv manifest:** [`openenv.yaml`](openenv.yaml)
-
+
 
 ---
 
```