# Speculative Tool Actions

Investigating whether speculative decoding can be adapted from token prediction to agent action prediction.

**Current state:** v2 evaluation complete (see [ABLATION_REPORT_v2.md](./ABLATION_REPORT_v2.md)). v3 datasets + 1.7B proposer trained.

**Need:** train 4B verifier + 8B proposer, then run eval.

## Quick Start: Complete the Project

### One-command training (A100-large, ~2h):

```bash
python train_all_v3.py
```

Or via HF Jobs:

```python
hf_jobs(
    operation="run",
    script="https://hf.co/narcolepticchicken/speculative-tool-actions/resolve/main/train_all_v3.py",
    dependencies=["transformers>=4.51", "trl", "torch", "datasets", "accelerate", "peft", "huggingface_hub"],
    hardware_flavor="a100-large",
    timeout="12h",
)
```

### Then evaluate:

```bash
python eval_runner_v3.py
```

## Architecture

A cheap model (Qwen3-1.7B LoRA) proposes the next agent action. A verifier (Qwen3-4B LoRA) accepts or rejects. On rejection, fall back to the expensive 8B model.

**Action space:** `tool_call`, `retrieval`, `file_read`, `file_write`, `repair`, `verifier`, `ask_clarification`, `final_answer`, `BLOCKED`

## Files

| File | Purpose |
|------|---------|
| `train_all_v3.py` | Consolidated: trains 1.7B+4B+8B sequentially |
| `train_sft_v3.py` | Individual proposer training |
| `train_verifier_v3.py` | Individual verifier training |
| `eval_runner_v3.py` | All-5-configs evaluation |
| `PROJECT_REPORT.md` | Full project documentation + v2 results |
| `ABLATION_REPORT_v2.md` | v2 analysis (51% cheap vs 40% frozen 8B) |
| `eval_results_v2.json` | v2 raw results |

## v2 Results

| Config | Acc | Cost |
|--------|-----|------|
| A: 8B frozen | 40% | 1.00 |
| B: 1.7B cheap | **51%** | **0.15** |
| D: cheap + 4B RM | 51% | 0.25 |
| E: multi-proposal | 42% | 0.75 |

See [ABLATION_REPORT_v2.md](./ABLATION_REPORT_v2.md) for analysis.
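The propose / verify / fall back loop described under Architecture can be sketched roughly as follows. This is a minimal illustration, not the code in `eval_runner_v3.py`: the function `speculative_step`, the `StepResult` type, the `threshold` parameter, and the three callables are all hypothetical stand-ins for the 1.7B proposer, 4B verifier, and 8B fallback. The cost constants reuse the relative costs from the v2 table, and charging a rejected step for proposer + verifier + fallback is an assumption about how costs add up.

```python
from dataclasses import dataclass
from typing import Callable

# Action space as listed in this README.
ACTIONS = (
    "tool_call", "retrieval", "file_read", "file_write", "repair",
    "verifier", "ask_clarification", "final_answer", "BLOCKED",
)

# Relative costs borrowed from the v2 table (assumed to be additive).
COST_CHEAP_PLUS_VERIFIER = 0.25  # config D: cheap proposer + 4B verifier
COST_FALLBACK = 1.00             # config A: 8B model


@dataclass
class StepResult:
    action: str
    accepted: bool  # True if the cheap proposal was kept
    cost: float     # relative cost charged for this step


def speculative_step(
    context: str,
    propose: Callable[[str], str],        # hypothetical: cheap proposer
    verify: Callable[[str, str], float],  # hypothetical: verifier score in [0, 1]
    fallback: Callable[[str], str],       # hypothetical: expensive model
    threshold: float = 0.5,               # hypothetical acceptance threshold
) -> StepResult:
    """Propose cheaply, verify, and fall back to the expensive model on rejection."""
    candidate = propose(context)
    score = verify(context, candidate)
    # Accept only well-formed actions that the verifier scores highly enough.
    if candidate in ACTIONS and score >= threshold:
        return StepResult(candidate, True, COST_CHEAP_PLUS_VERIFIER)
    # Rejected: pay for the wasted proposal and the fallback call.
    return StepResult(
        fallback(context), False, COST_CHEAP_PLUS_VERIFIER + COST_FALLBACK
    )


# Usage with stub models standing in for the real checkpoints.
accepted = speculative_step(
    "open config.yaml",
    propose=lambda ctx: "file_read",
    verify=lambda ctx, action: 0.9,
    fallback=lambda ctx: "tool_call",
)
rejected = speculative_step(
    "open config.yaml",
    propose=lambda ctx: "file_read",
    verify=lambda ctx, action: 0.1,
    fallback=lambda ctx: "tool_call",
)
```

Under this accounting, the cascade only beats always calling the 8B model when the verifier accepts often enough: with acceptance rate `p`, the expected relative cost per step is `0.25 + (1 - p) * 1.00`.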
## v3 Status

| Component | Status |
|-----------|--------|
| Datasets (SFT, verifier, eval) | ✓ Built |
| 1.7B proposer | ✓ Trained |
| 4B verifier | ✗ Needs training |
| 8B proposer | ✗ Needs training |
| Eval runner | ✓ Ready |