Safetensors · English · llama

Dogacel committed · verified
Commit dad2340 · 1 parent: 4ddcb73

Update README.md

Files changed (1):
  1. README.md +16 -3
README.md CHANGED
@@ -20,13 +20,15 @@ It has several minor architectural differences from the original EAGLE: Drafter
 ### Model Sources [optional]
 
 - **Repository:** [Dogacel/SpecDrift](https://github.com/Dogacel/SpecDrift)
-- **Paper [optional]:** TODO
+- **Paper:** https://arxiv.org/abs/2605.09992
 
 ## Uses
 
 We recommend using SGLang to run the model,
 
 ```
+export SGLANG_ENABLE_SPEC_V2=1
+
 python -m sglang.launch_server \
 --model-path openai/gpt-oss-20b \
 --speculative-algorithm EAGLE3 \
@@ -34,13 +36,14 @@ python -m sglang.launch_server \
 --speculative-num-steps 3 \
 --speculative-eagle-topk 1 \
 --speculative-num-draft-tokens 4 \
+--speculative-draft-sliding-window 2048 \
 --port 30000 \
 --dp-size 1 --tp-size 1 \
 --max-running-requests 64 \
 --cuda-graph-max-bs 64 \
 --attention-backend fa3 \
 --trust-remote-code \
---mem-fraction-static 0.5 --dtype bfloat16
+--mem-fraction-static 0.9 --dtype bfloat16
 ```
 
 ## Training Details
@@ -103,7 +106,17 @@ Our evaluation on higher batch sizes has shown the model performance matches or
 
 **BibTeX:**
 
-TODO
+```bibtex
+@misc{eldenk2026attentiondrift,
+      title={Attention Drift: What Autoregressive Speculative Decoding Models Learn},
+      author={Doğaç Eldenk and Payal Mohapatra and Yigitcan Comlek and Kaan Oktay and Hongyang Zhang and Stephen Xia},
+      year={2026},
+      eprint={2605.09992},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2605.09992},
+}
+```
 
 ## Acknowledgements
 
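For scripting around this change, the updated launch command in the diff can be assembled programmatically before it is handed to a process launcher. A minimal sketch in Python: the flag values are copied from the diff hunks above, but `build_launch_cmd` and its parameters are illustrative names, and any flags that sit on README lines outside the visible hunks are not reproduced here.

```python
# Sketch only: assemble the SGLang launch command from the README diff as an
# argv list. Flag values are copied from the updated README; the function name
# is illustrative, and flags outside the visible diff hunks are not included.
import os
import shlex
import sys

def build_launch_cmd(model_path: str = "openai/gpt-oss-20b") -> list[str]:
    """Return the launch command as an argv list (nothing is executed here)."""
    return [
        sys.executable, "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--speculative-algorithm", "EAGLE3",
        "--speculative-num-steps", "3",
        "--speculative-eagle-topk", "1",
        "--speculative-num-draft-tokens", "4",
        "--speculative-draft-sliding-window", "2048",
        "--port", "30000",
        "--dp-size", "1", "--tp-size", "1",
        "--max-running-requests", "64",
        "--cuda-graph-max-bs", "64",
        "--attention-backend", "fa3",
        "--trust-remote-code",
        "--mem-fraction-static", "0.9",
        "--dtype", "bfloat16",
    ]

# The updated README exports SGLANG_ENABLE_SPEC_V2=1 before launching; mirror
# that in the environment that would be handed to the server process.
env = {**os.environ, "SGLANG_ENABLE_SPEC_V2": "1"}

print(shlex.join(build_launch_cmd()))
```

The argv list and `env` dict can then be passed to `subprocess.Popen(cmd, env=env)` on a machine where SGLang and a GPU are available.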