Update READMD.md
README.md
CHANGED
@@ -8,6 +8,8 @@ P-EAGLE is a parallel-drafting speculative decoding model that generates K draft
 ### Model Details
 The model architecture is illustrated in the following figure. Specifically, we trained a 4-layer P-EAGLE for GPT-OSS 120B as the target model, with the number of parallel-token predictions set to 8.
 
+P-EAGLE follows vanilla EAGLE 3 in using three layers of hidden states from the target model.
+
 <img src="https://cdn-uploads.huggingface.co/production/uploads/64ab5fe189aa67e4a251b6b4/UBBMgZvXkOduu_LpUunQy.png" width="50%">
 
 ### Model Description
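
The added README sentence says the drafter consumes three layers of hidden states from the target model. As a rough illustration only, the sketch below shows one common way such inputs are fused in an EAGLE-3-style drafter: the three per-token vectors are concatenated and projected back to the model width. The function name `fuse_hidden_states`, the toy width `d = 4`, and the random projection weights are assumptions made for this example and are not taken from the P-EAGLE code.

```python
import random


def fuse_hidden_states(low, mid, high, proj):
    """Concatenate three hidden-state vectors and project them back to width d.

    low, mid, high: lists of length d (hypothetical taps from three target layers).
    proj: d x 3d matrix given as a list of rows (hypothetical learned projection).
    """
    concat = low + mid + high  # length 3 * d
    if any(len(row) != len(concat) for row in proj):
        raise ValueError("each projection row must have length 3 * d")
    # Plain matrix-vector product: one output value per projection row.
    return [sum(w * x for w, x in zip(row, concat)) for row in proj]


d = 4  # toy hidden width, assumed for illustration
rng = random.Random(0)
low, mid, high = ([rng.uniform(-1, 1) for _ in range(d)] for _ in range(3))
proj = [[rng.uniform(-0.1, 0.1) for _ in range(3 * d)] for _ in range(d)]

fused = fuse_hidden_states(low, mid, high, proj)
print(len(fused))  # 4: the fused vector has the model width again
```

In a real drafter the projection would be a trained linear layer applied to every token position; the pure-Python loop here only makes the shapes explicit.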