helcig committed on
Commit 8bf8a2c · verified · 1 Parent(s): 4a5537b

Fix broken folder links in README
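
The diff below replaces each relative folder link (`./<variant>`) in the README tables with an absolute `https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/<variant>` URL. A minimal sketch of how such a rewrite could be scripted, assuming a regex substitution over the README text; the function name `absolutize_links` and the pattern are illustrative and not taken from the commit:

```python
import re

# Base URL as it appears in the updated links of this commit.
BASE = "https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main"

# Matches a markdown link target of the form "](./folder-name)" (hypothetical pattern).
RELATIVE_LINK = re.compile(r"\]\(\./([A-Za-z0-9_-]+)/?\)")


def absolutize_links(markdown: str, base: str = BASE) -> str:
    """Rewrite relative './folder' link targets to absolute URLs under `base`."""
    return RELATIVE_LINK.sub(lambda m: f"]({base}/{m.group(1)})", markdown)


row = "| 25% | coding | uniform | [`coding-25-uniform/`](./coding-25-uniform) |"
print(absolutize_links(row))
```

Links that are already absolute do not match the pattern, so running the sketch twice over the same text leaves it unchanged.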

Files changed (1)
  1. README.md +24 -24
README.md CHANGED
@@ -19,14 +19,14 @@ Eight variants, one per (sparsity × calibration × allocation):
 
  | Sparsity | Calibration | Allocation | Folder |
  |---|---|---|---|
- | 25% | coding | uniform | [`coding-25-uniform/`](./coding-25-uniform) |
- | 25% | coding | nonuniform | [`coding-25-nonuniform/`](./coding-25-nonuniform) |
- | 50% | coding | uniform | [`coding-50-uniform/`](./coding-50-uniform) |
- | 50% | coding | nonuniform | [`coding-50-nonuniform/`](./coding-50-nonuniform) |
- | 25% | general | uniform | [`general-25-uniform/`](./general-25-uniform) |
- | 25% | general | nonuniform | [`general-25-nonuniform/`](./general-25-nonuniform) |
- | 50% | general | uniform | [`general-50-uniform/`](./general-50-uniform) |
- | 50% | general | nonuniform | [`general-50-nonuniform/`](./general-50-nonuniform) |
 
 
  ## What is expert pruning?
@@ -159,28 +159,28 @@ All evaluations run with vLLM (bf16, greedy decoding). Coding benchmarks: HumanE
  | Variant | Size | HumanEval | rec. | MBPP | rec. |
  |---|---|---|---|---|---|
  | **Full model** | **159 GB** | **0.744** | n/a | **0.764** | n/a |
- | [coding-25-uniform](./coding-25-uniform) | 121 GB | 0.683 | 92% | **0.688** | **90%** |
- | [coding-25-nonuniform](./coding-25-nonuniform) | 121 GB | **0.744** | **100%** | 0.678 | 89% |
- | [coding-50-uniform](./coding-50-uniform) | 82 GB | 0.409 | 55% | 0.534 | 70% |
- | [coding-50-nonuniform](./coding-50-nonuniform) | 82 GB | **0.720** | **97%** | **0.690** | **90%** |
- | [general-25-uniform](./general-25-uniform) | 121 GB | 0.043 | 6% | 0.046 | 6% |
- | [general-25-nonuniform](./general-25-nonuniform) | 121 GB | 0.061 | 8% | 0.058 | 8% |
- | [general-50-uniform](./general-50-uniform) | 82 GB | 0.000 | 0% | 0.018 | 2% |
- | [general-50-nonuniform](./general-50-nonuniform) | 82 GB | 0.012 | 2% | 0.010 | 1% |
 
  ### General benchmarks (MC-8)
 
  | Variant | MC-8 avg | rec. | ARC-C | ARC-E | BoolQ | HSwag | MMLU | OBQA | RTE | WinoG |
  |---|---|---|---|---|---|---|---|---|---|---|
  | **Full model** | **0.714** | n/a | 0.606 | 0.821 | 0.885 | 0.775 | 0.767 | 0.430 | 0.765 | 0.666 |
- | [coding-25-uniform](./coding-25-uniform) | 0.656 | 92% | 0.501 | 0.722 | 0.864 | 0.690 | 0.710 | 0.380 | 0.729 | 0.655 |
- | [coding-25-nonuniform](./coding-25-nonuniform) | 0.638 | 89% | 0.462 | 0.662 | 0.851 | 0.665 | 0.680 | 0.362 | **0.776** | 0.642 |
- | [coding-50-uniform](./coding-50-uniform) | 0.577 | 81% | 0.403 | 0.641 | 0.789 | 0.578 | 0.564 | 0.350 | 0.671 | 0.616 |
- | [coding-50-nonuniform](./coding-50-nonuniform) | 0.546 | 76% | 0.356 | 0.555 | 0.776 | 0.548 | 0.543 | 0.340 | 0.646 | 0.603 |
- | [general-25-uniform](./general-25-uniform) | 0.707 | 99% | 0.600 | 0.807 | 0.876 | **0.785** | 0.704 | **0.452** | 0.751 | 0.677 |
- | [general-25-nonuniform](./general-25-nonuniform) | **0.714** | **100%** | **0.618** | **0.822** | **0.882** | 0.776 | **0.712** | 0.442 | 0.762 | **0.699** |
- | [general-50-uniform](./general-50-uniform) | **0.654** | **92%** | **0.541** | **0.771** | 0.839 | **0.709** | **0.610** | **0.428** | **0.675** | **0.658** |
- | [general-50-nonuniform](./general-50-nonuniform) | 0.644 | 90% | 0.526 | 0.762 | **0.842** | 0.708 | 0.595 | 0.414 | **0.675** | 0.635 |
 
  ### Key takeaways
 
 
 
  | Sparsity | Calibration | Allocation | Folder |
  |---|---|---|---|
+ | 25% | coding | uniform | [`coding-25-uniform/`](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-25-uniform) |
+ | 25% | coding | nonuniform | [`coding-25-nonuniform/`](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-25-nonuniform) |
+ | 50% | coding | uniform | [`coding-50-uniform/`](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-50-uniform) |
+ | 50% | coding | nonuniform | [`coding-50-nonuniform/`](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-50-nonuniform) |
+ | 25% | general | uniform | [`general-25-uniform/`](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-25-uniform) |
+ | 25% | general | nonuniform | [`general-25-nonuniform/`](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-25-nonuniform) |
+ | 50% | general | uniform | [`general-50-uniform/`](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-50-uniform) |
+ | 50% | general | nonuniform | [`general-50-nonuniform/`](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-50-nonuniform) |
 
 
  ## What is expert pruning?
 
  | Variant | Size | HumanEval | rec. | MBPP | rec. |
  |---|---|---|---|---|---|
  | **Full model** | **159 GB** | **0.744** | n/a | **0.764** | n/a |
+ | [coding-25-uniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-25-uniform) | 121 GB | 0.683 | 92% | **0.688** | **90%** |
+ | [coding-25-nonuniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-25-nonuniform) | 121 GB | **0.744** | **100%** | 0.678 | 89% |
+ | [coding-50-uniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-50-uniform) | 82 GB | 0.409 | 55% | 0.534 | 70% |
+ | [coding-50-nonuniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-50-nonuniform) | 82 GB | **0.720** | **97%** | **0.690** | **90%** |
+ | [general-25-uniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-25-uniform) | 121 GB | 0.043 | 6% | 0.046 | 6% |
+ | [general-25-nonuniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-25-nonuniform) | 121 GB | 0.061 | 8% | 0.058 | 8% |
+ | [general-50-uniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-50-uniform) | 82 GB | 0.000 | 0% | 0.018 | 2% |
+ | [general-50-nonuniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-50-nonuniform) | 82 GB | 0.012 | 2% | 0.010 | 1% |
 
  ### General benchmarks (MC-8)
 
  | Variant | MC-8 avg | rec. | ARC-C | ARC-E | BoolQ | HSwag | MMLU | OBQA | RTE | WinoG |
  |---|---|---|---|---|---|---|---|---|---|---|
  | **Full model** | **0.714** | n/a | 0.606 | 0.821 | 0.885 | 0.775 | 0.767 | 0.430 | 0.765 | 0.666 |
+ | [coding-25-uniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-25-uniform) | 0.656 | 92% | 0.501 | 0.722 | 0.864 | 0.690 | 0.710 | 0.380 | 0.729 | 0.655 |
+ | [coding-25-nonuniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-25-nonuniform) | 0.638 | 89% | 0.462 | 0.662 | 0.851 | 0.665 | 0.680 | 0.362 | **0.776** | 0.642 |
+ | [coding-50-uniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-50-uniform) | 0.577 | 81% | 0.403 | 0.641 | 0.789 | 0.578 | 0.564 | 0.350 | 0.671 | 0.616 |
+ | [coding-50-nonuniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/coding-50-nonuniform) | 0.546 | 76% | 0.356 | 0.555 | 0.776 | 0.548 | 0.543 | 0.340 | 0.646 | 0.603 |
+ | [general-25-uniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-25-uniform) | 0.707 | 99% | 0.600 | 0.807 | 0.876 | **0.785** | 0.704 | **0.452** | 0.751 | 0.677 |
+ | [general-25-nonuniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-25-nonuniform) | **0.714** | **100%** | **0.618** | **0.822** | **0.882** | 0.776 | **0.712** | 0.442 | 0.762 | **0.699** |
+ | [general-50-uniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-50-uniform) | **0.654** | **92%** | **0.541** | **0.771** | 0.839 | **0.709** | **0.610** | **0.428** | **0.675** | **0.658** |
+ | [general-50-nonuniform](https://huggingface.co/ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned/tree/main/general-50-nonuniform) | 0.644 | 90% | 0.526 | 0.762 | **0.842** | 0.708 | 0.595 | 0.414 | **0.675** | 0.635 |
 
  ### Key takeaways