Update README.md
README.md
CHANGED
@@ -151,7 +151,7 @@ def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
 continue # skip spec decode layers for main model
 ...
 ```
-Finally, modify `vllm/model_executor/layers/quantization/quark/quark_moe.py` by forcing `self.emulate` to "True":
+Finally, modify `vllm/model_executor/layers/quantization/quark/quark_moe.py` by forcing `self.emulate` to "True" ([alternate resolution](https://github.com/vllm-project/vllm/pull/39436)):
 ```
 class QuarkOCP_MX_MoEMethod(QuarkMoEMethod):
     def __init__(...):