sampling
README.md
CHANGED
@@ -81,7 +81,7 @@ vllm serve Zyphra/ZAYA1-8B --port 8010 \
 ```
 For parallel deployment we recommend using DP with EP as TP for CCA is not supported in the branch above. If running on 8 GPUs, set extra flags `-dp 8 -ep` to run with DP=EP=8.
 
-For our evaluations and for general use, we recommend temperature 1.0, top-p 0.95, top-k -1. For agent and code use cases, we recommend top-p 0.
+For our evaluations and for general use, we recommend temperature 1.0, top-p 0.95, top-k -1. For agent and code use cases, we recommend temperature 0.6, top-p 0.95, top-k -1.
 
 Once the server is up, you can query a model with `curl` like in the following example:
 ```bash