joerowell committed · Commit 30ca10b · verified · 1 Parent(s): de04ba5

Update vLLM serve command and link to vLLM recipes page


Switches the recommended `vllm serve` invocation to the new flag set (`VLLM_USE_DEEP_GEMM=0`, `--enable-auto-tool-choice`, `--served-model-name laguna`) and adds a pointer to https://recipes.vllm.ai/poolside/Laguna-XS.2.

Files changed (1): README.md (+7 −3)
README.md CHANGED

````diff
@@ -135,12 +135,16 @@ Serve Laguna XS.2 locally with vLLM and query it from any OpenAI-compatible client
  ```shell
  pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
 
- vllm serve poolside/Laguna-XS.2 \
- --max-model-len 131072 \
+ VLLM_USE_DEEP_GEMM=0 vllm serve \
+ --model poolside/Laguna-XS.2 \
+ --tool-call-parser poolside_v1 \
  --reasoning-parser poolside_v1 \
- --tool-call-parser poolside_v1
+ --enable-auto-tool-choice \
+ --served-model-name laguna
  ```
 
+ See the [vLLM recipes page](https://recipes.vllm.ai/poolside/Laguna-XS.2) for additional deployment guidance.
+
  #### Transformers
 
  Laguna XS.2 is supported in Transformers `v5.7.0` and later ([huggingface/transformers#45673](https://github.com/huggingface/transformers/pull/45673)).
````
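The hunk's context line says the served model can be queried from any OpenAI-compatible client. As a minimal sketch of what such a request looks like after this change: the model name `laguna` comes from the new `--served-model-name laguna` flag, and `tool_choice: "auto"` pairs with `--enable-auto-tool-choice`. The base URL and port are assumptions (vLLM's conventional local default), not something stated in the diff; the snippet only builds and prints the request body, it does not contact a server.

```python
import json

# Assumed endpoint: vLLM's conventional local default, not taken from the diff.
BASE_URL = "http://localhost:8000/v1/chat/completions"

# Chat-completions request body for the renamed model.
payload = {
    "model": "laguna",     # matches --served-model-name laguna in the new command
    "messages": [
        {"role": "user", "content": "Write a hello-world in Python."}
    ],
    "tool_choice": "auto", # relies on the server's --enable-auto-tool-choice flag
}

# Serialize to the JSON body a client would POST to BASE_URL.
body = json.dumps(payload)
print(body)
```

Under the old command the request would have had to name the full model path `poolside/Laguna-XS.2`; the `--served-model-name` flag is what lets clients use the shorter alias.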