zancato committed
Commit 62a72fc (verified) · 1 Parent(s): 1d32fa5

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -174,10 +174,10 @@ curl http://localhost:8000/v1/chat/completions \
  }'
  ```
 
- ### With HuggingFace Transformers
+ ### With Hugging Face Transformers
 
  > [!WARNING]
- > Due to the long generations produced by reasoning models, the lower latency provided by vLLM is preferred over Huggingface for evaluations and in production settings. We recommend Huggingface generation primarily for quick debugging or testing.
+ > Due to the long generations produced by reasoning models, the lower latency provided by vLLM is preferred over Hugging Face for evaluations and in production settings. We recommend Hugging Face generation primarily for quick debugging or testing.
 
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer