| library_name: transformers | |
| base_model: | |
| - deepseek-ai/DeepSeek-V4-Pro | |
| Tiny random version of DeepSeek-V4. | |
| **In development. Might not work!** | |
| ```bash | |
| vllm serve yujiepan/deepseek-v4-tiny-random \ | |
| --trust-remote-code \ | |
| --block-size 256 \ | |
| --kv-cache-dtype fp8 \ | |
| --data-parallel-size 1 \ | |
| --max-model-len 12000 \ | |
| --gpu-memory-utilization 0.5 \ | |
| --max-num-seqs 512 \ | |
| --max-num-batched-tokens 512 \ | |
| --no-enable-flashinfer-autotune \ | |
| --compilation-config '{"mode": 0, "cudagraph_mode": "FULL_DECODE_ONLY"}' \ | |
| --tokenizer-mode deepseek_v4 \ | |
| --tool-call-parser deepseek_v4 \ | |
| --enable-auto-tool-choice \ | |
| --reasoning-parser deepseek_v4 \ | |
| --speculative_config '{"method":"mtp","num_speculative_tokens":1}' | |
| ``` | |