Update README.md
This is a preliminary version (and subject to change) of the NVFP4-quantized [arcee-ai/Trinity-Large-Thinking](https://huggingface.co/arcee-ai/Trinity-Large-Thinking/tree/main) model.
The model has both weights and activations quantized to NVFP4 format with [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor).
It is compatible with and tested against vLLM main. Run it with:

```shell
vllm serve RedHatAI/Trinity-Large-Thinking-NVFP4 --trust-remote-code
```
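Once serving, vLLM exposes an OpenAI-compatible HTTP API (by default on port 8000). A minimal sketch of a chat-completions request payload for that endpoint, assuming the server above is running locally; the prompt text and `max_tokens` value are illustrative:

```python
import json

# Payload for vLLM's OpenAI-compatible chat endpoint:
#   POST http://localhost:8000/v1/chat/completions
# The "model" field must match the name passed to `vllm serve`.
payload = {
    "model": "RedHatAI/Trinity-Large-Thinking-NVFP4",
    "messages": [
        {"role": "user", "content": "Summarize NVFP4 quantization in one sentence."}
    ],
    "max_tokens": 128,
}

# Serialize for sending with any HTTP client (curl, requests, openai SDK, ...).
print(json.dumps(payload, indent=2))
```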