#### Pre-requisites

```bash
# Install PyTorch (CUDA). Find the proper version at https://pytorch.org/get-started/previous-versions based on your CUDA version.
pip install torch==2.10.0 torchvision==0.25.0 torchaudio==2.10.0 --index-url https://download.pytorch.org/whl/cu128

# Install FlashAttention (required for NVIDIA GPUs).
# This command builds flash-attn from source, which can take 10 to 30 minutes.
pip install flash-attn==2.8.3 --no-build-isolation

# For Hopper GPUs (e.g. H100, H800), we recommend FlashAttention-3 instead. See the official guide at https://github.com/Dao-AILab/flash-attention.

# Install vLLM
# NOTE: you may need to run the command below to resolve triton and numpy conflicts before installing vllm.
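# (Optional) Sanity check — not part of the project, just a small helper script
# that reports whether each prerequisite package is importable in the current
# environment, without crashing if one is missing:
python - <<'EOF'
import importlib.util
for mod in ("torch", "torchvision", "flash_attn", "vllm"):
    found = importlib.util.find_spec(mod) is not None
    print(f"{mod}: {'OK' if found else 'MISSING'}")
EOF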