# ROCmPort AI Migration Report: cuda_first_repo

## AMD Readiness Score

- Before deterministic fixes: 53/100
- After deterministic fixes: 100/100
| Category | Before | After |
| --- | ---: | ---: |
| Code portability | 0 | 100 |
| Environment readiness | 8 | 100 |
| Serving readiness | 90 | 100 |
| Benchmark readiness | 65 | 100 |
| Deployment readiness | 100 | 100 |
## Findings
| Severity | Category | Location | Finding | Suggested fix |
| --- | --- | --- | --- | --- |
| high | Benchmark readiness | `benchmarks/benchmark.py:6` | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| high | Environment readiness | `Dockerfile:1` | Dockerfile uses an NVIDIA CUDA base image. | Use vllm/vllm-openai-rocm:latest for vLLM serving or rocm/pytorch:latest for PyTorch workloads. |
| medium | Environment readiness | `Dockerfile:8` | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Code portability | `infer.py:6` | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| high | Code portability | `infer.py:11` | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | `infer.py:12` | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| low | Code portability | `infer.py:19` | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| high | Environment readiness | `scripts/serve_vllm.sh:5` | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | `scripts/serve_vllm.sh:6` | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
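The code-portability findings above share one pattern: select the device once at runtime instead of hardcoding CUDA. A minimal sketch of that fix, assuming a PyTorch script like `infer.py` (the `_rocmport_device` name follows the suggested fixes; the `to_device` helper is hypothetical):

```python
import torch

# ROCm builds of PyTorch report AMD GPUs through the torch.cuda namespace,
# so this single check covers NVIDIA GPUs, AMD GPUs, and CPU fallback.
_rocmport_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def to_device(obj):
    """Move a tensor or nn.Module to the runtime-selected device.

    Replaces hardcoded .cuda() and .to("cuda") calls.
    """
    return obj.to(_rocmport_device)

x = to_device(torch.ones(2, 2))
print(x.device.type)
```

On a ROCm machine the same script prints `cuda`, which is why the low-severity `infer.py:19` finding recommends keeping the `torch.cuda.is_available()` call and documenting its meaning rather than replacing it.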
## Generated Artifacts

- `rocm_patch.diff` contains deterministic MVP fixes.
- `Dockerfile.rocm` uses the ROCm-enabled vLLM container.
- `amd_developer_cloud_runbook.md` documents the validation path.
- `benchmark_result.json` records the AMD benchmark schema and status.
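A launch command consistent with the serving guidance (ROCm vLLM container with `/dev/kfd`, `/dev/dri`, host IPC, and video group access) might look like the sketch below; the model identifier and port are illustrative placeholders, not values taken from the repository:

```shell
# Sketch only: flags mirror the suggested fix for scripts/serve_vllm.sh.
docker run -it --rm \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --ipc=host \
  -p 8000:8000 \
  vllm/vllm-openai-rocm:latest \
  --model <model-id> --port 8000
```

The `/dev/kfd` and `/dev/dri` device mounts expose the AMD GPU to the container, `--group-add video` grants the container user access to those devices, and `--ipc=host` provides the shared memory vLLM needs for multi-process serving.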
## Qwen Agent Notes

The Qwen endpoint was not configured, so this report contains deterministic scanner output only.
## Remaining Risks

- CUDA C++ kernels, custom Triton kernels, and CUDA-only binary dependencies require manual review.
- Uploaded repositories are not executed inside the Space; live validation belongs on AMD Developer Cloud.
- ROCm performance depends on model, batch shape, vLLM version, ROCm version, and GPU instance configuration.