# ROCmPort AI Migration Report: cuda_first_repo

## AMD Readiness Score
- Before deterministic fixes: 42/100
- Migration package generated: 67/100
- This score means ROCm migration artifacts were generated and are ready for AMD Developer Cloud validation; it is not a production certification.
| Category | Before | Migration package |
|---|---|---|
| Code portability | 0 | 46 |
| Environment readiness | 0 | 0 |
| Serving readiness | 80 | 96 |
| Benchmark readiness | 30 | 92 |
| Deployment readiness | 100 | 100 |
## Findings
| Severity | Category | Location | Finding | Suggested fix |
|---|---|---|---|---|
| medium | Environment readiness | benchmarks/benchmark.py:13 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Benchmark readiness | benchmarks/benchmark.py:22 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| high | Benchmark readiness | benchmarks/benchmark.py:24 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| high | Code portability | benchmarks/benchmark.py:36 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| high | Code portability | benchmarks/benchmark.py:38 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | benchmarks/benchmark.py:41 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| medium | Environment readiness | docker-compose.yml:6 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:7 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:8 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:24 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:25 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Environment readiness | docker-compose.yml:29 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | docker-compose.yml:30 | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
| high | Environment readiness | Dockerfile:1 | Dockerfile uses an NVIDIA CUDA base image. | Use vllm/vllm-openai-rocm:latest for vLLM serving or rocm/pytorch:latest for PyTorch workloads. |
| medium | Environment readiness | Dockerfile:8 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Code portability | infer.py:6 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| high | Code portability | infer.py:11 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | infer.py:12 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| low | Code portability | infer.py:19 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| medium | Environment readiness | requirements.txt:4 | Dependency references a CUDA-specific package. | Replace CUDA-specific wheels with ROCm-compatible PyTorch or library builds. |
| medium | Environment readiness | requirements.txt:5 | Dependency references a CUDA-specific package. | Replace CUDA-specific wheels with ROCm-compatible PyTorch or library builds. |
| medium | Environment readiness | scripts/serve_vllm.sh:4 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Environment readiness | scripts/serve_vllm.sh:5 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | scripts/serve_vllm.sh:6 | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
| medium | Environment readiness | scripts/train.py:13 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | scripts/train.py:14 | CUDA toolkit path environment variable found. | Remove CUDA toolkit path assumptions or replace with ROCm installation paths when required. |
| high | Code portability | scripts/train.py:18 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| low | Code portability | scripts/train.py:19 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| high | Code portability | scripts/train.py:30 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | scripts/train.py:35 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| high | Code portability | scripts/train.py:36 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| high | Code portability | scripts/train.py:44 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | scripts/train.py:45 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| low | Code portability | scripts/train.py:59 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
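The code-portability fixes above all converge on one pattern: replace hardcoded `.cuda()` and `.to("cuda")` calls with a single runtime device handle. A minimal sketch of the `_rocmport_device` abstraction the suggested fixes refer to (the variable name comes from the findings; the `to_device` helper is illustrative):

```python
import torch

# On ROCm builds of PyTorch, AMD GPUs are exposed through the torch.cuda
# namespace, so this single check covers both NVIDIA and AMD hardware and
# falls back to CPU when no GPU is visible.
_rocmport_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def to_device(obj):
    """Move a tensor or module to the detected device instead of hardcoding .cuda()."""
    return obj.to(_rocmport_device)
```

Defining this once at module scope and threading it through model, input, and target transfers removes every vendor-specific device string from the training and inference paths.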
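The environment- and benchmark-readiness findings follow a second pattern: GPU selection via CUDA_VISIBLE_DEVICES and metadata collection via nvidia-smi. A hedged sketch of a vendor-neutral equivalent (the function names `select_amd_gpus` and `gpu_metadata` are illustrative, not part of the generated patch):

```python
import os
import shutil
import subprocess

def select_amd_gpus(ids="0"):
    """Target AMD GPUs with HIP_VISIBLE_DEVICES; also set CUDA_VISIBLE_DEVICES
    so the same script keeps working on NVIDIA hosts."""
    os.environ["HIP_VISIBLE_DEVICES"] = ids
    os.environ["CUDA_VISIBLE_DEVICES"] = ids

def gpu_metadata():
    """Collect benchmark metadata with rocm-smi when available, fall back to
    nvidia-smi, and degrade gracefully on CPU-only hosts."""
    for tool, args in (("rocm-smi", ["--showproductname"]),
                       ("nvidia-smi", ["-L"])):
        if shutil.which(tool):
            out = subprocess.run([tool, *args], capture_output=True, text=True)
            return {"tool": tool, "output": out.stdout.strip()}
    return {"tool": None, "output": ""}
```

Probing for the tool with `shutil.which` keeps the benchmark script runnable on both vendors' machines and in CPU-only CI, instead of failing on a missing nvidia-smi binary.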
## Generated Artifacts
- `rocm_patch.diff` contains deterministic MVP fixes.
- `Dockerfile.rocm` uses the ROCm-enabled vLLM container.
- `amd_developer_cloud_runbook.md` documents the validation path.
- `benchmark_result.json` records the AMD benchmark schema and status.
## Qwen Agent Notes
The Qwen endpoint was not configured, so this report uses deterministic scanner output only.
## Remaining Risks
- CUDA C++ kernels, custom Triton kernels, and CUDA-only binary dependencies require manual review.
- Uploaded repositories are not executed inside the Space; live validation belongs on AMD Developer Cloud.
- ROCm performance depends on model, batch shape, vLLM version, ROCm version, and GPU instance configuration.