# ROCmPort AI Migration Report: cuda_first_repo

## AMD Readiness Score

  • Before deterministic fixes: 42/100
  • After migration package generation: 67/100
  • A score of 67 means ROCm migration artifacts were generated and are ready for AMD Developer Cloud validation; it is not a production certification.
| Category | Before | Migration package |
| --- | --- | --- |
| Code portability | 0 | 46 |
| Environment readiness | 0 | 0 |
| Serving readiness | 80 | 96 |
| Benchmark readiness | 30 | 92 |
| Deployment readiness | 100 | 100 |

## Findings

| Severity | Category | Location | Finding | Suggested fix |
| --- | --- | --- | --- | --- |
| medium | Environment readiness | benchmarks/benchmark.py:13 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Benchmark readiness | benchmarks/benchmark.py:22 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| high | Benchmark readiness | benchmarks/benchmark.py:24 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| high | Code portability | benchmarks/benchmark.py:36 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| high | Code portability | benchmarks/benchmark.py:38 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | benchmarks/benchmark.py:41 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| medium | Environment readiness | docker-compose.yml:6 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:7 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:8 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:24 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:25 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Environment readiness | docker-compose.yml:29 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | docker-compose.yml:30 | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
| high | Environment readiness | Dockerfile:1 | Dockerfile uses an NVIDIA CUDA base image. | Use vllm/vllm-openai-rocm:latest for vLLM serving or rocm/pytorch:latest for PyTorch workloads. |
| medium | Environment readiness | Dockerfile:8 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Code portability | infer.py:6 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| high | Code portability | infer.py:11 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | infer.py:12 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| low | Code portability | infer.py:19 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| medium | Environment readiness | requirements.txt:4 | Dependency references a CUDA-specific package. | Replace CUDA-specific wheels with ROCm-compatible PyTorch or library builds. |
| medium | Environment readiness | requirements.txt:5 | Dependency references a CUDA-specific package. | Replace CUDA-specific wheels with ROCm-compatible PyTorch or library builds. |
| medium | Environment readiness | scripts/serve_vllm.sh:4 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Environment readiness | scripts/serve_vllm.sh:5 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | scripts/serve_vllm.sh:6 | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
| medium | Environment readiness | scripts/train.py:13 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | scripts/train.py:14 | CUDA toolkit path environment variable found. | Remove CUDA toolkit path assumptions or replace with ROCm installation paths when required. |
| high | Code portability | scripts/train.py:18 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| low | Code portability | scripts/train.py:19 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| high | Code portability | scripts/train.py:30 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | scripts/train.py:35 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| high | Code portability | scripts/train.py:36 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| high | Code portability | scripts/train.py:44 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | scripts/train.py:45 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| low | Code portability | scripts/train.py:59 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |

## Generated Artifacts

  • `rocm_patch.diff` contains deterministic MVP fixes.
  • `Dockerfile.rocm` uses the ROCm-enabled vLLM container.
  • `amd_developer_cloud_runbook.md` documents the validation path.
  • `benchmark_result.json` records the AMD benchmark schema and status.
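The serving-readiness findings above call for running vLLM inside the `vllm/vllm-openai-rocm` container with `/dev/kfd`, `/dev/dri`, host IPC, and video-group access. A sketch of that invocation; the GPU index, port mapping, and model id are illustrative placeholders, not part of the generated artifacts:

```shell
# Expose the AMD GPU device nodes, host IPC, and video-group access
# that the ROCm vLLM container needs. MODEL_ID is a placeholder.
docker run --rm \
  --device=/dev/kfd --device=/dev/dri \
  --ipc=host \
  --group-add video \
  -e HIP_VISIBLE_DEVICES=0 \
  -p 8000:8000 \
  vllm/vllm-openai-rocm:latest \
  --model MODEL_ID
```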

## Qwen Agent Notes

The Qwen endpoint was not configured, so this report uses deterministic scanner output only.

## Remaining Risks

  • CUDA C++ kernels, custom Triton kernels, and CUDA-only binary dependencies require manual review.
  • Uploaded repositories are not executed inside the Space; live validation belongs on AMD Developer Cloud.
  • ROCm performance depends on model, batch shape, vLLM version, ROCm version, and GPU instance configuration.