ROCmPort AI Migration Report: cuda_first_repo

AMD Readiness Score

  • Before deterministic fixes: 51/100
  • After deterministic fixes: 100/100
| Category | Before | After |
| --- | --- | --- |
| Code portability | 0 | 100 |
| Environment readiness | 0 | 100 |
| Serving readiness | 90 | 100 |
| Benchmark readiness | 65 | 100 |
| Deployment readiness | 100 | 100 |

Findings

| Severity | Category | Location | Finding | Suggested fix |
| --- | --- | --- | --- | --- |
| high | Benchmark readiness | benchmarks/benchmark.py:6 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection (see the rocm-smi sketch after this table). |
| high | Environment readiness | Dockerfile:1 | Dockerfile uses an NVIDIA CUDA base image. | Use vllm/vllm-openai-rocm:latest for vLLM serving or rocm/pytorch:latest for PyTorch workloads. |
| medium | Environment readiness | Dockerfile:8 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Code portability | infer.py:6 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda (see the device sketch after this table). |
| high | Code portability | infer.py:11 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | infer.py:12 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| low | Code portability | infer.py:19 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| medium | Environment readiness | scripts/serve_vllm.sh:4 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Environment readiness | scripts/serve_vllm.sh:5 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | scripts/serve_vllm.sh:6 | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
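
The code portability findings converge on a single runtime device abstraction. The following is a minimal sketch of what the `_rocmport_device` pattern named above could look like; the module layout and helper name `to_device` are illustrative rather than the exact contents of rocm_patch.diff. It relies on the fact that ROCm builds of PyTorch expose AMD GPUs through the torch.cuda namespace, so the same check works on NVIDIA, AMD, and CPU-only hosts.

```python
import torch

# ROCm builds of PyTorch report AMD GPUs through torch.cuda, so one device
# object covers NVIDIA, AMD, and CPU-only machines without any CUDA literal.
_rocmport_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def to_device(obj):
    """Move a tensor or module to the detected device instead of calling .cuda()."""
    return obj.to(_rocmport_device)


if __name__ == "__main__":
    # Hardcoded .cuda() and .to("cuda") calls become .to(_rocmport_device).
    x = to_device(torch.randn(2, 3))
    print(f"running on {_rocmport_device}; torch.cuda.is_available()={torch.cuda.is_available()}")
```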
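For the benchmark and environment findings, GPU metadata that the scripts currently pull from nvidia-smi can come from rocm-smi, and GPU selection moves from CUDA_VISIBLE_DEVICES to HIP_VISIBLE_DEVICES (or ROCR_VISIBLE_DEVICES). The sketch below is illustrative only and assumes rocm-smi is on the PATH and that device index 0 is the intended target.

```python
import os
import subprocess

# HIP_VISIBLE_DEVICES replaces CUDA_VISIBLE_DEVICES for AMD GPU selection;
# ROCR_VISIBLE_DEVICES is the runtime-level alternative.
os.environ["HIP_VISIBLE_DEVICES"] = "0"


def collect_gpu_metadata() -> str:
    """Return rocm-smi product information for benchmark metadata, or a note if unavailable."""
    try:
        result = subprocess.run(
            ["rocm-smi", "--showproductname"],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return "rocm-smi not available on this host"


if __name__ == "__main__":
    print(collect_gpu_metadata())
```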

Generated Artifacts

  • rocm_patch.diff contains deterministic MVP fixes.
  • Dockerfile.rocm uses the ROCm-enabled vLLM container.
  • amd_developer_cloud_runbook.md documents the validation path.
  • benchmark_result.json records the AMD benchmark schema and status.

Qwen Agent Notes

The Qwen endpoint was not configured, so this report uses deterministic scanner output only.

Remaining Risks

  • CUDA C++ kernels, custom Triton kernels, and CUDA-only binary dependencies require manual review.
  • Uploaded repositories are not executed inside the Space; live validation belongs on AMD Developer Cloud.
  • ROCm performance depends on model, batch shape, vLLM version, ROCm version, and GPU instance configuration.