# ROCmPort AI Migration Report: cuda_first_repo

## AMD Readiness Score

  • Before deterministic fixes: 42/100
  • After migration package generation: 67/100
  • A score of 67 means ROCm migration artifacts were generated and are ready for AMD Developer Cloud validation; it is not a production certification.
| Category | Before | Migration package |
| --- | --- | --- |
| Code portability | 0 | 46 |
| Environment readiness | 0 | 0 |
| Serving readiness | 80 | 96 |
| Benchmark readiness | 30 | 92 |
| Deployment readiness | 100 | 100 |

## Findings

| Severity | Category | Location | Finding | Suggested fix |
| --- | --- | --- | --- | --- |
| medium | Environment readiness | benchmarks/benchmark.py:13 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Benchmark readiness | benchmarks/benchmark.py:22 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| high | Benchmark readiness | benchmarks/benchmark.py:24 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| high | Code portability | benchmarks/benchmark.py:36 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| high | Code portability | benchmarks/benchmark.py:38 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | benchmarks/benchmark.py:41 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| medium | Environment readiness | docker-compose.yml:6 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:7 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:8 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:24 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | docker-compose.yml:25 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Environment readiness | docker-compose.yml:29 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | docker-compose.yml:30 | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
| high | Environment readiness | Dockerfile:1 | Dockerfile uses an NVIDIA CUDA base image. | Use vllm/vllm-openai-rocm:latest for vLLM serving or rocm/pytorch:latest for PyTorch workloads. |
| medium | Environment readiness | Dockerfile:8 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Code portability | infer.py:6 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| high | Code portability | infer.py:11 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | infer.py:12 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| low | Code portability | infer.py:19 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| medium | Environment readiness | requirements.txt:4 | Dependency references a CUDA-specific package. | Replace CUDA-specific wheels with ROCm-compatible PyTorch or library builds. |
| medium | Environment readiness | requirements.txt:5 | Dependency references a CUDA-specific package. | Replace CUDA-specific wheels with ROCm-compatible PyTorch or library builds. |
| medium | Environment readiness | scripts/serve_vllm.sh:4 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Environment readiness | scripts/serve_vllm.sh:5 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | scripts/serve_vllm.sh:6 | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
| medium | Environment readiness | scripts/train.py:13 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| medium | Environment readiness | scripts/train.py:14 | CUDA toolkit path environment variable found. | Remove CUDA toolkit path assumptions or replace with ROCm installation paths when required. |
| high | Code portability | scripts/train.py:18 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| low | Code portability | scripts/train.py:19 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| high | Code portability | scripts/train.py:30 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | scripts/train.py:35 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| high | Code portability | scripts/train.py:36 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| high | Code portability | scripts/train.py:44 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | scripts/train.py:45 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| low | Code portability | scripts/train.py:59 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |

## Generated Artifacts

  • `rocm_patch.diff` contains deterministic MVP fixes.
  • `Dockerfile.rocm` uses the ROCm-enabled vLLM container.
  • `amd_developer_cloud_runbook.md` documents the validation path.
  • `benchmark_result.json` records the AMD benchmark schema and status.
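The serving-readiness findings above call for running vLLM inside the `vllm/vllm-openai-rocm` container with `/dev/kfd`, `/dev/dri`, host IPC, and video-group access. A sketch of that invocation; the GPU index, port mapping, and model id are illustrative placeholders, not part of the generated artifacts:

```shell
# Expose the AMD GPU device nodes, host IPC, and video-group access
# that the ROCm vLLM container needs. MODEL_ID is a placeholder.
docker run --rm \
  --device=/dev/kfd --device=/dev/dri \
  --ipc=host \
  --group-add video \
  -e HIP_VISIBLE_DEVICES=0 \
  -p 8000:8000 \
  vllm/vllm-openai-rocm:latest \
  --model MODEL_ID
```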

## Qwen Agent Notes

The Qwen endpoint was not configured, so this report uses deterministic scanner output only.

## Remaining Risks

  • CUDA C++ kernels, custom Triton kernels, and CUDA-only binary dependencies require manual review.
  • Uploaded repositories are not executed inside the Space; live validation belongs on AMD Developer Cloud.
  • ROCm performance depends on model, batch shape, vLLM version, ROCm version, and GPU instance configuration.