ROCmPort AI Migration Report: cuda_first_repo
AMD Readiness Score
- Before deterministic fixes: 51/100
- After deterministic fixes: 100/100
| Category | Before | After |
|---|---|---|
| Code portability | 0 | 100 |
| Environment readiness | 0 | 100 |
| Serving readiness | 90 | 100 |
| Benchmark readiness | 65 | 100 |
| Deployment readiness | 100 | 100 |
Findings
| Severity | Category | Location | Finding | Suggested fix |
|---|---|---|---|---|
| high | Benchmark readiness | benchmarks/benchmark.py:6 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| high | Environment readiness | Dockerfile:1 | Dockerfile uses an NVIDIA CUDA base image. | Use vllm/vllm-openai-rocm:latest for vLLM serving or rocm/pytorch:latest for PyTorch workloads. |
| medium | Environment readiness | Dockerfile:8 | NVIDIA container environment variable found. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Code portability | infer.py:6 | torch.device is hardcoded to CUDA. | Use torch.device("cuda" if torch.cuda.is_available() else "cpu"); ROCm PyTorch reports AMD GPUs through torch.cuda. |
| high | Code portability | infer.py:11 | PyTorch tensor or module is moved with a hardcoded .cuda() call. | Replace .cuda() with .to(_rocmport_device) and define a runtime device abstraction. |
| high | Code portability | infer.py:12 | Tensor or module transfer hardcodes the CUDA device string. | Replace .to("cuda") with .to(_rocmport_device). |
| low | Code portability | infer.py:19 | CUDA availability check may confuse ROCm users because PyTorch ROCm still uses the torch.cuda namespace. | Keep the API call but document that it covers AMD GPUs under ROCm PyTorch. |
| medium | Environment readiness | scripts/serve_vllm.sh:4 | CUDA_VISIBLE_DEVICES is used for GPU selection. | Use HIP_VISIBLE_DEVICES or ROCR_VISIBLE_DEVICES for AMD GPU targeting. |
| high | Environment readiness | scripts/serve_vllm.sh:5 | NVIDIA-specific GPU inspection command found. | Use rocm-smi for AMD GPU monitoring and benchmark metadata collection. |
| low | Serving readiness | scripts/serve_vllm.sh:6 | vLLM serving command found without explicit ROCm container guidance. | Run vLLM inside vllm/vllm-openai-rocm with /dev/kfd, /dev/dri, host IPC, and video group access. |
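The serving and environment findings above amount to a container launch roughly like the following sketch. The image tag follows the suggested fix for Dockerfile:1, the device, IPC, and group flags are the ROCm container requirements named in the serve_vllm.sh finding, and the model id is a placeholder.

```shell
# ROCm containers need the AMD GPU device nodes, video group membership,
# and host IPC (shared memory for vLLM worker processes).
docker run --rm \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --ipc=host \
  -e HIP_VISIBLE_DEVICES=0 \
  -p 8000:8000 \
  vllm/vllm-openai-rocm:latest \
  --model <model-id>

# Benchmark metadata collection: use rocm-smi in place of nvidia-smi.
rocm-smi --showproductname --showuse
```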
Generated Artifacts
- `rocm_patch.diff` contains deterministic MVP fixes.
- `Dockerfile.rocm` uses the ROCm-enabled vLLM container.
- `amd_developer_cloud_runbook.md` documents the validation path.
- `benchmark_result.json` records the AMD benchmark schema and status.
Qwen Agent Notes
The Qwen endpoint was not configured, so this report reflects deterministic scanner output only.
Remaining Risks
- CUDA C++ kernels, custom Triton kernels, and CUDA-only binary dependencies require manual review.
- Uploaded repositories are not executed inside the Space; live validation belongs on AMD Developer Cloud.
- ROCm performance depends on model, batch shape, vLLM version, ROCm version, and GPU instance configuration.