# Ghost-Coder: Qwen2.5-32B CUDA-to-HIP Translator
Ghost-Coder is a specialized LLM designed to bridge the gap between NVIDIA's proprietary CUDA and AMD's open ROCm ecosystem. This model is a fine-tuned version of Qwen2.5-Coder-32B-Instruct, optimized specifically for high-fidelity translation of GPU kernels.
Developed for the Lablab.ai AMD Developer Hackathon (2026).
## Model Highlights
- Specialization: Maps complex CUDA logic (memory management, warp primitives, kernels) to functional AMD HIP code.
- Hardware-Aware: Fine-tuned specifically for execution on AMD Instinct hardware.
- Agent-Ready: Designed to be the "brain" of an autonomous, self-healing compiler loop.
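To illustrate the kind of mapping involved, below is a minimal, hand-written sketch (not model output) of a CUDA kernel and its HIP counterpart. The kernel name and launch parameters are illustrative; the key pattern is that device code is largely unchanged while host-side API calls are renamed.

```cpp
// CUDA: vector add kernel; device code like this is typically portable as-is
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}
// CUDA host side:
//   cudaMalloc(&d_a, bytes);
//   cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
//   vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

// HIP: same kernel body; host API calls are renamed cuda* -> hip*
//   #include <hip/hip_runtime.h>
//   hipMalloc(&d_a, bytes);
//   hipMemcpy(d_a, h_a, bytes, hipMemcpyHostToDevice);
//   hipLaunchKernelGGL(vecAdd, dim3(blocks), dim3(threads), 0, 0,
//                      d_a, d_b, d_c, n);
```

Simple renames like these are mechanical; the harder cases the model targets are warp-size differences (32 vs. 64 lanes) and warp primitives, where a one-to-one substitution is not sufficient.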
## Training Details
The model was fine-tuned using the Unsloth framework in a short, high-throughput sprint configuration, trading long training runs for rapid iteration.
- Hardware: AMD Instinct MI300X (192GB VRAM)
- Base Model: Qwen2.5-Coder-32B-Instruct (4-bit QLoRA)
- Dataset: Curated subset of CASS (CUDA-to-HIP mapping pairs)
- Context Length: 4096
- Training Steps: 200
- Global Batch Size: 64
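The hyperparameters above might translate into an Unsloth setup roughly like the following non-runnable configuration sketch. The LoRA rank, alpha, target modules, and batch-size split are assumptions not stated on this card; `cass_subset` is a hypothetical placeholder for the curated CASS data.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Base model in 4-bit for QLoRA fine-tuning, 4096-token context
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-Coder-32B-Instruct",
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,          # assumed LoRA rank
    lora_alpha=16, # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=cass_subset,  # hypothetical: curated CASS CUDA-to-HIP pairs
    args=TrainingArguments(
        max_steps=200,
        per_device_train_batch_size=8,
        gradient_accumulation_steps=8,  # 8 x 8 = global batch size 64
        output_dir="ghost-coder",
    ),
)
trainer.train()
```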
## Intended Use
Ghost-Coder is intended for use in the Ghost-Harness, an agentic workflow that:
- Translates CUDA source code to HIP.
- Attempts compilation via `hipcc`.
- Self-corrects based on compiler error feedback.
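The translate-compile-repair loop above can be sketched as follows. Here `translate` and `compile_hip` are hypothetical stand-ins for the model call and a `hipcc` invocation, passed in as callables so the control flow can be exercised without a GPU; in the real harness the compile step would shell out to `hipcc` and capture stderr.

```python
from typing import Callable, Optional, Tuple

def self_healing_translate(
    cuda_src: str,
    translate: Callable[[str, Optional[str]], str],  # model call: (source, last_error) -> HIP code
    compile_hip: Callable[[str], Tuple[bool, str]],  # compiler wrapper: hip_code -> (ok, error_log)
    max_attempts: int = 3,
) -> Optional[str]:
    """Translate CUDA to HIP, feeding compiler errors back until it compiles."""
    error = None
    for _ in range(max_attempts):
        # On retries, the last compiler error is included in the prompt
        hip_src = translate(cuda_src, error)
        ok, error = compile_hip(hip_src)
        if ok:
            return hip_src
    return None  # give up after max_attempts
```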
## Acknowledgements
Special thanks to AMD and Lablab.ai for providing the compute resources and the platform to build across the AI stack.
Created by Talha
## Model Tree
`muhammadtlha944/Ghost-Coder-Qwen2.5-32B-LoRA` is a LoRA adapter in the following lineage:
- Qwen/Qwen2.5-32B (base model)
- Qwen/Qwen2.5-Coder-32B (fine-tuned from the above)
- Qwen/Qwen2.5-Coder-32B-Instruct (fine-tuned from the above; direct base of this adapter)