Ghost-Coder: Qwen2.5-32B CUDA-to-HIP Translator

Ghost-Coder is a specialized LLM designed to bridge the gap between NVIDIA's proprietary CUDA and AMD's open ROCm ecosystem. This model is a fine-tuned version of Qwen2.5-Coder-32B-Instruct, optimized specifically for high-fidelity translation of GPU kernels.

Developed for the Lablab.ai AMD Developer Hackathon (2026).

πŸš€ Model Highlights

  • Specialization: Maps complex CUDA logic (memory management, warp primitives, kernels) to functional AMD HIP code.
  • Hardware-Aware: Fine-tuned specifically for execution on AMD Instinct hardware.
  • Agent-Ready: Designed to be the "brain" of an autonomous, self-healing compiler loop.
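
Much of the CUDA-to-HIP surface is mechanical API renaming, which the model learns alongside the harder cases (warp-size differences, launch tuning, library calls). As a point of reference, a minimal regex-based rename pass in the spirit of hipify-perl might look like this — the mapping table is a small illustrative subset, not the model's internals:

```python
import re

# Illustrative subset of CUDA -> HIP runtime API renames.
# Real translation (and this model) must also handle warp-width
# differences, occupancy tuning, and vendor library calls.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
}

# Longest names first so prefixes never shadow longer identifiers.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(CUDA_TO_HIP, key=len, reverse=True)) + r")\b"
)

def naive_hipify(cuda_src: str) -> str:
    """Textual rename pass, similar in spirit to hipify-perl."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], cuda_src)

print(naive_hipify("cudaMalloc(&d_x, n); cudaDeviceSynchronize();"))
# -> hipMalloc(&d_x, n); hipDeviceSynchronize();
```

Textual passes like this break down on semantic differences (e.g. 64-wide wavefronts on CDNA hardware versus 32-wide warps), which is exactly the gap an LLM translator targets.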

πŸ› οΈ Training Details

The model was fine-tuned with the Unsloth framework using a short, high-throughput configuration aimed at generalization rather than memorization of the training pairs.

  • Hardware: AMD Instinct MI300X (192GB VRAM)
  • Base Model: Qwen2.5-Coder-32B-Instruct (4-bit QLoRA)
  • Dataset: Curated subset of CASS (CUDA-to-HIP mapping pairs)
  • Context Length: 4096
  • Training Steps: 200
  • Global Batch Size: 64
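
A minimal Unsloth configuration matching the numbers above might look like the following sketch. The LoRA rank, target modules, batch/accumulation split, and checkpoint name are illustrative assumptions — the card only states the base model, 4-bit QLoRA, context length 4096, 200 steps, and a global batch of 64:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# 4-bit QLoRA base, per the card; the exact checkpoint name is an assumption.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-Coder-32B-Instruct-bnb-4bit",
    max_seq_length=4096,   # context length from the card
    load_in_4bit=True,
)

# LoRA adapter; r and target_modules are common defaults, not from the card.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # placeholder: curated CASS CUDA->HIP pairs
    args=TrainingArguments(
        max_steps=200,                  # from the card
        per_device_train_batch_size=8,  # 8 x 8 accumulation = global batch 64
        gradient_accumulation_steps=8,  # (the 8x8 split is an assumption)
        output_dir="ghost-coder-lora",
    ),
)
```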

🧠 Intended Use

Ghost-Coder is intended for use in the Ghost-Harness, an agentic workflow that:

  1. Translates CUDA source code to HIP.
  2. Attempts compilation via hipcc.
  3. Self-corrects based on compiler error feedback.
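
The control flow of these three steps can be sketched as below. Here `translate` stands in for a call to Ghost-Coder and `compile_hip` for a shell-out to `hipcc` (e.g. via `subprocess.run(["hipcc", ...])`); both are injectable so the loop can be shown without ROCm installed, and the attempt cap is a hypothetical parameter:

```python
from typing import Callable, Tuple

def ghost_harness(
    translate: Callable[[str, str], str],            # (cuda_src, errors) -> hip_src
    compile_hip: Callable[[str], Tuple[bool, str]],  # hip_src -> (ok, errors)
    cuda_src: str,
    max_attempts: int = 3,
) -> str:
    """Translate CUDA to HIP, retrying with compiler feedback on failure."""
    errors = ""
    for _ in range(max_attempts):
        # On retries, the previous hipcc diagnostics are fed back to the model.
        hip_src = translate(cuda_src, errors)
        ok, errors = compile_hip(hip_src)
        if ok:
            return hip_src
    raise RuntimeError(f"translation failed after {max_attempts} attempts:\n{errors}")
```

The key design point is that the compiler, not the model, is the judge: a translation only leaves the loop once `hipcc` accepts it.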

πŸ“ Acknowledgements

Special thanks to AMD and Lablab.ai for providing the compute resources and the platform to build across the AI stack.


Created by Talha

