LLM fine-tuning (DPO/RLHF with LoRA on multi-GPU H100s), agentic AI systems, Monte Carlo Tree Search for code generation, multi-agent orchestration, retrieval-augmented generation, and multimodal vision-language models.
Built CodeQ, an MCTS + DPO self-improving code debugging agent that achieves an 84% fix rate on DebugBench using Qwen2.5-Coder-7B-Instruct. Currently working on VisionTriage (QLoRA fine-tuning of Qwen2.5-VL-7B-Instruct for visual bug triage).
PhD candidate at NMSU. Published in IEEE TPAMI & IEEE/ACM TCBB (81+ citations). Two CRAN R packages.