Space: lablab-ai-amd-developer-hackathon/ROCmPort-AI
tazwarrrr committed 5 days ago

Commit b521314

1 Parent(s): 90bb277

update docsv2

Files changed (2)

.env.example +12 -34
README.md +15 -0

.env.example CHANGED

@@ -1,41 +1,19 @@
-# ============================================================
-# ROCmPort AI — Environment Configuration
-# Copy this file to .env and fill in your values.
-# ============================================================
-# ------------------------------------------------------------
-# Option 1 (DEFAULT): Groq — LLaMA-3.3-70B, free, fast
-# Get your key at: https://console.groq.com
-# ------------------------------------------------------------
-GROQ_API_KEY=your_groq_api_key_here
 GROQ_MODEL=llama-3.3-70b-versatile
-# ------------------------------------------------------------
-# Option 2: Qwen via HuggingFace Inference API (free tier)
-# Activates Qwen/Qwen2.5-Coder-32B-Instruct — purpose-built
-# for code tasks. Qualifies for AMD hackathon Qwen bonus prize.
-# Get your key at: https://huggingface.co/settings/tokens
-# Set USE_QWEN=true to activate (overrides Groq).
-# ------------------------------------------------------------
-# USE_QWEN=true
-# QWEN_API_KEY=hf_your_huggingface_token_here
-# QWEN_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
-# QWEN_BASE_URL=https://api-inference.huggingface.co/models/Qwen/Qwen2.5-Coder-32B-Instruct/v1
-# ------------------------------------------------------------
-# Option 3: vLLM on AMD Developer Cloud (production / MI300X)
-# Spin up a vLLM server on your AMD instance, then set:
-# Set USE_VLLM=true to activate (overrides Groq and Qwen).
-# ------------------------------------------------------------
 # USE_VLLM=true
-# VLLM_BASE_URL=http://your-amd-cloud-instance:8000/v1
-# VLLM_API_KEY=your_vllm_key_here
 # VLLM_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
-# ------------------------------------------------------------
-# AMD ROCm toolchain (set true on AMD Developer Cloud)
-# When true: real hipcc + rocprof run instead of demo data.
-# ------------------------------------------------------------
-ROCM_AVAILABLE=false
 HIPCC_PATH=hipcc
-ROCPROF_PATH=rocprof

+# Primary: Qwen2.5-Coder-32B via HuggingFace (AMD hackathon Qwen prize eligible)
+USE_QWEN=true
+QWEN_API_KEY=hf_your_token_here
+QWEN_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
+QWEN_BASE_URL=https://api-inference.huggingface.co/models/Qwen/Qwen2.5-Coder-32B-Instruct/v1
+# Fallback: Groq LLaMA (when Qwen unavailable)
+GROQ_API_KEY=your_groq_key
 GROQ_MODEL=llama-3.3-70b-versatile
+# AMD DevCloud production (vLLM on MI300X)
 # USE_VLLM=true
+# VLLM_BASE_URL=http://your-amd-cloud:8000/v1
 # VLLM_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
+# ROCm toolchain
+ROCM_AVAILABLE=true
 HIPCC_PATH=hipcc
+ROCPROF_PATH=rocprof
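
Reviewer note: the new `.env.example` implies a provider order (vLLM when `USE_VLLM=true`, otherwise Qwen when `USE_QWEN=true`, otherwise Groq as fallback). The loader that reads these variables is not part of this commit, so the following is only a minimal sketch of how that selection could look. The function name `select_provider` and the returned dictionary shape are hypothetical, and the precedence is an assumption taken from the removed comments ("overrides Groq and Qwen"). Falling back to Groq when the Qwen key is missing is likewise a guess at what "when Qwen unavailable" means.

```python
import os


def select_provider(env=None):
    """Pick an LLM backend from .env-style variables (hypothetical helper).

    Assumed precedence, based on the comments in .env.example:
    USE_VLLM > USE_QWEN > Groq fallback.
    """
    env = os.environ if env is None else env

    def flag(name):
        # Treat only the literal string "true" (any case) as enabled.
        return str(env.get(name, "")).strip().lower() == "true"

    if flag("USE_VLLM"):
        return {
            "provider": "vllm",
            "base_url": env.get("VLLM_BASE_URL"),
            "model": env.get("VLLM_MODEL"),
        }
    if flag("USE_QWEN") and env.get("QWEN_API_KEY"):
        return {
            "provider": "qwen",
            "base_url": env.get("QWEN_BASE_URL"),
            "model": env.get("QWEN_MODEL"),
        }
    if env.get("GROQ_API_KEY"):
        return {
            "provider": "groq",
            "base_url": None,  # assumption: the Groq client uses its default endpoint
            "model": env.get("GROQ_MODEL"),
        }
    raise RuntimeError("No LLM provider configured: set a QWEN, GROQ, or VLLM key")
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to check without touching the real environment, for example `select_provider({"GROQ_API_KEY": "k", "GROQ_MODEL": "llama-3.3-70b-versatile"})`.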

README.md CHANGED

@@ -113,6 +113,21 @@ At least one failure path is documented with source, output, root cause, and fix
 This is intentional: credibility improves when the system's failure boundary is visible.
+## LLM Configuration
+| Agent | Model | Why |
+|-------|-------|-----|
+| Analyzer | Qwen2.5-Coder-32B | Purpose-built for code reasoning |
+| Translator | Qwen2.5-Coder-32B | Best-in-class CUDA/HIP translation |
+| Optimizer | Qwen2.5-Coder-32B | Hardware-aware optimization proposals |
+| Tester | llama-3.3-70b | Fast log parsing, cost-efficient fallback |
+**Primary**: Qwen2.5-Coder-32B via HuggingFace Inference API
+**Production**: Qwen2.5-Coder-32B via vLLM on AMD MI300X DevCloud
+**Estimated cost**: ~$0.003 per kernel migration (8K tokens avg)
+---
 ## Quick Start
 ### Option 1: Startup Script
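
Reviewer note: the README addition quotes "~$0.003 per kernel migration (8K tokens avg)" without naming the pricing it is based on. Taken at face value, those two numbers imply a blended rate of roughly $0.375 per million tokens. The sketch below only reproduces that arithmetic; the function names are hypothetical and no provider price list is assumed.

```python
def implied_rate_per_million(cost_usd, tokens):
    """Blended USD price per one million tokens implied by a per-run cost."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return cost_usd / tokens * 1_000_000


def batch_cost(cost_per_kernel_usd, kernels):
    """Total USD cost for migrating `kernels` kernels at a flat per-kernel cost."""
    return cost_per_kernel_usd * kernels


if __name__ == "__main__":
    # Figures from the README: ~$0.003 per migration, 8K tokens on average.
    rate = implied_rate_per_million(0.003, 8_000)
    print(f"implied rate: ${rate:.3f} per 1M tokens")
    print(f"1,000 kernels: ${batch_cost(0.003, 1_000):.2f}")
```

A rate derived this way is only as good as the 8K-token average, so it is worth rechecking against real token counts per agent before relying on it.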