update docsv2
- .env.example +12 -34
- README.md +15 -0
.env.example
CHANGED
@@ -1,41 +1,19 @@
-#
-
-
-
+# Primary: Qwen2.5-Coder-32B via HuggingFace (AMD hackathon Qwen prize eligible)
+USE_QWEN=true
+QWEN_API_KEY=hf_your_token_here
+QWEN_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
+QWEN_BASE_URL=https://api-inference.huggingface.co/models/Qwen/Qwen2.5-Coder-32B-Instruct/v1
 
-#
-
-# Get your key at: https://console.groq.com
-# ------------------------------------------------------------
-GROQ_API_KEY=your_groq_api_key_here
+# Fallback: Groq LLaMA (when Qwen unavailable)
+GROQ_API_KEY=your_groq_key
 GROQ_MODEL=llama-3.3-70b-versatile
 
-#
-# Option 2: Qwen via HuggingFace Inference API (free tier)
-# Activates Qwen/Qwen2.5-Coder-32B-Instruct — purpose-built
-# for code tasks. Qualifies for AMD hackathon Qwen bonus prize.
-# Get your key at: https://huggingface.co/settings/tokens
-# Set USE_QWEN=true to activate (overrides Groq).
-# ------------------------------------------------------------
-# USE_QWEN=true
-# QWEN_API_KEY=hf_your_huggingface_token_here
-# QWEN_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
-# QWEN_BASE_URL=https://api-inference.huggingface.co/models/Qwen/Qwen2.5-Coder-32B-Instruct/v1
-
-# ------------------------------------------------------------
-# Option 3: vLLM on AMD Developer Cloud (production / MI300X)
-# Spin up a vLLM server on your AMD instance, then set:
-# Set USE_VLLM=true to activate (overrides Groq and Qwen).
-# ------------------------------------------------------------
+# AMD DevCloud production (vLLM on MI300X)
 # USE_VLLM=true
-# VLLM_BASE_URL=http://your-amd-cloud
-# VLLM_API_KEY=your_vllm_key_here
+# VLLM_BASE_URL=http://your-amd-cloud:8000/v1
 # VLLM_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
 
-#
-
-# When true: real hipcc + rocprof run instead of demo data.
-# ------------------------------------------------------------
-ROCM_AVAILABLE=false
+# ROCm toolchain
+ROCM_AVAILABLE=true
 HIPCC_PATH=hipcc
-ROCPROF_PATH=rocprof
+ROCPROF_PATH=rocprof
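For reviewers, here is a minimal Python sketch of how an application might consume these variables. It is not taken from the repository: the helper names (`_flag`, `Backend`, `resolve_backend`, `toolchain_mode`) are hypothetical. The precedence (vLLM over Qwen over Groq) is an assumption based on the comments this commit removes ("overrides Groq", "overrides Groq and Qwen"). Dropping back to demo mode when `hipcc` or `rocprof` cannot be found on `PATH` is likewise an assumption, not documented behaviour.

```python
import shutil
from dataclasses import dataclass
from typing import Mapping, Optional


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat 'true' (any case) as on; anything else, including unset, as off."""
    return env.get(name, "").strip().lower() == "true"


@dataclass(frozen=True)
class Backend:
    name: str
    model: str
    base_url: Optional[str]  # None means the provider's default endpoint


def resolve_backend(env: Mapping[str, str]) -> Backend:
    """Pick one backend. Assumed precedence: vLLM, then Qwen, then Groq."""
    if _flag(env, "USE_VLLM"):
        return Backend("vllm", env["VLLM_MODEL"], env["VLLM_BASE_URL"])
    if _flag(env, "USE_QWEN"):
        return Backend("qwen", env["QWEN_MODEL"], env["QWEN_BASE_URL"])
    return Backend("groq", env["GROQ_MODEL"], None)


def toolchain_mode(env: Mapping[str, str]) -> str:
    """Return 'real' only if ROCM_AVAILABLE is on and both tools resolve.

    Assumption: a missing hipcc or rocprof drops back to 'demo' rather
    than raising an error.
    """
    if not _flag(env, "ROCM_AVAILABLE"):
        return "demo"
    tools = (env.get("HIPCC_PATH", "hipcc"), env.get("ROCPROF_PATH", "rocprof"))
    return "real" if all(shutil.which(tool) for tool in tools) else "demo"
```

Calling `resolve_backend(os.environ)` with the new defaults (`USE_QWEN=true`, `USE_VLLM` commented out) would select the Qwen endpoint under this sketch. Note that the new default `ROCM_AVAILABLE=true` requests real `hipcc` and `rocprof` runs, so a machine without ROCm would need either a fallback like the one sketched here or the flag set back to `false`.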
README.md
CHANGED
@@ -113,6 +113,21 @@ At least one failure path is documented with source, output, root cause, and fix
 
 This is intentional: credibility improves when the system's failure boundary is visible.
 
+## LLM Configuration
+
+| Agent | Model | Why |
+|-------|-------|-----|
+| Analyzer | Qwen2.5-Coder-32B | Purpose-built for code reasoning |
+| Translator | Qwen2.5-Coder-32B | Best-in-class CUDA/HIP translation |
+| Optimizer | Qwen2.5-Coder-32B | Hardware-aware optimization proposals |
+| Tester | llama-3.3-70b | Fast log parsing, cost-efficient fallback |
+
+**Primary**: Qwen2.5-Coder-32B via HuggingFace Inference API
+**Production**: Qwen2.5-Coder-32B via vLLM on AMD MI300X DevCloud
+**Estimated cost**: ~$0.003 per kernel migration (8K tokens avg)
+
+---
+
 ## Quick Start
 
 ### Option 1: Startup Script
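The agent table and cost figure added to the README can be sanity-checked with a short Python sketch. The names `AGENT_MODELS`, `model_for`, `implied_price_per_million` and `batch_cost` are hypothetical and do not come from the repository. The model IDs are the ones in `.env.example`, and "8K tokens" is assumed to mean 8,000 tokens.

```python
# Agent-to-model routing as described in the README table.
# Model IDs are taken from .env.example; the dict itself is illustrative.
AGENT_MODELS = {
    "analyzer": "Qwen/Qwen2.5-Coder-32B-Instruct",
    "translator": "Qwen/Qwen2.5-Coder-32B-Instruct",
    "optimizer": "Qwen/Qwen2.5-Coder-32B-Instruct",
    "tester": "llama-3.3-70b-versatile",
}


def model_for(agent: str) -> str:
    """Look up the model ID for an agent name (case-insensitive)."""
    return AGENT_MODELS[agent.lower()]


def implied_price_per_million(cost_per_migration_usd: float,
                              tokens_per_migration: int) -> float:
    """Blended USD price per one million tokens implied by a per-migration cost."""
    return cost_per_migration_usd / tokens_per_migration * 1_000_000


def batch_cost(num_kernels: int, cost_per_migration_usd: float = 0.003) -> float:
    """Estimated USD cost of migrating num_kernels kernels at the README's rate."""
    return num_kernels * cost_per_migration_usd
```

Under these assumptions, `implied_price_per_million(0.003, 8000)` works out to about $0.375 per million tokens, and `batch_cost(100)` to about $0.30 for one hundred kernels. The README does not say which provider's pricing the estimate is based on, so treat the figure as a rough guide.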