tazwarrrr committed on
Commit b521314 · 1 Parent(s): 90bb277

update docsv2

Files changed (2)
  1. .env.example +12 -34
  2. README.md +15 -0
.env.example CHANGED
@@ -1,41 +1,19 @@
- # ============================================================
- # ROCmPort AI — Environment Configuration
- # Copy this file to .env and fill in your values.
- # ============================================================
 
- # ------------------------------------------------------------
- # Option 1 (DEFAULT): Groq — LLaMA-3.3-70B, free, fast
- # Get your key at: https://console.groq.com
- # ------------------------------------------------------------
- GROQ_API_KEY=your_groq_api_key_here
  GROQ_MODEL=llama-3.3-70b-versatile
 
- # ------------------------------------------------------------
- # Option 2: Qwen via HuggingFace Inference API (free tier)
- # Activates Qwen/Qwen2.5-Coder-32B-Instruct — purpose-built
- # for code tasks. Qualifies for AMD hackathon Qwen bonus prize.
- # Get your key at: https://huggingface.co/settings/tokens
- # Set USE_QWEN=true to activate (overrides Groq).
- # ------------------------------------------------------------
- # USE_QWEN=true
- # QWEN_API_KEY=hf_your_huggingface_token_here
- # QWEN_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
- # QWEN_BASE_URL=https://api-inference.huggingface.co/models/Qwen/Qwen2.5-Coder-32B-Instruct/v1
-
- # ------------------------------------------------------------
- # Option 3: vLLM on AMD Developer Cloud (production / MI300X)
- # Spin up a vLLM server on your AMD instance, then set:
- # Set USE_VLLM=true to activate (overrides Groq and Qwen).
- # ------------------------------------------------------------
  # USE_VLLM=true
- # VLLM_BASE_URL=http://your-amd-cloud-instance:8000/v1
- # VLLM_API_KEY=your_vllm_key_here
  # VLLM_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
 
- # ------------------------------------------------------------
- # AMD ROCm toolchain (set true on AMD Developer Cloud)
- # When true: real hipcc + rocprof run instead of demo data.
- # ------------------------------------------------------------
- ROCM_AVAILABLE=false
  HIPCC_PATH=hipcc
- ROCPROF_PATH=rocprof
+ # Primary: Qwen2.5-Coder-32B via HuggingFace (AMD hackathon Qwen prize eligible)
+ USE_QWEN=true
+ QWEN_API_KEY=hf_your_token_here
+ QWEN_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
+ QWEN_BASE_URL=https://api-inference.huggingface.co/models/Qwen/Qwen2.5-Coder-32B-Instruct/v1
 
+ # Fallback: Groq LLaMA (when Qwen unavailable)
+ GROQ_API_KEY=your_groq_key
  GROQ_MODEL=llama-3.3-70b-versatile
 
+ # AMD DevCloud production (vLLM on MI300X)
  # USE_VLLM=true
+ # VLLM_BASE_URL=http://your-amd-cloud:8000/v1
  # VLLM_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
 
+ # ROCm toolchain
+ ROCM_AVAILABLE=true
  HIPCC_PATH=hipcc
+ ROCPROF_PATH=rocprof
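
Review note: the comments removed from .env.example stated a precedence (USE_VLLM overrides Groq and Qwen; USE_QWEN overrides Groq), which the shortened file no longer spells out. A minimal sketch of that precedence follows. It is an illustration only, not the project's actual loader: `select_provider`, `_is_true`, and the returned dictionary layout are hypothetical, and only the variable names are taken from .env.example.

```python
from typing import Mapping, Optional


def _is_true(value: Optional[str]) -> bool:
    """Treat common truthy spellings as enabled (an assumption, not project behaviour)."""
    return (value or "").strip().lower() in {"1", "true", "yes"}


def select_provider(env: Mapping[str, str]) -> dict:
    """Hypothetical helper: pick an LLM backend from .env-style settings.

    Precedence follows the removed comments: vLLM first, then Qwen, then Groq.
    """
    if _is_true(env.get("USE_VLLM")):
        return {
            "provider": "vllm",
            "model": env.get("VLLM_MODEL"),
            "base_url": env.get("VLLM_BASE_URL"),
        }
    if _is_true(env.get("USE_QWEN")):
        return {
            "provider": "qwen",
            "model": env.get("QWEN_MODEL"),
            "base_url": env.get("QWEN_BASE_URL"),
        }
    # Groq is what remains when neither flag is set.
    return {
        "provider": "groq",
        "model": env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        "base_url": None,
    }
```

Calling `select_provider(os.environ)` after loading the .env file would return the backend settings under these assumptions. Note that this covers flag-based selection only; the runtime fallback the new comment describes ("when Qwen unavailable") would need separate error handling.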
README.md CHANGED
@@ -113,6 +113,21 @@ At least one failure path is documented with source, output, root cause, and fix
 
  This is intentional: credibility improves when the system's failure boundary is visible.
 
+ ## LLM Configuration
+
+ | Agent | Model | Why |
+ |-------|-------|-----|
+ | Analyzer | Qwen2.5-Coder-32B | Purpose-built for code reasoning |
+ | Translator | Qwen2.5-Coder-32B | Best-in-class CUDA/HIP translation |
+ | Optimizer | Qwen2.5-Coder-32B | Hardware-aware optimization proposals |
+ | Tester | llama-3.3-70b | Fast log parsing, cost-efficient fallback |
+
+ **Primary**: Qwen2.5-Coder-32B via HuggingFace Inference API
+ **Production**: Qwen2.5-Coder-32B via vLLM on AMD MI300X DevCloud
+ **Estimated cost**: ~$0.003 per kernel migration (8K tokens avg)
+
+ ---
+
  ## Quick Start
 
  ### Option 1: Startup Script
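
Review note: the added README section gives a per-agent model table and a cost estimate of about $0.003 per kernel migration at 8K tokens. Together these two numbers imply a blended rate of roughly $0.375 per million tokens. The sketch below works through that arithmetic. The names `AGENT_MODELS`, `implied_price_per_million_tokens`, and `estimated_cost` are hypothetical, and the derived rate is only what the README's figures imply, not a published price.

```python
# Agent-to-model mapping copied from the README table in this commit.
AGENT_MODELS = {
    "Analyzer": "Qwen2.5-Coder-32B",
    "Translator": "Qwen2.5-Coder-32B",
    "Optimizer": "Qwen2.5-Coder-32B",
    "Tester": "llama-3.3-70b",
}

# Figures quoted in the README; both are stated there as estimates.
COST_PER_MIGRATION_USD = 0.003
TOKENS_PER_MIGRATION = 8_000


def implied_price_per_million_tokens(cost_usd: float, tokens: int) -> float:
    """Blended USD price per 1M tokens implied by a cost/token pair."""
    return cost_usd / tokens * 1_000_000


def estimated_cost(num_kernels: int, cost_per_migration: float = COST_PER_MIGRATION_USD) -> float:
    """Linear extrapolation of the README estimate to a batch of kernels."""
    return num_kernels * cost_per_migration


if __name__ == "__main__":
    rate = implied_price_per_million_tokens(COST_PER_MIGRATION_USD, TOKENS_PER_MIGRATION)
    print(f"Implied rate: ${rate:.3f} per 1M tokens")
    print(f"1,000 kernels: ${estimated_cost(1000):.2f}")
```

The extrapolation assumes every migration uses the average token count; kernels that need several Optimizer or Tester rounds would cost more.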