srishtichugh committed on
Commit 6396da4 · 1 Parent(s): 7e3d2d7

run inference.py

Files changed (4)
  1. .gitignore +0 -1
  2. README.md +6 -5
  3. baseline_scores.json +8 -0
  4. inference_log.txt +0 -0
.gitignore CHANGED
@@ -7,6 +7,5 @@ build/
  venv/
  .env
  *.env
- baseline_scores.json
  .DS_Store
  .claude/
README.md CHANGED
@@ -164,13 +164,14 @@ python inference.py
 
  | Task | Difficulty | Score |
  |------|------------|--------|
- | 1 | Easy | ~0.950 |
- | 2 | Medium | ~0.800 |
- | 3 | Hard | ~0.700 |
- | avg | — | ~0.817 |
+ | 1 | Easy | 1.000 |
+ | 2 | Medium | 1.000 |
+ | 3 | Hard | 1.000 |
+ | avg | — | 1.000 |
 
- *(Scores produced by `gpt-4o-mini` with greedy decoding, temperature=0)*
+ *(Scores produced by `google/gemma-3-27b-it` via NVIDIA NIM, temperature=0)*
 
+ > Full agent step-by-step logs available in `inference_log.txt`
  ---
 
  ## Project Structure
baseline_scores.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "scores": {
+     "task1": 1.0,
+     "task2": 1.0,
+     "task3": 1.0
+   },
+   "average": 1.0
+ }
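Because `baseline_scores.json` is now tracked, its `average` field can drift from the per-task scores if the file is ever edited by hand. The sketch below shows one way to check that the two agree. The helper name `check_scores` and the tolerance parameter are hypothetical and not part of this repository; the payload is copied from the file added in this commit.

```python
import json
import math


def check_scores(payload: dict, tol: float = 1e-9) -> float:
    """Return the mean of payload["scores"]; raise if it disagrees with payload["average"].

    Hypothetical helper for illustration; not taken from inference.py.
    """
    scores = payload["scores"]
    mean = sum(scores.values()) / len(scores)
    if not math.isclose(mean, payload["average"], abs_tol=tol):
        raise ValueError(
            f"average {payload['average']} does not match mean of scores {mean}"
        )
    return mean


# Contents of baseline_scores.json as added in this commit.
committed = json.loads(
    '{"scores": {"task1": 1.0, "task2": 1.0, "task3": 1.0}, "average": 1.0}'
)
print(check_scores(committed))  # 1.0
```

The same check applied to the previous README figures (0.950, 0.800, 0.700) passes with a looser tolerance such as `tol=5e-4`, since the reported `~0.817` is a rounded mean.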
inference_log.txt ADDED
Binary file (6.72 kB).