Commit 6396da4
Parent(s): 7e3d2d7
run inference.py
Files changed:
- .gitignore +0 -1
- README.md +6 -5
- baseline_scores.json +8 -0
- inference_log.txt +0 -0
.gitignore
CHANGED
@@ -7,6 +7,5 @@ build/
 venv/
 .env
 *.env
-baseline_scores.json
 .DS_Store
 .claude/
README.md
CHANGED
@@ -164,13 +164,14 @@ python inference.py
 
 | Task | Difficulty | Score |
 |------|------------|--------|
-| 1 | Easy |
-| 2 | Medium |
-| 3 | Hard |
-| avg | — |
+| 1 | Easy | 1.000 |
+| 2 | Medium | 1.000 |
+| 3 | Hard | 1.000 |
+| avg | — | 1.000 |
 
-*(Scores produced by `
+*(Scores produced by `google/gemma-3-27b-it` via NVIDIA NIM, temperature=0)*
 
+> Full agent step-by-step logs available in `inference_log.txt`
 ---
 
 ## Project Structure
baseline_scores.json
ADDED
@@ -0,0 +1,8 @@
+{
+  "scores": {
+    "task1": 1.0,
+    "task2": 1.0,
+    "task3": 1.0
+  },
+  "average": 1.0
+}
inference_log.txt
ADDED
Binary file (6.72 kB).