Update README.md
README.md
CHANGED
@@ -52,26 +52,6 @@ The model demonstrates strong capabilities in:
 - Limited to reasoning quality within biological contexts (not trained for creative writing or coding)
 
 
-## Evaluation
-
-Evaluation on [emre/TARA_Turkish_LLM_Benchmark](https://huggingface.co/datasets/emre/TARA_Turkish_LLM_Benchmark)
-
-
-| Category | BioGenesis-ToT | Qwen3-1.7B |
-| -------------------------------------------------------- | -------------- | ---------- |
-| Scientific Explanation and Hypothesis Evaluation (RAG) | **66.36** | 61.82 |
-| Ethical Dilemma Assessment | **55.45** | 47.27 |
-| Complex Scenario Analysis and Drawing Conclusions | **61.82** | 59.09 |
-| Constrained Creative Writing | **18.18** | 9.09 |
-| Logical Inference (Text-Based) | 49.09 | **68.18** |
-| Mathematical Reasoning | **42.73** | 37.27 |
-| Planning and Optimization Problems (Text-Based) | **52.73** | 25.45 |
-| Python Code Analysis and Debugging | **51.82** | 50.00 |
-| Generating SQL Query (From Schema/Meta) | **39.09** | 36.36 |
-| Cause-Effect Relationship in Historical Events (RAG) | **77.27** | 73.64 |
-| **Overall** | **51.45** | 46.82 |
-
-
 ## How to Get Started with the Model
 
 Use the code below to get started with the model.
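Note on the removed table: the **Overall** row appears to be the unweighted mean of the ten category scores. The README does not state this; it is an inference from the numbers. The short check below is a sketch under that assumption, and the variable names are illustrative only.

```python
from statistics import mean

# Category scores copied from the removed Evaluation table, in table order.
biogenesis_tot = [66.36, 55.45, 61.82, 18.18, 49.09,
                  42.73, 52.73, 51.82, 39.09, 77.27]
qwen3_1_7b = [61.82, 47.27, 59.09, 9.09, 68.18,
              37.27, 25.45, 50.00, 36.36, 73.64]

# Assumption: "Overall" is the plain (unweighted) mean over the ten categories.
overall_biogenesis = round(mean(biogenesis_tot), 2)
overall_qwen = round(mean(qwen3_1_7b), 2)

print(overall_biogenesis, overall_qwen)
```

Both values match the Overall row that this commit removes (51.45 and 46.82), which is consistent with an unweighted average across categories.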