pawlaszc committed on
Commit 3bd524d · verified · 1 parent: 4b4ad8b

Update README.md

Files changed (1)
  1. README.md (+4 −4)
README.md CHANGED
@@ -78,7 +78,7 @@ analysis tool.
 |---|---|
 | **Overall Accuracy** | **93.0%** (93/100) |
 | 95% CI (Wilson) | [86.3%, 96.6%] |
-| Executable Queries | 92/100 |
+| Executable Queries | 94/100 |
 | GPT-4o Accuracy | 95.0% (gap: 4 pp, p ≈ 0.39) |
 | Base Model (no fine-tuning) | 35.0% |
 | Improvement over base | +56 pp |
@@ -130,7 +130,7 @@ configuration without app name is recommended for general use.
 |---|---|---|---|
 | Base model (no fine-tuning) | — | 35.0% | — |
 | Fine-tuned, no augmentation | — | 68.0% | +33 pp |
-| + Data augmentation (3.4×) | — | 74.0% | +6 pp |
+| + Data augmentation (2.4×) | — | 74.0% | +6 pp |
 | + Extended training (7 epochs) | 0.3617 | 92.0% | +10 pp |
 | + Post-processing pipeline | 0.3617 | 87.0% | +3 pp |
 | + Execution feedback | 0.3617 | 90.0% | +3 pp |
@@ -246,7 +246,7 @@ class ForensicSQLGenerator:
             "SQLite Query:\n"
         )
         inputs = self.tokenizer(
-            prompt, return_tensors="pt", truncation=True, max_length=2048
+            prompt, return_tensors="pt", truncation=True, max_length=4096
         )
         inputs = {k: v.to(self.model.device) for k, v in inputs.items()}
         input_length = inputs["input_ids"].shape[1]
@@ -318,7 +318,7 @@ ollama run forensic-sql
 | Learning rate | 2e-5 (peak) |
 | LR scheduler | Cosine with warmup |
 | Batch size | 1 + gradient accumulation 4 |
-| Max sequence length | 2048 |
+| Max sequence length | 4096 |
 | Optimizer | AdamW |
 | Hardware | Apple M-series, 16 GB unified memory |
 | Training time | ~17.6 hours |
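The headline numbers in the first hunk can be sanity-checked: a Wilson score interval for 93 successes out of 100 at z = 1.96 reproduces the quoted [86.3%, 96.6%]. A minimal stdlib sketch (the function name is illustrative, not from the README's code):

```python
import math

def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (95% at z = 1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

lo, hi = wilson_ci(93, 100)
print(f"[{lo:.1%}, {hi:.1%}]")  # matches the README's [86.3%, 96.6%]
```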
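The third hunk raises the tokenizer's truncation cap from 2048 to 4096 tokens. As a framework-free stand-in for what `truncation=True, max_length=N` does (this helper is illustrative, not the real Hugging Face API), truncation simply caps the token-id sequence at `max_length`:

```python
def truncate_ids(token_ids, truncation=True, max_length=4096):
    # Mirrors tokenizer(..., truncation=True, max_length=N) semantics:
    # keep at most max_length tokens; without truncation, keep them all.
    if truncation and len(token_ids) > max_length:
        return token_ids[:max_length]
    return list(token_ids)

long_prompt = list(range(5000))  # pretend this is a 5000-token prompt
print(len(truncate_ids(long_prompt)))                   # 4096 under the new cap
print(len(truncate_ids(long_prompt, max_length=2048)))  # 2048 under the old cap
```

Doubling the cap means long schema-plus-question prompts lose less context before generation, at the cost of more memory per sequence.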
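In the training table, `Batch size | 1 + gradient accumulation 4` means gradients from four micro-batches of one example each are accumulated before every optimizer step, for an effective batch size of 4. A small sketch of the step bookkeeping (the function is hypothetical, just counting optimizer steps per epoch):

```python
def optimizer_steps(num_examples, micro_batch=1, accum=4):
    # One optimizer step per `accum` micro-batches; a trailing partial
    # accumulation window still triggers a final step.
    micro_batches = -(-num_examples // micro_batch)  # ceil division
    return -(-micro_batches // accum)

print(optimizer_steps(100))  # 100 micro-batches -> 25 optimizer steps
```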