Replace tilde-approximation with Unicode ≈ to fix accidental GFM strikethrough rendering
README.md
```diff
@@ -32,10 +32,10 @@ NRI is a pretrained neural model that, given a small set of labelled Boolean exa
 | Field | Value |
 | --- | --- |
 | Architecture | Statistical literal encoder + parallel slot-based set decoder + t-norm/t-conorm aggregator |
-| Parameters | ~8.92 M |
+| Parameters | ≈8.92 M |
 | Output | Interpretable DNF rule (T_max=8 clauses × K_max=4 literals each) |
 | Training data | Synthetic Boolean DNF episodes (no real-world labels) |
-| Training compute | 500 steps, batch size 8192, 1 × NVIDIA RTX 6000 Pro (96 GB), ~2.5 minutes |
+| Training compute | 500 steps, batch size 8192, 1 × NVIDIA RTX 6000 Pro (96 GB), ≈2.5 minutes |
 | Seed | 42 |
 
 The model is **pretrained**, not fine-tuned. It performs rule induction zero-shot at inference time on previously unseen tasks.
@@ -75,7 +75,7 @@ NRI is evaluated zero-shot on 14 UCI tabular benchmarks. **Direct comparison bet
 
 | Setting | Eval protocol | Seeds | Mean acc. |
 | --- | --- | --- | --- |
-| **This checkpoint (release reference)** | 5-fold CV; **1 fold (~20%) used as support, 4 folds (~80%) as query**; no subsampling | 1 (seed 42) | **75.60 %** |
+| **This checkpoint (release reference)** | 5-fold CV; **1 fold (≈20%) used as support, 4 folds (≈80%) as query**; no subsampling | 1 (seed 42) | **75.60 %** |
 | **Paper Table 1** | 5-fold CV; **train portion subsampled to 5% before induction** (≈4% of total as support, 20% as query) | 10 | 69.7 % ± 12.0 |
 
 The released checkpoint has **roughly 5× more support data per fold** than the paper's protocol, which is the dominant reason its UCI accuracy is higher (+5.9 pp) than the paper's 69.7 %. The paper's protocol deliberately targets a low-data regime where zero-shot transfer is most valuable.
```
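The bug this commit fixes comes from GitHub-Flavored Markdown's strikethrough extension: a matching pair of single tildes on one line (as in `(~20%) ... (~80%)`) is parsed as strikethrough markup, so the text between the tildes renders struck through or gets dropped by downstream extractors. A minimal sketch of the kind of substitution the commit performs — this is an illustrative helper, not the actual script used for the commit, and it only rewrites a tilde that directly precedes a digit and is not part of a `~~` pair:

```python
import re


def detilde(markdown: str) -> str:
    """Replace '~' used as 'approximately' with the Unicode '≈' sign.

    Hypothetical helper: rewrites a lone tilde immediately followed by a
    digit (e.g. '~20%' or '~8.92 M'), skipping '~~' strikethrough pairs.
    """
    return re.sub(r"(?<!~)~(?=\d)", "≈", markdown)


row = "| **This checkpoint** | 1 fold (~20%) support, 4 folds (~80%) query |"
print(detilde(row))
# -> | **This checkpoint** | 1 fold (≈20%) support, 4 folds (≈80%) query |
```

Using `≈` also reads better in rendered tables, since GFM has no escape-free way to keep a bare `~` pair literal inside a table cell.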
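The figures quoted in the diff's closing paragraph can be checked directly from the tables: one fold of 5-fold CV is 20% support, while the paper's protocol subsamples the ~80% train portion to 5%, i.e. about 4% of the total. A quick sanity check (no new numbers, only the ones already in the tables):

```python
# Support fraction per fold, in percent of the dataset.
release_support = 20  # 1 of 5 folds used as support
paper_support = 4     # 5% subsample of the ~80% train portion

print(release_support / paper_support)  # -> 5.0, the "roughly 5x" claim

# Accuracy gap between the two protocols, in percentage points.
release_acc = 75.60
paper_acc = 69.7
print(round(release_acc - paper_acc, 1))  # -> 5.9, the "+5.9 pp" claim
```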