---
library_name: rule-inducer
license: mit
pipeline_tag: tabular-classification
tags:
- rule-induction
- neuro-symbolic
- logic-programming
- ilp
- interpretability
- zero-shot
- pytorch
- model_hub_mixin
- pytorch_model_hub_mixin
arxiv: 2605.04916
papers:
- 2605.04916
---
# Neural Rule Inducer (NRI)

**Zero-shot induction of interpretable DNF rules from Boolean examples.**
NRI is a pretrained neural model that, given a small set of labelled Boolean examples for a new binary classification task, produces an interpretable disjunctive normal form (DNF) rule that explains the labels — without any task-specific fine-tuning. Instead of encoding literal identities, it represents literals using domain-agnostic statistical properties (class-conditional rates, entropy, co-occurrence), which generalize across variable identities and counts.
- 📄 Paper: A Foundation Model for Zero-Shot Logical Rule Induction (IJCAI 2026)
- 💻 Code: github.com/phuayj/neural-rule-inducer
- 🏷️ License: MIT
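The statistical literal representation mentioned above (class-conditional rates, entropy, co-occurrence) can be sketched with plain NumPy. This is an illustrative reconstruction of the *kind* of per-variable features such an encoder consumes; the function name and exact feature set are assumptions, not the model's actual inputs.

```python
import numpy as np

def literal_features(X, y):
    """Illustrative per-variable statistics of the kind NRI's encoder uses.

    X : (M, N) array in {0, 1}; y : (M,) array in {0, 1}.
    Returns an (N, 3) matrix: [P(x_i=1|y=1), P(x_i=1|y=0), H(x_i)].
    """
    pos, neg = X[y == 1], X[y == 0]
    p_pos = pos.mean(axis=0)  # class-conditional rate given the positive class
    p_neg = neg.mean(axis=0)  # class-conditional rate given the negative class
    p = X.mean(axis=0)
    # Bernoulli entropy per variable, with the convention 0 * log 0 = 0
    with np.errstate(divide="ignore", invalid="ignore"):
        h = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    h = np.nan_to_num(h)
    return np.stack([p_pos, p_neg, h], axis=1)
```

Because these features depend only on label statistics, not on which variable carries which index, they transfer across tasks with different variable identities and counts.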
## Model details
| Field | Value |
|---|---|
| Architecture | Statistical literal encoder + parallel slot-based set decoder + t-norm/t-conorm aggregator |
| Parameters | ≈8.92 M |
| Output | Interpretable DNF rule (T_max=8 clauses × K_max=4 literals each) |
| Training data | Synthetic Boolean DNF episodes (no real-world labels) |
| Training compute | 500 steps, batch size 8192, 1 × NVIDIA RTX 6000 Pro (96 GB), ≈2.5 minutes |
| Seed | 42 |
The model is pretrained, not fine-tuned. It performs rule induction zero-shot at inference time on previously unseen tasks.
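An induced DNF rule (up to 8 clauses of up to 4 literals each) is just a boolean expression and can be evaluated directly. The clause representation below — lists of `(variable_index, polarity)` pairs — is an illustrative sketch, not the library's actual data structure.

```python
import numpy as np

# Example rule: (x0 AND NOT x2) OR x1
rule = [
    [(0, True), (2, False)],
    [(1, True)],
]

def eval_dnf(rule, X):
    """Evaluate a DNF rule on each row of a {0, 1} matrix X."""
    out = np.zeros(X.shape[0], dtype=bool)
    for clause in rule:
        hit = np.ones(X.shape[0], dtype=bool)
        for var, pol in clause:
            # A positive literal requires x_var = 1, a negative one x_var = 0
            hit &= (X[:, var] == 1) if pol else (X[:, var] == 0)
        out |= hit  # clauses are combined disjunctively
    return out
```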
## Quickstart
```bash
pip install torch huggingface_hub

# Then install the rule_inducer package from the GitHub repo:
git clone https://github.com/phuayj/neural-rule-inducer.git
cd neural-rule-inducer
pip install -e .
```
```python
import torch
from rule_inducer import RuleInducer

model = RuleInducer.from_pretrained("phuayj/neural-rule-inducer")
model.eval()

# See evaluate_uci.py for the full episode-construction and inference loop.
```
For evaluating on tabular datasets in the UCI format (`X_bool.npy` of shape `[M, N]` with values in `{0, 1, NaN}` and `y.npy` of shape `[M]`):
```bash
# Download the legacy .pt checkpoint with full training state
# (optimizer, scheduler, RNG) from this repo:
python -c "from huggingface_hub import hf_hub_download; print(hf_hub_download('phuayj/neural-rule-inducer', 'checkpoint_best.pt'))"

# Use the original evaluation script:
python evaluate_uci.py --checkpoint <downloaded_path> --data-dir data/uci --all
```
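A dataset in the expected on-disk layout can be written with NumPy. This is a minimal sketch; `my-dataset` is a placeholder directory name, and the toy values are illustrative.

```python
import numpy as np
from pathlib import Path

# Write a toy dataset in the layout evaluate_uci.py expects:
# data/uci/<dataset>/X_bool.npy and data/uci/<dataset>/y.npy
out = Path("data/uci/my-dataset")
out.mkdir(parents=True, exist_ok=True)

X_bool = np.array([[1.0, 0.0, np.nan],   # NaN marks a missing value
                   [0.0, 1.0, 1.0]])
y = np.array([1, 0])

np.save(out / "X_bool.npy", X_bool)
np.save(out / "y.npy", y)
```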
## Reported performance
NRI is evaluated zero-shot on 14 UCI tabular benchmarks. Direct comparison between this checkpoint and the paper's Table 1 number is not apples-to-apples, because the two use different evaluation protocols:
| Setting | Eval protocol | Seeds | Mean acc. |
|---|---|---|---|
| This checkpoint (release reference) | 5-fold CV; 1 fold (≈20%) used as support, 4 folds (≈80%) as query; no subsampling | 1 (seed 42) | 75.60 % |
| Paper Table 1 | 5-fold CV; train portion subsampled to 5% before induction (≈4% of total as support, 20% as query) | 10 | 69.7 % ± 12.0 |
The released checkpoint has roughly 5× more support data per fold than the paper's protocol, which is the dominant reason its UCI accuracy is higher (+5.9 pp) than the paper's 69.7 %. The paper's protocol deliberately targets a low-data regime where zero-shot transfer is most valuable.
To reproduce the paper's protocol exactly, you need the (private) full evaluation harness with train_percentage=5.0 subsampling; the public evaluate_uci.py shipped with the GitHub repo uses the simpler 1-fold-as-support setup shown above. We plan to add --train-percentage support to the public script in a future release.
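Since the full harness is private, the subsampling step itself is simple to approximate. The sketch below shows the idea of reducing a CV train split to `train_percentage` percent before induction; the function name, seeding, and rounding are assumptions, so exact paper numbers should not be expected from it.

```python
import numpy as np

def subsample_support(train_idx, train_percentage=5.0, seed=0):
    """Subsample a CV train split to `train_percentage` percent of its rows,
    approximating the paper's low-data protocol (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = max(1, int(round(len(train_idx) * train_percentage / 100.0)))
    return rng.choice(train_idx, size=n, replace=False)
```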
Per-dataset accuracies for this checkpoint under the public protocol:
| Dataset | This checkpoint (1 seed, 20% support) | Paper Table 1 (10 seeds, 5%-subsampled support) |
|---|---|---|
| adult | 65.01 | 69.6 ± 4.4 |
| breast-cancer-wisconsin | 91.42 | 88.3 ± 0.3 |
| car | 71.07 | 51.2 ± 5.4 |
| credit-approval | 85.11 | 71.5 ± 7.3 |
| diabetes | 69.50 | 68.0 ± 2.4 |
| german-credit | 69.58 | 59.8 ± 3.8 |
| hepatitis | 66.78 | 55.9 ± 3.7 |
| ionosphere | 74.93 | 62.8 ± 3.9 |
| kr-vs-kp | 67.77 | 72.3 ± 5.3 |
| mushroom | 89.40 | 87.8 ± 3.9 |
| nursery | 74.56 | 71.3 ± 4.3 |
| spambase | 70.71 | 71.9 ± 0.7 |
| tic-tac-toe | 69.94 | 56.6 ± 2.5 |
| vote | 92.59 | 88.3 ± 1.8 |
| Mean | 75.60 | 69.7 ± 12.0 |
(Per-dataset paper numbers are mean ± std across 10 seeds; the ±12.0 on the bottom row is std across the 14 datasets, not across seeds.)
## Files in this repo
| File | Purpose |
|---|---|
| `pytorch_model.bin` | Inference weights (loaded by `RuleInducer.from_pretrained`) |
| `config.json` | Model architecture config |
| `checkpoint_best.pt` | Full training-state checkpoint (optimizer, scheduler, RNG, metrics) for resumption or audit |
## Limitations
- Inputs are assumed to be Boolean (`{0, 1}` plus optional `NaN` for missing values). Continuous features must be binarized first; the paper uses median thresholding.
- Designed for binary classification. Multi-class is handled by one-vs-rest at evaluation time, not at the rule level.
- The model induces DNF rules; tasks that require non-DNF representations (e.g. recursive predicates, arithmetic) are out of scope.
- Performance on a single dataset is sensitive to the support/query split and the binarization choice.
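Median thresholding, as mentioned above, is one simple binarization scheme. A minimal sketch, assuming per-column medians and strict `>` comparison (the paper's exact thresholding convention is not specified here):

```python
import numpy as np

def binarize_median(X):
    """Binarize continuous columns at their per-column median.

    Values strictly above the median map to 1, others to 0;
    NaNs are preserved as missing values.
    """
    med = np.nanmedian(X, axis=0)
    X_bool = (X > med).astype(float)
    X_bool[np.isnan(X)] = np.nan
    return X_bool
```

Note that the support/query split and this binarization choice both affect per-dataset accuracy, so results on any single dataset should be read with that variance in mind.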
## Citation
The paper has been accepted at IJCAI 2026; the proceedings reference is not yet finalized. Please use the arXiv entry for now and update once the IJCAI BibTeX is published.
```bibtex
@misc{Phua2026NRI,
  title         = {A Foundation Model for Zero-Shot Logical Rule Induction},
  author        = {Phua, Yin Jun},
  year          = {2026},
  eprint        = {2605.04916},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  note          = {To appear at IJCAI 2026; full proceedings citation TBD}
}
```