Spaces:

karlexmarin
/

taf-agent

Running

App Files Files Community

taf-agent / registry-bootstrap /README.md

karlexmarin

v0.9.4: Launch-Flag Generator mode + Zenodo record update

12e81e6 1 day ago

preview code

raw

history blame contribute delete

5.74 kB

🌐 TAF Agent — Public Registry

Community-curated archive of TAF (Thermodynamic Attention Framework) analyses for transformer LLMs. Submitted by users of TAF Agent.

This repository stores no code. It exists purely as a public Issues board where users of the TAF Agent web tool submit their model analyses for the community to verify, refute, comment on, or reuse.

How it works

A user runs the TAF Agent on a model
They click 📤 Submit to registry
A new GitHub Issue opens with the analysis pre-filled in this repo
The user reviews, optionally adds a comment, and clicks Submit
The analysis becomes a permanent public record

Browsing

📂 All issues — every submission ever made
🟢 Verified — marked as independently verified
🔴 Refuted — empirical measurement contradicts the prediction
🔍 Search by input hash to find existing analyses for the same config: e.g. #8d29feb8 finds all analyses for the same model+T_eval+arch params

The hash system (deduplication)

Every TAF analysis is hashed from its canonical inputs. Identical inputs (same model, same T_eval, same flags) always produce the same 8-character hex hash. Different inputs produce different hashes.

This means:

Searching #a1b2c3d4 finds all submissions for the exact same config
Independent verification of an existing analysis = comment on the existing issue (not a new one)
Refutation = reply with empirical evidence, the maintainers will add the refuted label
No duplicate spam: contributors are nudged to search before submitting

What submissions look like

Each issue follows the title pattern:

[TAF Profile] Meta-Llama-3-8B @ T=32000  #8d29feb8
[TAF X-2] Meta-Llama-3-8B → YES  #a1b2c3d4
[TAF Compare] X-2 × 3 models  #c5d6e7f8

Body contains the verdict, key numbers, and a collapsible JSON of the full analysis chain. See any recent issue for examples.

Contributing

To submit an analysis

Just run the TAF Agent and click 📤 Submit to registry. The form pre-fills everything.

To verify an existing analysis

Find an issue (search by hash if you know one, or browse)
Run the same analysis yourself
If your result matches → comment "✅ Verified — [evidence link / setup details]"
A maintainer will add the verified label

To refute a prediction

Find an issue with a verdict you disagree with
Run the actual measurement (not just TAF prediction) — e.g. for Long-Context (X-2), run NIAH evaluation on real GPU
Comment with:
- Your measurement value + std
- Hardware + software setup (vLLM version, GPU, etc.)
- Repro recipe (script or command)
A maintainer will add the refuted label and link to your evidence

Refutations are first-class citizens here. The TAF framework is designed to be falsifiable — if a prediction is wrong, we want to know.

To propose a new recipe

Open an issue with title [Proposal] X-NN — <name> describing:

The practical question the recipe answers
The chain of formulas it would use
An example use case

If the recipe is feasible, the maintainer adds it to the TAF Agent codebase and labels your issue recipe-proposed.

To add a model preset

Open an issue with title [Preset] <model-id> listing:

rope_theta, max_position_embeddings, num_attention_heads, num_key_value_heads, head_dim, num_hidden_layers, n_params, has_SWA
A link to the model's HuggingFace page

These get bundled into the next release of TAF Agent.

Labels

verified — analysis independently confirmed by another user
refuted — empirical measurement contradicts TAF prediction
recipe-proposed — request for a new TAF recipe
preset-proposed — request for a new model preset
discussion — ongoing community discussion (no consensus yet)
question — clarification request
frontier — recently published model (< 1 month old) being evaluated

What we DON'T accept

Closed/proprietary model analyses without permission to share publicly
API keys, tokens, or credentials of any kind
Commercial advertisements or unrelated content
Submissions without input hash in title (suggests not from the official tool)

Code of conduct

Be technical and specific. Disagreements are about the math, not people.
Refutations require evidence. Opinions don't count, measurements do.
Cite your sources (paper sections, GitHub commits, vendor docs).
Assume good faith. Most "wrong" submissions are misunderstandings, not bad actors.

License

Submissions are released under CC0 (public domain dedication) unless otherwise noted by the contributor. The TAF Agent code itself is Apache-2.0.

🔬 TAF Agent web tool — the diagnostic itself
📦 TAF Agent source — open source
📄 Underlying paper — Marin 2026, Predicting How Transformers Attend

Maintained by Carles Marin and the TAF community.