Spaces:

karlexmarin
/

taf-agent

Running

App Files Files Community

taf-agent / registry-bootstrap /README.md

karlexmarin

v0.9.4: Launch-Flag Generator mode + Zenodo record update

12e81e6 1 day ago

preview code

raw

history blame contribute delete

5.74 kB

	# 🌐 TAF Agent — Public Registry

	> Community-curated archive of TAF (Thermodynamic Attention Framework) analyses
	> for transformer LLMs. Submitted by users of [TAF Agent](https://karlesmarin.github.io/tafagent).

	This repository stores no code. It exists purely as a public Issues board
	where users of the TAF Agent web tool submit their model analyses for the
	community to verify, refute, comment on, or reuse.

	---

	## How it works

	1. A user runs the [TAF Agent](https://karlesmarin.github.io/tafagent) on a model
	2. They click 📤 Submit to registry
	3. A new GitHub Issue opens with the analysis pre-filled in this repo
	4. The user reviews, optionally adds a comment, and clicks Submit
	5. The analysis becomes a permanent public record

	---

	## Browsing

	- 📂 [All issues](https://github.com/karlesmarin/tafagent-registry/issues) —
	every submission ever made
	- 🟢 [Verified](https://github.com/karlesmarin/tafagent-registry/issues?q=label%3Averified) —
	marked as independently verified
	- 🔴 [Refuted](https://github.com/karlesmarin/tafagent-registry/issues?q=label%3Arefuted) —
	empirical measurement contradicts the prediction
	- 🔍 Search by input hash to find existing analyses for the same config:
	e.g. `#8d29feb8` finds all analyses for the same model+T_eval+arch params

	---

	## The hash system (deduplication)

	Every TAF analysis is hashed from its canonical inputs. Identical inputs
	(same model, same T_eval, same flags) always produce the same 8-character
	hex hash. Different inputs produce different hashes.

	This means:
	- Searching `#a1b2c3d4` finds all submissions for the exact same config
	- Independent verification of an existing analysis = comment on the
	existing issue (not a new one)
	- Refutation = reply with empirical evidence, the maintainers will add
	the `refuted` label
	- No duplicate spam: contributors are nudged to search before submitting

	---

	## What submissions look like

	Each issue follows the title pattern:
	```
	[TAF Profile] Meta-Llama-3-8B @ T=32000 #8d29feb8
	[TAF X-2] Meta-Llama-3-8B → YES #a1b2c3d4
	[TAF Compare] X-2 × 3 models #c5d6e7f8
	```

	Body contains the verdict, key numbers, and a collapsible JSON of the full
	analysis chain. See any [recent issue](https://github.com/karlesmarin/tafagent-registry/issues)
	for examples.

	---

	## Contributing

	### To submit an analysis

	Just run the [TAF Agent](https://karlesmarin.github.io/tafagent) and click
	📤 Submit to registry. The form pre-fills everything.

	### To verify an existing analysis

	1. Find an issue (search by hash if you know one, or browse)
	2. Run the same analysis yourself
	3. If your result matches → comment "✅ Verified — [evidence link / setup details]"
	4. A maintainer will add the `verified` label

	### To refute a prediction

	1. Find an issue with a verdict you disagree with
	2. Run the actual measurement (not just TAF prediction) — e.g. for
	Long-Context (X-2), run NIAH evaluation on real GPU
	3. Comment with:
	- Your measurement value + std
	- Hardware + software setup (vLLM version, GPU, etc.)
	- Repro recipe (script or command)
	4. A maintainer will add the `refuted` label and link to your evidence

	Refutations are first-class citizens here. The TAF framework is designed to
	be falsifiable — if a prediction is wrong, we want to know.

	### To propose a new recipe

	Open an issue with title `[Proposal] X-NN — <name>` describing:
	- The practical question the recipe answers
	- The chain of formulas it would use
	- An example use case

	If the recipe is feasible, the maintainer adds it to the
	[TAF Agent codebase](https://github.com/karlesmarin/tafagent) and labels
	your issue `recipe-proposed`.

	### To add a model preset

	Open an issue with title `[Preset] <model-id>` listing:
	- `rope_theta`, `max_position_embeddings`, `num_attention_heads`,
	`num_key_value_heads`, `head_dim`, `num_hidden_layers`, `n_params`,
	`has_SWA`
	- A link to the model's HuggingFace page

	These get bundled into the next release of TAF Agent.

	---

	## Labels

	- `verified` — analysis independently confirmed by another user
	- `refuted` — empirical measurement contradicts TAF prediction
	- `recipe-proposed` — request for a new TAF recipe
	- `preset-proposed` — request for a new model preset
	- `discussion` — ongoing community discussion (no consensus yet)
	- `question` — clarification request
	- `frontier` — recently published model (< 1 month old) being evaluated

	---

	## What we DON'T accept

	- Closed/proprietary model analyses without permission to share publicly
	- API keys, tokens, or credentials of any kind
	- Commercial advertisements or unrelated content
	- Submissions without input hash in title (suggests not from the official tool)

	---

	## Code of conduct

	- Be technical and specific. Disagreements are about the math, not people.
	- Refutations require evidence. Opinions don't count, measurements do.
	- Cite your sources (paper sections, GitHub commits, vendor docs).
	- Assume good faith. Most "wrong" submissions are misunderstandings,
	not bad actors.

	---

	## License

	Submissions are released under [CC0 (public domain dedication)](https://creativecommons.org/publicdomain/zero/1.0/)
	unless otherwise noted by the contributor. The TAF Agent code itself is
	[Apache-2.0](https://github.com/karlesmarin/tafagent/blob/main/LICENSE).

	---

	## Related

	- 🔬 [TAF Agent web tool](https://karlesmarin.github.io/tafagent) — the diagnostic itself
	- 📦 [TAF Agent source](https://github.com/karlesmarin/tafagent) — open source
	- 📄 [Underlying paper](https://zenodo.org/records/20314038) — Marin 2026,
	Predicting How Transformers Attend

	---

	Maintained by Carles Marin and the TAF community.