metaphi-ai committed on Mar 14

Commit 00de1cc (verified) · 1 parent: 22db46f

Update README.md

Files changed (1): README.md

@@ -7,4 +7,27 @@ sdk: static
 pinned: false
 ---
-Edit this `README.md` markdown file to author your organization card.
+# Metaphi
+  Enterprise evaluation environments for long-horizon AI agents. Real workflows, production data, expert ground truth.
+  ## LH-Bench
+  Scaling evaluation of long-horizon agents on enterprise tasks:
+  - **[figma](https://huggingface.co/datasets/metaphilabs/figma)** — Figma design → production React code (33 tasks)
+  - **[long-ground](https://huggingface.co/datasets/metaphilabs/long-ground)** — Source-grounded programmatic video synthesis
+   (45 tasks)
+  - **[credit-underwriting](https://huggingface.co/datasets/metaphilabs/credit-underwriting)** — Bank statement PDF
+  extraction + transaction categorization (8 cases, 20 PDFs)
+  ## About
+  Metaphi builds environments where AI agents complete real workflows in production enterprise software — generating training
+   signal and benchmarks for AI labs.
+  - Programmatic + human evaluation
+  - Expert-curated ground truth from production workflows
+  - Pairwise human preference data (RLHF signal)
+  Website: [metaphi.ai](https://metaphi.ai)