File size: 6,276 Bytes
adacb55
a918698
 
 
 
adacb55
a918698
adacb55
 
c216483
adacb55
 
a918698
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
c216483
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
---
title: LREC 2026 LLM-as-Annotator
emoji: ✒️
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Annotate historical and low-resource languages with LLMs
---

# LREC 2026 — LLM-as-Annotator Workbench

A **corpus-centered** annotation app built around the LLM-as-annotator pipeline
described in the LREC 2026 tutorial and the companion LoResLM 2026 paper. The
text is the focal point; everything else (task schema, models, prompt, ICL
pool, exports) lives in popups behind toolbar pills.

## What it does

- Loads a corpus (paste, file, or sandbox example from the four historical languages of the paper).
- Annotates **token by token** with one or more LLMs (single inference or Mixture-of-Experts).
- Highlights MoE **disagreements** so the reviewer focuses on contested tokens first.
- Lets you correct any token in a focused popup with per-model votes, keyboard navigation, bulk operations, and a "re-ask one model" action.
- Bootstrap loop: corrected sentences feed back into the few-shot pool (filtered by `(language, schema_hash)` to avoid task contamination).
- Exports as TSV (PIE-baseline round-trip), JSON (schema-conformant), CoNLL-U (UD standard), or JSONL (fine-tune format).

## Companion paper

Vidal-Gorène, C., Kindt, B., & Cafiero, F. (2026). *Under-resourced studies of
under-resourced languages: lemmatization and POS-tagging with LLM annotators
for historical Armenian, Georgian, Greek and Syriac.* **LoResLM 2026**.
[https://aclanthology.org/2026.loreslm-1.28/](https://aclanthology.org/2026.loreslm-1.28/)

Tutorial repo: [floriancafiero/lrec2026-llm-as-annotator-tutorial](https://github.com/floriancafiero/lrec2026-llm-as-annotator-tutorial)

## Stack

- **Backend**: FastAPI + httpx (async OpenRouter client).
- **Frontend**: single static HTML page + Alpine.js (15 KB, CDN) + Tailwind CSS (CDN). No build step.

## Run locally

```bash
cd app
pip install -r requirements.txt
python app.py          # or:  uvicorn app:app --reload --port 7860
# open http://127.0.0.1:7860
```

The app expects the two sibling repos at:

```
LREC-tutorial/
├── code/
│   ├── EACL2026-historical-languages/   # sandbox corpora + tagsets
│   └── lrec2026-llm-as-annotator-tutorial/  # JSON schema + system prompts
└── app/                                 # this directory
```

## Workflow

1. **Sidebar → quick start** — click an example corpus (Ancient Greek, Old Armenian, Syriac). The toolbar updates the task, language, and models.
2. **Top bar → 🔑 OpenRouter** — paste your API key (kept in this browser session only).
3. **Top bar → ▶ Annotate all** — runs every model in parallel (Mixture-of-Experts if 2+ models). Tokens are colored by status: indigo = consensus, amber ⚠ = disagreement.
4. **Click any token** → popup with editable fields, per-model votes, keyboard navigation, "adopt from <model>" and "re-ask one model" shortcuts.
5. **📥 to ICL** on a sentence — pushes the corrected annotation into the few-shot pool. The next run re-injects it.
6. **Top bar → export** — TSV / JSON / CoNLL-U / JSONL.

### Keyboard shortcuts

| Key | Action |
|---|---|
| `j` / `k` | next / previous token |
| `e` or `↵` | edit focused token |
| `1``9` | (in editor) assign the i-th visible tag |
| `x` | toggle selection of focused token |
| `r` | re-annotate the focused sentence |
| `↵` | save edit & advance to next disagreement |
| `Esc` | close popup / clear selection |
| `shift+click` | multi-select tokens (then "Apply tag…") |
| `right-click` | per-token context menu |

## Deploy on HuggingFace Spaces

This `app/` directory is **self-contained**: the tagsets, schemas, system
prompts, cheatsheet and a slice of the four sandbox corpora are vendored under
[data/](data/) (≈ 900 KB). You do not need to push the parent repo or use git
submodules.

### One-shot deploy

```bash
cd app
# Create a new Space (Docker SDK) at https://huggingface.co/new-space
# Then push this directory as the Space's root:
git init && git add . && git commit -m "init"
git remote add space https://huggingface.co/spaces/<your-user>/<space-name>
git push --force space main
```

The Space builds from `Dockerfile`, boots `uvicorn` on port 7860, and serves
the SPA at `/`.

### ⚠ Single-user demo

`SESSION` is module-global. **The Space serves one user at a time** — if two
people open it simultaneously, they share the same corpus, the same selected
models, and (briefly) the same API key. For the LREC tutorial we recommend:

> 🦆 **Each attendee clicks the "⋮ → Duplicate this Space" button** in the
> top-right of the Space page. They get a free private clone, isolated state,
> their own API key in their own browser.

This is the simplest way to fan out the tutorial. Document this prominently on
the Space's README.

### Optional: ship a default OpenRouter key

If you want attendees to start without entering a key (e.g., a shared demo
key with a rate limit), set a Space **Secret** named `OPENROUTER_API_KEY`.
The backend reads it at startup; users can still override it from the UI.

API keys entered through the UI are **never persisted** — they live only in
the in-memory `SESSION` dict and are forgotten on restart.

## File map

| File | Role |
|---|---|
| [app.py](app.py) | FastAPI app: state + REST endpoints |
| [static/index.html](static/index.html) | SPA layout: toolbar, sidebar, corpus panel, modals |
| [static/app.js](static/app.js) | Alpine.js state + handlers + keyboard shortcuts |
| [static/styles.css](static/styles.css) | Token chips, modals, polish |
| [provider.py](provider.py) | OpenRouter async client (JSON-Schema response_format + retry) |
| [moe.py](moe.py) | Pure `aggregate()` — vote / LCS / min / priority |
| [schemas.py](schemas.py) | `AnnotationSchema` + 8 presets |
| [prompts.py](prompts.py) | Templates from tutorial repo + `ICLPool` |
| [io_utils.py](io_utils.py) | Tokenizer + TSV / JSON / CoNLL-U / JSONL I/O |
| [tutorial.py](tutorial.py) | 3 guided examples prefilling the corpus |
| [paths.py](paths.py) | Resolves sibling repos (read-only) |

## License

MIT for this app code. Sandbox data and prompt templates remain under their
upstream licenses (see the two `code/` repositories).