Commit a0f27fa by amarck · 0 parent(s)

Initial commit: Research Intelligence System


Self-hosted research paper triage with AI scoring, preference learning,
and a setup wizard for first-time configuration.

Files changed (47)
  1. .dockerignore +17 -0
  2. .env.example +5 -0
  3. .gitignore +32 -0
  4. CLAUDE.md +73 -0
  5. Dockerfile +28 -0
  6. README.md +128 -0
  7. data/seed_papers.json +182 -0
  8. docker-compose.yml +27 -0
  9. entrypoint.sh +7 -0
  10. requirements.txt +17 -0
  11. scripts/backup-db.sh +35 -0
  12. src/__init__.py +0 -0
  13. src/config.py +505 -0
  14. src/db.py +870 -0
  15. src/pipelines/__init__.py +0 -0
  16. src/pipelines/aiml.py +327 -0
  17. src/pipelines/events.py +196 -0
  18. src/pipelines/github.py +194 -0
  19. src/pipelines/security.py +252 -0
  20. src/pipelines/semantic_scholar.py +294 -0
  21. src/preferences.py +343 -0
  22. src/scheduler.py +66 -0
  23. src/scoring.py +186 -0
  24. src/web/__init__.py +0 -0
  25. src/web/app.py +983 -0
  26. src/web/static/favicon-192.png +0 -0
  27. src/web/static/favicon-512.png +0 -0
  28. src/web/static/favicon.svg +44 -0
  29. src/web/static/htmx.min.js +1 -0
  30. src/web/static/manifest.json +32 -0
  31. src/web/static/style.css +1701 -0
  32. src/web/static/sw.js +79 -0
  33. src/web/templates/base.html +60 -0
  34. src/web/templates/dashboard.html +135 -0
  35. src/web/templates/events.html +91 -0
  36. src/web/templates/github.html +43 -0
  37. src/web/templates/paper_detail.html +205 -0
  38. src/web/templates/papers.html +49 -0
  39. src/web/templates/partials/github_results.html +83 -0
  40. src/web/templates/partials/paper_card.html +29 -0
  41. src/web/templates/partials/paper_row.html +41 -0
  42. src/web/templates/partials/papers_results.html +55 -0
  43. src/web/templates/partials/signal_buttons.html +17 -0
  44. src/web/templates/preferences.html +85 -0
  45. src/web/templates/seed_preferences.html +178 -0
  46. src/web/templates/setup.html +596 -0
  47. src/web/templates/weeks.html +83 -0
.dockerignore ADDED
@@ -0,0 +1,17 @@
+ .git
+ .gitignore
+ .env
+ .claude
+ __pycache__
+ *.pyc
+ *.pyo
+ data/*.db
+ data/*.db-wal
+ data/*.db-shm
+ data/weeks/
+ .pytest_cache
+ .mypy_cache
+ .ruff_cache
+ README.md
+ CLAUDE.md
+ config.yaml
.env.example ADDED
@@ -0,0 +1,5 @@
+ # Required: Anthropic API key for paper scoring
+ ANTHROPIC_API_KEY=your-key-here
+
+ # Optional: GitHub token for higher API rate limits
+ GITHUB_TOKEN=
.gitignore ADDED
@@ -0,0 +1,32 @@
+ # Environment
+ .env
+
+ # Database
+ data/*.db
+ data/*.db-wal
+ data/*.db-shm
+ data/backups/
+ data/weeks/
+
+ # Python
+ __pycache__/
+ *.pyc
+ *.pyo
+ .pytest_cache/
+ .mypy_cache/
+ .ruff_cache/
+
+ # IDE / Editor
+ .claude/
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Generated config (created by setup wizard)
+ config.yaml
CLAUDE.md ADDED
@@ -0,0 +1,73 @@
+ # Research Intelligence System
+
+ ## Architecture
+
+ - **Web dashboard**: FastAPI + Jinja2 + HTMX on port 8888
+ - **Database**: SQLite at `data/researcher.db` (configurable in `config.yaml`)
+ - **Config**: YAML-driven via `config.yaml` (generated by setup wizard on first run)
+ - **Pipelines**: `src/pipelines/aiml.py` (HF + arXiv), `src/pipelines/security.py` (arXiv cs.CR)
+ - **Scoring**: `src/scoring.py` — Claude API batch scoring with configurable axes
+ - **Preferences**: `src/preferences.py` — learns from user signals (upvote/downvote/save/dismiss)
+ - **Scheduler**: APScheduler runs on configurable cron schedule
+
+ ## Key Files
+
+ | File | Purpose |
+ |------|---------|
+ | `src/config.py` | YAML config loader, scoring prompt builder, defaults |
+ | `src/db.py` | SQLite schema + query helpers |
+ | `src/scoring.py` | Unified Claude API scorer |
+ | `src/preferences.py` | Preference computation from user signals |
+ | `src/pipelines/aiml.py` | AI/ML paper fetching (HF + arXiv) |
+ | `src/pipelines/security.py` | Security paper fetching (arXiv cs.CR) |
+ | `src/pipelines/github.py` | GitHub trending projects via OSSInsight |
+ | `src/pipelines/events.py` | Conferences, releases, RSS news |
+ | `src/web/app.py` | FastAPI routes, middleware, report generation |
+ | `src/scheduler.py` | APScheduler weekly trigger |
+
+ ## Config System
+
+ `src/config.py` loads `config.yaml` and exposes module-level constants:
+
+ - `FIRST_RUN` — True when `config.yaml` doesn't exist (triggers setup wizard)
+ - `SCORING_CONFIGS` — Dict of domain scoring configs (axes, weights, prompts)
+ - `DB_PATH` — Path to SQLite database
+ - `ANTHROPIC_API_KEY` — From `.env` or environment
+
+ Scoring prompts are built dynamically from `scoring_axes` and `preferences` in config.
+
+ ## Working with the Database
+
+ ```bash
+ sqlite3 data/researcher.db
+
+ -- Top papers
+ SELECT title, composite, summary FROM papers
+ WHERE domain='aiml' AND composite IS NOT NULL
+ ORDER BY composite DESC LIMIT 10;
+
+ -- Signal counts
+ SELECT action, COUNT(*) FROM signals GROUP BY action;
+
+ -- Preference profile
+ SELECT * FROM preferences ORDER BY abs(pref_value) DESC LIMIT 20;
+ ```
+
+ ## Docker
+
+ ```bash
+ docker compose up --build
+ # Dashboard at http://localhost:9090
+ # Setup wizard runs on first visit
+
+ # Trigger pipelines
+ curl -X POST http://localhost:9090/run/aiml
+ curl -X POST http://localhost:9090/run/security
+ ```
+
+ ## Allowed Tools
+
+ When working with this project in Claude Code:
+ - **Bash**: python, sqlite3, curl, docker commands
+ - **WebSearch/WebFetch**: arXiv, GitHub, HuggingFace for paper details
+ - **Read/Edit**: all project files and data/
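Review note: the "Working with the Database" queries in CLAUDE.md can also be run from Python. The sketch below is only an illustration under stated assumptions: it uses an in-memory stand-in database, the `papers` and `signals` column names are taken from the SQL in CLAUDE.md, and the `paper_title` column plus the helper names `top_papers` and `signal_counts` are hypothetical. The real schema lives in `src/db.py` and may differ.

```python
import sqlite3

# In-memory stand-in; the real schema lives in src/db.py and may differ.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE papers (title TEXT, domain TEXT, composite REAL, summary TEXT);
    CREATE TABLE signals (paper_title TEXT, action TEXT);
    """
)
conn.executemany(
    "INSERT INTO papers VALUES (?, ?, ?, ?)",
    [
        ("Paper A", "aiml", 8.5, "hypothetical row"),
        ("Paper B", "aiml", None, "not scored yet"),
        ("Paper C", "security", 7.0, "hypothetical row"),
        ("Paper D", "aiml", 9.1, "hypothetical row"),
    ],
)
conn.executemany(
    "INSERT INTO signals VALUES (?, ?)",
    [("Paper A", "upvote"), ("Paper D", "upvote"), ("Paper C", "dismiss")],
)


def top_papers(db: sqlite3.Connection, domain: str, limit: int = 10) -> list[tuple]:
    """Highest-composite scored papers for one domain (unscored rows skipped)."""
    return db.execute(
        "SELECT title, composite FROM papers "
        "WHERE domain = ? AND composite IS NOT NULL "
        "ORDER BY composite DESC LIMIT ?",
        (domain, limit),
    ).fetchall()


def signal_counts(db: sqlite3.Connection) -> dict[str, int]:
    """Number of recorded signals per action."""
    return dict(db.execute("SELECT action, COUNT(*) FROM signals GROUP BY action"))


top = top_papers(conn, "aiml")
counts = signal_counts(conn)
```

Parameter binding (`?`) is used instead of string formatting so that a domain name coming from a request cannot alter the query.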
Dockerfile ADDED
@@ -0,0 +1,28 @@
+ FROM python:3.12-slim
+
+ WORKDIR /app
+
+ # Install dependencies (cached layer)
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Create non-root user
+ RUN useradd -m -s /bin/bash appuser
+
+ # Copy source
+ COPY src/ src/
+ COPY data/seed_papers.json data/seed_papers.json
+ COPY entrypoint.sh .
+ RUN chmod +x entrypoint.sh
+
+ # Create data directory with correct ownership
+ RUN mkdir -p data/weeks && chown -R appuser:appuser /app
+
+ USER appuser
+
+ EXPOSE 8888
+
+ HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=15s \
+     CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8888/api/status')"
+
+ ENTRYPOINT ["./entrypoint.sh"]
README.md ADDED
@@ -0,0 +1,128 @@
+ # Research Intelligence
+
+ A self-hosted research triage system that monitors academic papers (AI/ML and Security) and trending GitHub projects, scores them with Claude, and learns your preferences over time.
+
+ ## Features
+
+ - **Paper monitoring** — Fetches new papers from arXiv and HuggingFace daily/weekly
+ - **AI scoring** — Claude scores each paper on configurable axes (novelty, code availability, practical impact)
+ - **Preference learning** — Rate papers with thumbs up/down; the system learns what you care about and re-ranks accordingly
+ - **GitHub tracking** — Monitors trending repositories across curated collections
+ - **Event tracking** — Conference deadlines, releases, and RSS news feeds
+ - **Weekly reports** — Auto-generated markdown summaries of top papers
+ - **Dark-theme dashboard** — Fast, responsive web UI built with HTMX
+
+ ## Quick Start
+
+ ### Docker (recommended)
+
+ ```bash
+ git clone https://github.com/yourname/researcher.git
+ cd researcher
+ cp .env.example .env
+ # Edit .env and add your Anthropic API key
+
+ docker compose up --build
+ ```
+
+ Visit **http://localhost:9090** — the setup wizard will guide you through configuration.
+
+ ### Local
+
+ ```bash
+ git clone https://github.com/yourname/researcher.git
+ cd researcher
+ pip install -r requirements.txt
+ cp .env.example .env
+ # Edit .env and add your Anthropic API key
+
+ python -m uvicorn src.web.app:app --host 0.0.0.0 --port 8888
+ ```
+
+ Visit **http://localhost:8888** and follow the setup wizard.
+
+ ## Setup Wizard
+
+ On first launch (before `config.yaml` exists), you'll be guided through:
+
+ 1. **API Key** — Enter your Anthropic API key (validated with a test call)
+ 2. **Domains** — Enable/disable AI/ML and Security monitoring, adjust scoring weights
+ 3. **GitHub** — Toggle GitHub project tracking
+ 4. **Schedule** — Set pipeline frequency (daily, weekly, or manual-only)
+
+ After setup, you can optionally **pick seed papers** to bootstrap your preference profile.
+
+ ## Configuration
+
+ All settings live in `config.yaml` (generated by the setup wizard). You can also edit it directly:
+
+ ```yaml
+ domains:
+   aiml:
+     enabled: true
+     scoring_axes:
+       - name: "Code & Weights"
+         weight: 0.30
+       - name: "Novelty"
+         weight: 0.35
+       - name: "Practical Applicability"
+         weight: 0.35
+   security:
+     enabled: true
+     scoring_axes:
+       - name: "Has Code/PoC"
+         weight: 0.25
+       - name: "Novel Attack Surface"
+         weight: 0.40
+       - name: "Real-World Impact"
+         weight: 0.35
+
+ schedule:
+   cron: "0 22 * * 0"  # Weekly on Sunday at 22:00 UTC
+ ```
+
+ ## Architecture
+
+ | Component | Technology |
+ |-----------|-----------|
+ | Web server | FastAPI + Jinja2 + HTMX |
+ | Database | SQLite |
+ | Scoring | Claude API (Anthropic) |
+ | Scheduling | APScheduler |
+ | Container | Docker |
+
+ ### Key Files
+
+ | File | Purpose |
+ |------|---------|
+ | `src/config.py` | YAML config loader with defaults |
+ | `src/db.py` | SQLite schema and queries |
+ | `src/scoring.py` | Claude API batch scorer |
+ | `src/preferences.py` | Preference learning from user signals |
+ | `src/pipelines/aiml.py` | AI/ML paper fetcher (HF + arXiv) |
+ | `src/pipelines/security.py` | Security paper fetcher (arXiv cs.CR) |
+ | `src/pipelines/github.py` | GitHub trending projects |
+ | `src/pipelines/events.py` | Conferences, releases, RSS |
+ | `src/web/app.py` | Web routes and middleware |
+ | `src/scheduler.py` | Cron-based pipeline scheduler |
+
+ ## Running Pipelines Manually
+
+ From the dashboard, click the pipeline buttons. Or via API:
+
+ ```bash
+ curl -X POST http://localhost:9090/run/aiml
+ curl -X POST http://localhost:9090/run/security
+ curl -X POST http://localhost:9090/run/github
+ curl -X POST http://localhost:9090/run/events
+ ```
+
+ ## Requirements
+
+ - Python 3.12+
+ - Anthropic API key (for paper scoring)
+ - Optional: GitHub token (for higher API rate limits)
+
+ ## License
+
+ MIT
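Review note: the `scoring_axes` weights in the README's configuration example suggest that per-axis scores are combined into one composite value per paper. A minimal sketch of that arithmetic follows, assuming each axis is scored 1-10 and the composite is a plain weighted average. The function name `composite_score` and the sample scores are hypothetical; the actual combination rule is in `src/scoring.py`, which is not shown here.

```python
def composite_score(scores: dict[str, float], axes: list[dict]) -> float:
    """Weighted average of per-axis scores; weights are normalized to sum to 1."""
    total_weight = sum(ax["weight"] for ax in axes)
    if total_weight <= 0:
        raise ValueError("axis weights must sum to a positive value")
    weighted = sum(scores[ax["name"]] * ax["weight"] for ax in axes)
    return weighted / total_weight


# Axes copied from the README's aiml example.
AIML_AXES = [
    {"name": "Code & Weights", "weight": 0.30},
    {"name": "Novelty", "weight": 0.35},
    {"name": "Practical Applicability", "weight": 0.35},
]

# Hypothetical scores for one paper (1-10 per axis).
example = {"Code & Weights": 8, "Novelty": 6, "Practical Applicability": 7}
result = composite_score(example, AIML_AXES)
```

Normalizing by the weight total means a hand-edited `config.yaml` whose weights do not sum to exactly 1.0 still yields a composite on the same 1-10 scale.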
data/seed_papers.json ADDED
@@ -0,0 +1,182 @@
+ [
+   {
+     "arxiv_id": "2401.04088",
+     "title": "DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence",
+     "domain": "aiml",
+     "summary": "Open-source code LLM matching GPT-4 Turbo on coding benchmarks with MoE architecture."
+   },
+   {
+     "arxiv_id": "2403.05530",
+     "title": "GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection",
+     "domain": "aiml",
+     "summary": "Reduces memory usage for LLM training via gradient projection, enabling 7B training on consumer GPUs."
+   },
+   {
+     "arxiv_id": "2402.13616",
+     "title": "World Model on Million-Length Video and Language with RingAttention",
+     "domain": "aiml",
+     "summary": "Trains world models on million-token video sequences using ring attention for long context."
+   },
+   {
+     "arxiv_id": "2403.03206",
+     "title": "The Claude 3 Model Family",
+     "domain": "aiml",
+     "summary": "Multimodal LLM family with strong vision capabilities and extended context windows."
+   },
+   {
+     "arxiv_id": "2402.17764",
+     "title": "Sora: A Review on Background, Technology, Limitations, and Opportunities",
+     "domain": "aiml",
+     "summary": "Analysis of video generation model capabilities, architecture, and limitations."
+   },
+   {
+     "arxiv_id": "2401.02954",
+     "title": "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts",
+     "domain": "aiml",
+     "summary": "Combines Mamba state-space model with mixture-of-experts for efficient scaling."
+   },
+   {
+     "arxiv_id": "2403.09611",
+     "title": "Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking",
+     "domain": "aiml",
+     "summary": "Self-taught reasoning where LLMs learn to generate internal rationale tokens."
+   },
+   {
+     "arxiv_id": "2402.01032",
+     "title": "OLMo: Accelerating the Science of Language Models",
+     "domain": "aiml",
+     "summary": "Fully open-source LLM with released weights, code, data, and training logs."
+   },
+   {
+     "arxiv_id": "2403.14608",
+     "title": "ReALM: Reference Resolution As Language Modeling",
+     "domain": "aiml",
+     "summary": "Resolves onscreen and conversational references using LLMs for device agents."
+   },
+   {
+     "arxiv_id": "2402.14261",
+     "title": "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models",
+     "domain": "aiml",
+     "summary": "Hybrid architecture combining gated linear RNNs with local attention, matching transformer quality."
+   },
+   {
+     "arxiv_id": "2401.14196",
+     "title": "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers",
+     "domain": "aiml",
+     "summary": "One-shot quantization method reducing LLM size to 3-4 bits with minimal accuracy loss."
+   },
+   {
+     "arxiv_id": "2403.07691",
+     "title": "Stealing Part of a Production Language Model",
+     "domain": "security",
+     "summary": "Extracts internal architecture details from production LLM APIs through crafted queries."
+   },
+   {
+     "arxiv_id": "2402.06132",
+     "title": "SoK: Where's the Bug? A Study of Bug Localization Tools",
+     "domain": "security",
+     "summary": "Systematizes bug localization approaches and evaluates 23 tools on real-world CVEs."
+   },
+   {
+     "arxiv_id": "2401.16727",
+     "title": "A Survey of Side-Channel Attacks on Intel SGX",
+     "domain": "security",
+     "summary": "Comprehensive analysis of side-channel attacks targeting Intel SGX enclaves."
+   },
+   {
+     "arxiv_id": "2403.02783",
+     "title": "SyzVegas: Beating Kernel Fuzzing Odds with Reinforcement Learning",
+     "domain": "security",
+     "summary": "RL-guided kernel fuzzer that outperforms Syzkaller in bug discovery rate."
+   },
+   {
+     "arxiv_id": "2402.15483",
+     "title": "BSIMM: An Empirical Study of 130 Software Security Programs",
+     "domain": "security",
+     "summary": "Large-scale study of enterprise security maturity across 130 organizations."
+   },
+   {
+     "arxiv_id": "2403.14469",
+     "title": "Reverse Engineering eBPF Programs: Challenges and Approaches",
+     "domain": "security",
+     "summary": "Novel techniques for reverse engineering eBPF bytecode in Linux kernel security."
+   },
+   {
+     "arxiv_id": "2401.09577",
+     "title": "WiFi-Based Keystroke Inference Attack Using Adversarial CSI Perturbation",
+     "domain": "security",
+     "summary": "Exploits WiFi channel state information to infer keystrokes from nearby devices."
+   },
+   {
+     "arxiv_id": "2402.08787",
+     "title": "Binary Code Similarity Detection via Graph Neural Networks",
+     "domain": "security",
+     "summary": "GNN-based approach to detect similar binary functions across compilers and architectures."
+   },
+   {
+     "arxiv_id": "2403.01218",
+     "title": "Practical Exploitation of DNS Rebinding in IoT Devices",
+     "domain": "security",
+     "summary": "Demonstrates DNS rebinding attacks against 15 popular IoT devices in home networks."
+   },
+   {
+     "arxiv_id": "2401.15491",
+     "title": "GPU.zip: Side Channel Attacks on GPU-Based Graphical Data Compression",
+     "domain": "security",
+     "summary": "First cross-origin pixel-stealing attack through GPU hardware data compression."
+   },
+   {
+     "arxiv_id": "2402.03367",
+     "title": "CryptoFuzz: Fully Automated Testing of Cryptographic API Misuse",
+     "domain": "security",
+     "summary": "Automated fuzzer detecting cryptographic API misuse patterns in Java applications."
+   },
+   {
+     "arxiv_id": "2403.08946",
+     "title": "Video Generation Models as World Simulators",
+     "domain": "aiml",
+     "summary": "Explores how video generation models learn physical world dynamics as implicit simulators."
+   },
+   {
+     "arxiv_id": "2402.05929",
+     "title": "V-JEPA: Video Joint Embedding Predictive Architecture",
+     "domain": "aiml",
+     "summary": "Self-supervised video representation learning that predicts in latent space rather than pixel space."
+   },
+   {
+     "arxiv_id": "2401.10020",
+     "title": "AlphaGeometry: Solving Olympiad Geometry without Human Demonstrations",
+     "domain": "aiml",
+     "summary": "AI system solving IMO-level geometry problems through neurosymbolic reasoning."
+   },
+   {
+     "arxiv_id": "2403.04132",
+     "title": "Design2Code: How Far Are We From Automating Front-End Engineering?",
+     "domain": "aiml",
+     "summary": "Benchmarks multimodal LLMs on converting visual designs to functional HTML/CSS code."
+   },
+   {
+     "arxiv_id": "2402.14905",
+     "title": "YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information",
+     "domain": "aiml",
+     "summary": "New YOLO architecture using programmable gradient information for better object detection."
+   },
+   {
+     "arxiv_id": "2401.06066",
+     "title": "MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation",
+     "domain": "aiml",
+     "summary": "Multi-stage video generation pipeline producing high-quality aesthetic videos from text."
+   },
+   {
+     "arxiv_id": "2402.01680",
+     "title": "Grandmaster-Level Chess Without Search",
+     "domain": "aiml",
+     "summary": "Transformer achieving grandmaster chess play through pure pattern recognition without tree search."
+   },
+   {
+     "arxiv_id": "2403.04706",
+     "title": "SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering",
+     "domain": "aiml",
+     "summary": "LLM agent that autonomously fixes GitHub issues by interacting with code repositories."
+   }
+ ]
docker-compose.yml ADDED
@@ -0,0 +1,27 @@
+ services:
+   researcher:
+     build: .
+     ports:
+       - "9090:8888"
+     volumes:
+       - ./data:/app/data
+     environment:
+       - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
+       - GITHUB_TOKEN=${GITHUB_TOKEN}
+       - PYTHONUNBUFFERED=1
+     restart: unless-stopped
+     healthcheck:
+       test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8888/api/status')"]
+       interval: 30s
+       timeout: 10s
+       retries: 3
+       start_period: 15s
+     logging:
+       driver: json-file
+       options:
+         max-size: "10m"
+         max-file: "3"
+     deploy:
+       resources:
+         limits:
+           memory: 2g
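Review note: the `healthcheck` entries in `docker-compose.yml` and the `Dockerfile` both probe `/api/status` with a `urllib` one-liner, which counts as healthy whenever `urlopen` does not raise. A slightly more explicit version of the same probe might look like the sketch below. The helper name `is_healthy` is hypothetical, and the port assumes the compose mapping `9090:8888` when probing from the host.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 10.0) -> bool:
    """Return True when the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    # From the host, the compose file maps the dashboard to port 9090.
    print(is_healthy("http://localhost:9090/api/status", timeout=2.0))
```

Passing an explicit `timeout` keeps the probe from hanging longer than the 10s limit the compose file already sets for the container check.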
entrypoint.sh ADDED
@@ -0,0 +1,7 @@
+ #!/bin/bash
+ set -e
+
+ echo "=== Research Intelligence ==="
+ echo "Starting web server + scheduler on port 8888 ..."
+
+ exec python -m uvicorn src.web.app:app --host 0.0.0.0 --port 8888
requirements.txt ADDED
@@ -0,0 +1,17 @@
+ # Core web
+ fastapi>=0.115,<1
+ uvicorn>=0.34,<1
+ jinja2>=3.1,<4
+ python-multipart>=0.0.18
+
+ # Data
+ arxiv>=2.1,<3
+ requests>=2.31,<3
+ anthropic>=0.40,<1
+ feedparser>=6.0,<7
+
+ # Config
+ pyyaml>=6.0,<7
+
+ # Scheduling
+ apscheduler>=3.10,<4
scripts/backup-db.sh ADDED
@@ -0,0 +1,35 @@
+ #!/bin/bash
+ # Daily SQLite backup: safe online backup via Python's sqlite3 backup API
+ # Add to crontab: 0 3 * * * /path/to/researcher/scripts/backup-db.sh
+
+ set -e
+
+ SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+ PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
+
+ DB_PATH="${PROJECT_DIR}/data/researcher.db"
+ BACKUP_DIR="${PROJECT_DIR}/data/backups"
+ KEEP_DAYS=14
+
+ mkdir -p "$BACKUP_DIR"
+
+ TIMESTAMP=$(date +%Y%m%d-%H%M%S)
+ BACKUP_FILE="$BACKUP_DIR/researcher-$TIMESTAMP.db"
+
+ # Use SQLite online backup (safe with WAL mode)
+ python3 -c "
+ import sqlite3
+ src = sqlite3.connect('$DB_PATH')
+ dst = sqlite3.connect('$BACKUP_FILE')
+ src.backup(dst)
+ dst.close()
+ src.close()
+ "
+
+ # Compress
+ gzip "$BACKUP_FILE"
+ echo "Backup: ${BACKUP_FILE}.gz ($(du -h "${BACKUP_FILE}.gz" | cut -f1))"
+
+ # Prune old backups
+ find "$BACKUP_DIR" -name "researcher-*.db.gz" -mtime +"$KEEP_DAYS" -delete
+ echo "Pruned backups older than $KEEP_DAYS days"
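Review note: the backup script relies on `sqlite3.Connection.backup`, which copies a consistent snapshot even while the source database is in use. The sketch below exercises the same call against a throwaway database so the behaviour can be checked in isolation. The helper name `backup_sqlite`, the file names, and the one-column `papers` table are hypothetical stand-ins, not the project's real schema.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_sqlite(src_path: Path, dst_path: Path) -> None:
    """Copy a SQLite database with the online backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # consistent snapshot, even while src is in use
    finally:
        dst.close()
        src.close()


with tempfile.TemporaryDirectory() as tmp:
    src_file = Path(tmp) / "researcher.db"
    dst_file = Path(tmp) / "researcher-backup.db"

    # Build a tiny stand-in database (hypothetical table, not the real schema).
    conn = sqlite3.connect(src_file)
    conn.execute("CREATE TABLE papers (title TEXT)")
    conn.execute("INSERT INTO papers VALUES ('example')")
    conn.commit()
    conn.close()

    backup_sqlite(src_file, dst_file)

    # Read the copy back to confirm the row survived the backup.
    check = sqlite3.connect(dst_file)
    copied_rows = check.execute("SELECT COUNT(*) FROM papers").fetchone()[0]
    check.close()
```

Taking the paths as function arguments, rather than interpolating shell variables into a `python3 -c` string as the script does, avoids quoting problems if a path ever contains a single quote.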
src/__init__.py ADDED
File without changes
src/config.py ADDED
@@ -0,0 +1,505 @@
+ """Configuration loader — reads from config.yaml, falls back to defaults."""
+
+ import logging
+ import os
+ import re
+ import sys
+ from pathlib import Path
+
+ # ---------------------------------------------------------------------------
+ # Logging (always available, before config loads)
+ # ---------------------------------------------------------------------------
+
+ LOG_FORMAT = "%(asctime)s [%(name)s] %(levelname)s: %(message)s"
+ LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO").upper()
+
+ logging.basicConfig(
+     format=LOG_FORMAT,
+     level=getattr(logging, LOG_LEVEL, logging.INFO),
+     stream=sys.stdout,
+ )
+
+ # Quiet noisy libraries
+ logging.getLogger("httpx").setLevel(logging.WARNING)
+ logging.getLogger("httpcore").setLevel(logging.WARNING)
+ logging.getLogger("apscheduler").setLevel(logging.WARNING)
+
+ log = logging.getLogger(__name__)
+
+ # ---------------------------------------------------------------------------
+ # Config file path
+ # ---------------------------------------------------------------------------
+
+ CONFIG_PATH = Path(os.environ.get("CONFIG_PATH", "config.yaml"))
+ FIRST_RUN = not CONFIG_PATH.exists()
+
+ # ---------------------------------------------------------------------------
+ # Environment
+ # ---------------------------------------------------------------------------
+
+ ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", "")
+ GITHUB_TOKEN = os.environ.get("GITHUB_TOKEN", "")
+
+
+ def validate_env():
+     """Check required environment variables at startup. Warn on missing."""
+     if not ANTHROPIC_API_KEY:
+         log.warning("ANTHROPIC_API_KEY not set — scoring will be disabled")
+     if not GITHUB_TOKEN:
+         log.info("GITHUB_TOKEN not set — GitHub API calls will be rate-limited")
+
+
+ # ---------------------------------------------------------------------------
+ # Load config.yaml (or defaults)
+ # ---------------------------------------------------------------------------
+
+ def _load_yaml() -> dict:
+     """Load config.yaml if present, otherwise return empty dict."""
+     if CONFIG_PATH.exists():
+         try:
+             import yaml
+             with open(CONFIG_PATH) as f:
+                 data = yaml.safe_load(f) or {}
+             log.info("Loaded config from %s", CONFIG_PATH)
+             return data
+         except Exception as e:
+             log.error("Failed to load %s: %s — using defaults", CONFIG_PATH, e)
+     return {}
+
+
+ _cfg = _load_yaml()
+
+ # ---------------------------------------------------------------------------
+ # Claude API
+ # ---------------------------------------------------------------------------
+
+ CLAUDE_MODEL = _cfg.get("claude_model", "claude-sonnet-4-5-20250929")
+ BATCH_SIZE = _cfg.get("batch_size", 20)
+
+ # ---------------------------------------------------------------------------
+ # Database
+ # ---------------------------------------------------------------------------
+
+ DB_PATH = Path(_cfg.get("database", {}).get("path", os.environ.get("DB_PATH", "data/researcher.db")))
+
+ # ---------------------------------------------------------------------------
+ # Web
+ # ---------------------------------------------------------------------------
+
+ WEB_HOST = _cfg.get("web", {}).get("host", "0.0.0.0")
+ WEB_PORT = _cfg.get("web", {}).get("port", 8888)
+
+ # ---------------------------------------------------------------------------
+ # Schedule
+ # ---------------------------------------------------------------------------
+
+ SCHEDULE_CRON = _cfg.get("schedule", {}).get("cron", "0 22 * * 0")
+
+ # ---------------------------------------------------------------------------
+ # Domains from config
+ # ---------------------------------------------------------------------------
+
+ _domains_cfg = _cfg.get("domains", {})
+
+ # ---------------------------------------------------------------------------
+ # Shared constants
+ # ---------------------------------------------------------------------------
+
+ HF_API = "https://huggingface.co/api"
+ GITHUB_URL_RE = re.compile(r"https?://github\.com/[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+")
+ MAX_ABSTRACT_CHARS_AIML = 2000
+ MAX_ABSTRACT_CHARS_SECURITY = 1500
+ HF_MAX_AGE_DAYS = 90
+
+ # ---------------------------------------------------------------------------
+ # AI/ML pipeline constants
+ # ---------------------------------------------------------------------------
+
+ _aiml_cfg = _domains_cfg.get("aiml", {})
+
+ ARXIV_LARGE_CATS = _aiml_cfg.get("arxiv_categories", ["cs.CV", "cs.CL", "cs.LG"])
+ ARXIV_SMALL_CATS = ["eess.AS", "cs.SD"]
+
+ _aiml_include = _aiml_cfg.get("include_patterns", [])
+ _aiml_exclude = _aiml_cfg.get("exclude_patterns", [])
+
+ _DEFAULT_INCLUDE = (
+     r"video.generat|world.model|image.generat|diffusion|text.to.image|text.to.video|"
+     r"code.generat|foundation.model|open.weight|large.language|language.model|"
+     r"text.to.speech|tts|speech.synth|voice.clon|audio.generat|"
+     r"transformer|attention.mechanism|state.space|mamba|mixture.of.expert|\bmoe\b|"
+     r"scaling.law|architecture|quantiz|distillat|pruning|"
+     r"multimodal|vision.language|\bvlm\b|agent|reasoning|"
+     r"reinforcement.learn|rlhf|dpo|preference.optim|"
+     r"retrieval.augment|\brag\b|in.context.learn|"
+     r"image.edit|video.edit|3d.generat|nerf|gaussian.splat|"
+     r"robot|embodied|simulat|"
+     r"benchmark|evaluat|leaderboard|"
+     r"open.source|reproducib|"
+     r"instruction.tun|fine.tun|align|"
+     r"long.context|context.window|"
+     r"token|vocab|embedding|"
+     r"training.efficien|parallel|distributed.train|"
+     r"synthetic.data|data.curat"
+ )
+
+ _DEFAULT_EXCLUDE = (
+     r"medical.imag|clinical|radiology|pathology|histolog|"
+     r"climate.model|weather.predict|meteorolog|"
+     r"survey.of|comprehensive.survey|"
+     r"sentiment.analysis|named.entity|"
+     r"drug.discover|protein.fold|molecular.dock|"
+     r"software.engineering.practice|code.smell|technical.debt|"
+     r"autonomous.driv|traffic.signal|"
+     r"remote.sens|satellite.imag|crop.yield|"
+     r"stock.predict|financial.forecast|"
+     r"electronic.health|patient.record|"
+     r"seismic|geophys|oceanograph|"
+     r"educational.data|student.perform|"
+     r"blockchain|smart.contract|\bdefi\b|decentralized.finance|cryptocurrency|"
+     r"jailbreak|guardrail|red.teaming|llm.safety|"
+     r"safe.alignment|safety.tuning|harmful.content|toxicity"
+ )
+
+ INCLUDE_RE = re.compile(
+     "|".join(_aiml_include) if _aiml_include else _DEFAULT_INCLUDE,
+     re.IGNORECASE,
+ )
+
+ EXCLUDE_RE = re.compile(
+     "|".join(_aiml_exclude) if _aiml_exclude else _DEFAULT_EXCLUDE,
+     re.IGNORECASE,
+ )
+
+ # ---------------------------------------------------------------------------
+ # Security pipeline constants
+ # ---------------------------------------------------------------------------
+
+ _sec_cfg = _domains_cfg.get("security", {})
+
+ SECURITY_KEYWORDS = re.compile(
+     r"\b(?:attack|vulnerability|exploit|fuzzing|fuzz|malware|"
+     r"intrusion|ransomware|phishing|adversarial|"
+     r"defense|defence|secure|security|privacy|"
+     r"cryptograph|authentication|authorization|"
+     r"injection|xss|csrf|cve\-\d|penetration.test|"
+     r"threat|anomaly.detect|ids\b|ips\b|firewall|"
+     r"reverse.engineer|obfuscat|sandbox|"
+     r"side.channel|buffer.overflow|zero.day|"
+     r"botnet|rootkit|trojan|worm)\b",
+     re.IGNORECASE,
+ )
+
+ ADJACENT_CATEGORIES = ["cs.AI", "cs.SE", "cs.NI", "cs.DC", "cs.OS", "cs.LG"]
+
+ SECURITY_EXCLUDE_RE = re.compile(
+     r"blockchain|smart.contract|\bdefi\b|decentralized.finance|"
+     r"memecoin|meme.coin|cryptocurrency.trading|\bnft\b|"
+     r"comprehensive.survey|systematization.of.knowledge|"
+     r"differential.privacy.(?:mechanism|framework)|"
+     r"stock.predict|financial.forecast|crop.yield|"
+     r"sentiment.analysis|educational.data",
+     re.IGNORECASE,
+ )
+
+ SECURITY_LLM_RE = re.compile(
+     r"jailbreak|guardrail|red.teaming|"
+     r"llm.safety|safe.alignment|safety.tuning|"
+     r"harmful.(?:content|output)|toxicity|content.moderation|"
+     r"prompt.injection|"
+     r"reward.model.(?:for|safety|alignment)",
+     re.IGNORECASE,
+ )
+
+ # ---------------------------------------------------------------------------
+ # Dynamic scoring prompt builder
+ # ---------------------------------------------------------------------------
+
+ def _build_scoring_prompt(domain: str, axes: list[dict], preferences: dict) -> str:
+     """Build a Claude scoring prompt from config axes + preferences."""
+     boost = preferences.get("boost_topics", [])
+     penalize = preferences.get("penalize_topics", [])
+
+     if domain == "aiml":
+         return _build_aiml_prompt(axes, boost, penalize)
+     elif domain == "security":
+         return _build_security_prompt(axes, boost, penalize)
+     return ""
+
+
+ def _build_aiml_prompt(axes: list[dict], boost: list[str], penalize: list[str]) -> str:
+     """Generate AI/ML scoring prompt from axes config."""
+     axis_fields = []
+     axis_section = []
+     for i, ax in enumerate(axes, 1):
+         name = ax.get("name", f"axis_{i}")
+         desc = ax.get("description", "")
+         field = name.lower().replace(" ", "_").replace("&", "and").replace("/", "_")
+         axis_fields.append(field)
+         axis_section.append(f"{i}. **{field}** — {name}: {desc}")
+
+     boost_line = ", ".join(boost) if boost else (
+         "New architectures, open-weight models, breakthrough methods, "
+         "papers with code AND weights, efficiency improvements"
+     )
+     penalize_line = ", ".join(penalize) if penalize else (
+         "Surveys, incremental SOTA, closed-model papers, "
+         "medical/climate/remote sensing applications"
+     )
+
+     return f"""\
+ You are an AI/ML research analyst. Score each paper on {len(axes)} axes (1-10):
+
+ {chr(10).join(axis_section)}
+
+ Scoring preferences:
+ - Score UP: {boost_line}
+ - Score DOWN: {penalize_line}
+
+ Use HF ecosystem signals: hf_upvotes > 50 means community interest; hf_models present = weights available;
+ hf_spaces = demo exists; github_repo = code available; source "both" = higher visibility.
261
+
262
+ Also provide:
263
+ - **summary**: 2-3 sentence practitioner-focused summary.
264
+ - **reasoning**: 1-2 sentences explaining scoring.
265
+ - **code_url**: Extract GitHub/GitLab URL from abstract/comments if present, else null.
266
+
267
+ Respond with a JSON array of objects, one per paper, each with fields:
268
+ arxiv_id, {", ".join(axis_fields)}, summary, reasoning, code_url
269
+ """
270
+
271
+
272
+ def _build_security_prompt(axes: list[dict], boost: list[str], penalize: list[str]) -> str:
273
+ """Generate security scoring prompt from axes config."""
274
+ axis_fields = []
275
+ axes_section = []
276
+ for i, ax in enumerate(axes, 1):
277
+ name = ax.get("name", f"axis_{i}")
278
+ desc = ax.get("description", "")
279
+ field = name.lower().replace(" ", "_").replace("&", "and").replace("/", "_")
280
+ axis_fields.append(field)
281
+ axes_section.append(f"{i}. **{field}** (1-10) — {name}: {desc}")
282
+
283
+ return f"""\
284
+ You are a security research analyst. Score each paper on three axes (1-10).
285
+
286
+ === HARD RULES (apply BEFORE scoring) ===
287
+
288
+ 1. If the paper is primarily about LLM safety, alignment, jailbreaking, guardrails,
289
+ red-teaming LLMs, or making AI models safer: cap ALL three axes at 3 max.
290
+ Check the "llm_adjacent" field — if true, this rule almost certainly applies.
291
+
292
+ 2. If the paper is a survey, SoK, or literature review: cap {axis_fields[1] if len(axis_fields) > 1 else 'axis_2'} at 2 max.
293
+
294
+ 3. If the paper is about blockchain, DeFi, cryptocurrency, smart contracts: cap ALL three axes at 2 max.
295
+
296
+ 4. If the paper is about theoretical differential privacy or federated learning
297
+ without concrete security attacks: cap ALL three axes at 3 max.
298
+
299
+ === SCORING AXES ===
300
+
301
+ {chr(10).join(axes_section)}
302
+
303
+ === OUTPUT ===
304
+
305
+ For each paper also provide:
306
+ - **summary**: 2-3 sentence practitioner-focused summary.
307
+ - **reasoning**: 1-2 sentences explaining your scoring.
308
+ - **code_url**: Extract GitHub/GitLab URL from abstract/comments if present, else null.
309
+
310
+ Respond with a JSON array of objects, one per paper, each with fields:
311
+ entry_id, {", ".join(axis_fields)}, summary, reasoning, code_url
312
+ """
313
+
314
+
315
+ # ---------------------------------------------------------------------------
316
+ # Scoring configs per domain
317
+ # ---------------------------------------------------------------------------
318
+
319
+ def _build_scoring_configs() -> dict:
320
+ """Build SCORING_CONFIGS from config.yaml or defaults."""
321
+ configs = {}
322
+
323
+ # AI/ML config
324
+ aiml_axes_cfg = _aiml_cfg.get("scoring_axes", [
325
+ {"name": "Code & Weights", "weight": 0.30, "description": "Open weights on HF, code on GitHub"},
326
+ {"name": "Novelty", "weight": 0.35, "description": "Paradigm shifts over incremental"},
327
+ {"name": "Practical Applicability", "weight": 0.35, "description": "Usable by practitioners soon"},
328
+ ])
329
+ aiml_prefs = _aiml_cfg.get("preferences", {})
330
+ aiml_weight_keys = ["code_weights", "novelty", "practical"]
331
+ aiml_weights = {}
332
+ for i, ax in enumerate(aiml_axes_cfg):
333
+ key = aiml_weight_keys[i] if i < len(aiml_weight_keys) else f"axis_{i+1}"
334
+ aiml_weights[key] = ax.get("weight", 1.0 / len(aiml_axes_cfg))
335
+
336
+ configs["aiml"] = {
337
+ "weights": aiml_weights,
338
+ "axes": ["code_weights", "novelty", "practical_applicability"],
339
+ "axis_labels": [ax.get("name", f"Axis {i+1}") for i, ax in enumerate(aiml_axes_cfg)],
340
+ "prompt": _build_scoring_prompt("aiml", aiml_axes_cfg, aiml_prefs),
341
+ }
342
+
343
+ # Security config
344
+ sec_axes_cfg = _sec_cfg.get("scoring_axes", [
345
+ {"name": "Has Code/PoC", "weight": 0.25, "description": "Working tools, repos, artifacts"},
346
+ {"name": "Novel Attack Surface", "weight": 0.40, "description": "First-of-kind research"},
347
+ {"name": "Real-World Impact", "weight": 0.35, "description": "Affects production systems"},
348
+ ])
349
+ sec_prefs = _sec_cfg.get("preferences", {})
350
+ sec_weight_keys = ["code", "novelty", "impact"]
351
+ sec_weights = {}
352
+ for i, ax in enumerate(sec_axes_cfg):
353
+ key = sec_weight_keys[i] if i < len(sec_weight_keys) else f"axis_{i+1}"
354
+ sec_weights[key] = ax.get("weight", 1.0 / len(sec_axes_cfg))
355
+
356
+ configs["security"] = {
357
+ "weights": sec_weights,
358
+ "axes": ["has_code", "novel_attack_surface", "real_world_impact"],
359
+ "axis_labels": [ax.get("name", f"Axis {i+1}") for i, ax in enumerate(sec_axes_cfg)],
360
+ "prompt": _build_scoring_prompt("security", sec_axes_cfg, sec_prefs),
361
+ }
362
+
363
+ return configs
364
+
365
+
366
+ SCORING_CONFIGS = _build_scoring_configs()
367
+
368
+ # ---------------------------------------------------------------------------
369
+ # Events config
370
+ # ---------------------------------------------------------------------------
371
+
372
+ RSS_FEEDS = _cfg.get("rss_feeds", [
373
+ {"name": "OpenAI Blog", "url": "https://openai.com/blog/rss.xml", "category": "news"},
374
+ {"name": "Anthropic Blog", "url": "https://www.anthropic.com/rss.xml", "category": "news"},
375
+ {"name": "Google DeepMind", "url": "https://deepmind.google/blog/rss.xml", "category": "news"},
376
+ {"name": "Meta AI", "url": "https://ai.meta.com/blog/rss/", "category": "news"},
377
+ {"name": "HuggingFace Blog", "url": "https://huggingface.co/blog/feed.xml", "category": "news"},
378
+ {"name": "Krebs on Security", "url": "https://krebsonsecurity.com/feed/", "category": "news"},
379
+ {"name": "The Record", "url": "https://therecord.media/feed", "category": "news"},
380
+ {"name": "Microsoft Security", "url": "https://www.microsoft.com/en-us/security/blog/feed/", "category": "news"},
381
+ ])
382
+
383
+ CONFERENCES = _cfg.get("conferences", [
384
+ {"name": "NeurIPS 2026", "url": "https://neurips.cc/", "domain": "aiml",
385
+ "deadline": "2026-05-16", "date": "2026-12-07",
386
+ "description": "Conference on Neural Information Processing Systems."},
387
+ {"name": "ICML 2026", "url": "https://icml.cc/", "domain": "aiml",
388
+ "deadline": "2026-01-23", "date": "2026-07-19",
389
+ "description": "International Conference on Machine Learning."},
390
+ {"name": "ICLR 2026", "url": "https://iclr.cc/", "domain": "aiml",
391
+ "deadline": "2025-10-01", "date": "2026-04-24",
392
+ "description": "International Conference on Learning Representations."},
393
+ {"name": "CVPR 2026", "url": "https://cvpr.thecvf.com/", "domain": "aiml",
394
+ "deadline": "2025-11-14", "date": "2026-06-15",
395
+ "description": "IEEE/CVF Conference on Computer Vision and Pattern Recognition."},
396
+ {"name": "ACL 2026", "url": "https://www.aclweb.org/", "domain": "aiml",
397
+ "deadline": "2026-02-20", "date": "2026-08-02",
398
+ "description": "Annual Meeting of the Association for Computational Linguistics."},
399
+ {"name": "IEEE S&P 2026", "url": "https://www.ieee-security.org/TC/SP/", "domain": "security",
400
+ "deadline": "2025-06-05", "date": "2026-05-18",
401
+ "description": "IEEE Symposium on Security and Privacy."},
402
+ {"name": "USENIX Security 2026", "url": "https://www.usenix.org/conference/usenixsecurity/", "domain": "security",
403
+ "deadline": "2026-02-04", "date": "2026-08-12",
404
+ "description": "USENIX Security Symposium."},
405
+ {"name": "CCS 2026", "url": "https://www.sigsac.org/ccs/", "domain": "security",
406
+ "deadline": "2026-05-01", "date": "2026-11-09",
407
+ "description": "ACM Conference on Computer and Communications Security."},
408
+ {"name": "Black Hat USA 2026", "url": "https://www.blackhat.com/", "domain": "security",
409
+ "deadline": "2026-04-01", "date": "2026-08-04",
410
+ "description": "Black Hat USA."},
411
+ {"name": "DEF CON 34", "url": "https://defcon.org/", "domain": "security",
412
+ "deadline": "2026-05-01", "date": "2026-08-06",
413
+ "description": "DEF CON hacker conference."},
414
+ ])
415
+
416
+ # ---------------------------------------------------------------------------
417
+ # GitHub projects (OSSInsight) config
418
+ # ---------------------------------------------------------------------------
419
+
420
+ OSSINSIGHT_API = "https://api.ossinsight.io/v1"
421
+
422
+ _github_cfg = _cfg.get("github", {})
423
+
424
+ OSSINSIGHT_COLLECTIONS = {}
425
+ for _coll in _github_cfg.get("collections", []):
426
+ if isinstance(_coll, dict):
427
+ OSSINSIGHT_COLLECTIONS[_coll["id"]] = (_coll["name"], _coll.get("domain", "aiml"))
428
+ elif isinstance(_coll, int):
429
+ OSSINSIGHT_COLLECTIONS[_coll] = (str(_coll), "aiml")
430
+
431
+ if not OSSINSIGHT_COLLECTIONS:
432
+ OSSINSIGHT_COLLECTIONS = {
433
+ 10010: ("Artificial Intelligence", "aiml"),
434
+ 10076: ("LLM Tools", "aiml"),
435
+ 10098: ("AI Agent Frameworks", "aiml"),
436
+ 10087: ("LLM DevTools", "aiml"),
437
+ 10079: ("Stable Diffusion Ecosystem", "aiml"),
438
+ 10075: ("ChatGPT Alternatives", "aiml"),
439
+ 10094: ("Vector Database", "aiml"),
440
+ 10095: ("GraphRAG", "aiml"),
441
+ 10099: ("MCP Client", "aiml"),
442
+ 10058: ("MLOps Tools", "aiml"),
443
+ 10051: ("Security Tool", "security"),
444
+ 10082: ("Web Scanner", "security"),
445
+ }
446
+
447
+ OSSINSIGHT_TRENDING_LANGUAGES = ["Python", "Rust", "Go", "TypeScript", "C++"]
448
+
449
+ GITHUB_AIML_KEYWORDS = re.compile(
450
+ r"machine.learn|deep.learn|neural.net|transformer|llm|large.language|"
451
+ r"diffusion|generat.ai|gpt|bert|llama|vision.model|multimodal|"
452
+ r"reinforcement.learn|computer.vision|nlp|natural.language|"
453
+ r"text.to|speech.to|image.generat|video.generat|"
454
+ r"fine.tun|training|inference|quantiz|embedding|vector|"
455
+ r"rag|retrieval.augment|agent|langchain|"
456
+ r"hugging.?face|pytorch|tensorflow|jax|"
457
+ r"stable.diffusion|comfyui|ollama|vllm|"
458
+ r"tokeniz|dataset|benchmark|model.serv|mlops",
459
+ re.IGNORECASE,
460
+ )
461
+
462
+ GITHUB_SECURITY_KEYWORDS = re.compile(
463
+ r"security|pentest|penetration.test|vulnerability|exploit|"
464
+ r"fuzzing|fuzz|malware|scanner|scanning|"
465
+ r"intrusion|ransomware|phishing|"
466
+ r"reverse.engineer|decompil|disassembl|"
467
+ r"ctf|capture.the.flag|"
468
+ r"firewall|ids\b|ips\b|siem|"
469
+ r"password|credential|auth|"
470
+ r"xss|csrf|injection|"
471
+ r"osint|reconnaissance|recon|"
472
+ r"forensic|incident.response|"
473
+ r"encryption|cryptograph|"
474
+ r"burp|nuclei|nmap|metasploit|wireshark",
475
+ re.IGNORECASE,
476
+ )
477
+
478
+ # ---------------------------------------------------------------------------
479
+ # Helpers
480
+ # ---------------------------------------------------------------------------
481
+
482
+ def get_enabled_domains() -> list[str]:
483
+ """Return list of enabled domain keys."""
484
+ if not _domains_cfg:
485
+ return ["aiml", "security"]
486
+ return [k for k, v in _domains_cfg.items() if v.get("enabled", True)]
487
+
488
+
489
+ def get_domain_label(domain: str) -> str:
490
+ """Return human-readable label for a domain."""
491
+ if _domains_cfg and domain in _domains_cfg:
492
+ return _domains_cfg[domain].get("label", domain.upper())
493
+ return {"aiml": "AI/ML", "security": "Security"}.get(domain, domain.upper())
494
+
495
+
496
+ def save_config(data: dict):
497
+ """Write config data to config.yaml."""
498
+ import yaml
499
+ with open(CONFIG_PATH, "w") as f:
500
+ yaml.dump(data, f, default_flow_style=False, sort_keys=False)
501
+ log.info("Config saved to %s", CONFIG_PATH)
502
+ global _cfg, FIRST_RUN, SCORING_CONFIGS
503
+ _cfg = data
504
+ FIRST_RUN = False
505
+ SCORING_CONFIGS.update(_build_scoring_configs())
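A note on the axis field names in `src/config.py`: the prompt builders derive each JSON field name from the axis `name` with a chain of `str.replace` calls, while `_build_scoring_configs()` hard-codes the `"axes"` lists. The sketch below copies that normalisation and the default values from the diff to check whether the two agree. `axis_field`, `DEFAULTS` and `field_mismatches` are hypothetical helper names introduced here for illustration; they are not part of the commit.

```python
def axis_field(name: str) -> str:
    """Mirror the field-name normalisation used by the prompt builders."""
    return name.lower().replace(" ", "_").replace("&", "and").replace("/", "_")


# Default axis names and hard-coded "axes" lists, copied from config.py.
DEFAULTS = {
    "aiml": (
        ["Code & Weights", "Novelty", "Practical Applicability"],
        ["code_weights", "novelty", "practical_applicability"],
    ),
    "security": (
        ["Has Code/PoC", "Novel Attack Surface", "Real-World Impact"],
        ["has_code", "novel_attack_surface", "real_world_impact"],
    ),
}


def field_mismatches(names: list[str], expected: list[str]) -> list[tuple[str, str]]:
    """Return (derived, expected) pairs where the prompt field differs."""
    derived = [axis_field(n) for n in names]
    return [(d, e) for d, e in zip(derived, expected) if d != e]


if __name__ == "__main__":
    for domain, (names, expected) in DEFAULTS.items():
        print(domain, field_mismatches(names, expected))
```

With the default axes, the prompt would ask the model for `code_and_weights`, `has_code_poc` and `real-world_impact` (the hyphen is not replaced), whereas the config lists `code_weights`, `has_code` and `real_world_impact`. Whether this matters depends on how `src/scoring.py` maps response fields to `score_axis_1..3`, which is not shown in this part of the diff. If it looks fields up by the hard-coded names, deriving both lists from one helper may be the safer design.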
src/db.py ADDED
@@ -0,0 +1,870 @@
1
+ """Database layer — SQLite schema, connection, and query helpers."""
2
+
3
+ import json
4
+ import logging
5
+ import sqlite3
6
+ from contextlib import contextmanager
7
+ from datetime import datetime, timezone
8
+ from pathlib import Path
9
+
10
+ log = logging.getLogger(__name__)
11
+
12
+
13
+ def get_db_path() -> Path:
14
+ from src.config import DB_PATH
15
+ return DB_PATH
16
+
17
+
18
+ @contextmanager
19
+ def get_conn():
20
+ """Yield a SQLite connection with WAL mode and foreign keys."""
21
+ path = get_db_path()
22
+ path.parent.mkdir(parents=True, exist_ok=True)
23
+ conn = sqlite3.connect(str(path))
24
+ conn.row_factory = sqlite3.Row
25
+ conn.execute("PRAGMA journal_mode=WAL")
26
+ conn.execute("PRAGMA foreign_keys=ON")
27
+ try:
28
+ yield conn
29
+ conn.commit()
30
+ except Exception:
31
+ conn.rollback()
32
+ log.exception("Database transaction failed")
33
+ raise
34
+ finally:
35
+ conn.close()
36
+
37
+
38
+ def init_db():
39
+ """Create tables if they don't exist."""
40
+ with get_conn() as conn:
41
+ conn.executescript(SCHEMA)
42
+ for sql in _MIGRATIONS:
43
+ try:
44
+ conn.execute(sql)
45
+ except sqlite3.OperationalError as e:
46
+ if "duplicate column" in str(e).lower() or "already exists" in str(e).lower():
47
+ pass # Expected — column/index already exists
48
+ else:
49
+ log.warning("Migration failed: %s — %s", sql.strip()[:60], e)
50
+
51
+
52
+ SCHEMA = """\
53
+ CREATE TABLE IF NOT EXISTS runs (
54
+ id INTEGER PRIMARY KEY,
55
+ domain TEXT NOT NULL,
56
+ started_at TEXT NOT NULL,
57
+ finished_at TEXT,
58
+ date_start TEXT NOT NULL,
59
+ date_end TEXT NOT NULL,
60
+ paper_count INTEGER DEFAULT 0,
61
+ status TEXT DEFAULT 'running'
62
+ );
63
+
64
+ CREATE TABLE IF NOT EXISTS papers (
65
+ id INTEGER PRIMARY KEY,
66
+ run_id INTEGER REFERENCES runs(id),
67
+ domain TEXT NOT NULL,
68
+ arxiv_id TEXT NOT NULL,
69
+ entry_id TEXT,
70
+ title TEXT NOT NULL,
71
+ authors TEXT,
72
+ abstract TEXT,
73
+ published TEXT,
74
+ categories TEXT,
75
+ pdf_url TEXT,
76
+ arxiv_url TEXT,
77
+ comment TEXT,
78
+ source TEXT,
79
+ github_repo TEXT,
80
+ github_stars INTEGER,
81
+ hf_upvotes INTEGER DEFAULT 0,
82
+ hf_models TEXT,
83
+ hf_datasets TEXT,
84
+ hf_spaces TEXT,
85
+ score_axis_1 REAL,
86
+ score_axis_2 REAL,
87
+ score_axis_3 REAL,
88
+ composite REAL,
89
+ summary TEXT,
90
+ reasoning TEXT,
91
+ code_url TEXT,
92
+ UNIQUE(domain, arxiv_id, run_id)
93
+ );
94
+
95
+ CREATE TABLE IF NOT EXISTS events (
96
+ id INTEGER PRIMARY KEY,
97
+ run_id INTEGER,
98
+ category TEXT NOT NULL,
99
+ title TEXT NOT NULL,
100
+ description TEXT,
101
+ url TEXT,
102
+ event_date TEXT,
103
+ source TEXT,
104
+ relevance_score REAL,
105
+ fetched_at TEXT NOT NULL
106
+ );
107
+
108
+ CREATE TABLE IF NOT EXISTS paper_connections (
109
+ id INTEGER PRIMARY KEY,
110
+ paper_id INTEGER NOT NULL REFERENCES papers(id),
111
+ connected_arxiv_id TEXT,
112
+ connected_s2_id TEXT,
113
+ connected_title TEXT,
114
+ connected_year INTEGER,
115
+ connection_type TEXT NOT NULL,
116
+ in_db_paper_id INTEGER,
117
+ fetched_at TEXT NOT NULL
118
+ );
119
+
120
+ CREATE INDEX IF NOT EXISTS idx_papers_domain_composite
121
+ ON papers(domain, composite DESC);
122
+ CREATE INDEX IF NOT EXISTS idx_papers_run ON papers(run_id);
123
+ CREATE INDEX IF NOT EXISTS idx_events_category ON events(category, event_date);
124
+ CREATE INDEX IF NOT EXISTS idx_connections_paper ON paper_connections(paper_id);
125
+ CREATE INDEX IF NOT EXISTS idx_connections_arxiv ON paper_connections(connected_arxiv_id);
126
+ CREATE INDEX IF NOT EXISTS idx_papers_arxiv_id ON papers(arxiv_id);
127
+ CREATE INDEX IF NOT EXISTS idx_papers_published ON papers(published);
128
+ CREATE INDEX IF NOT EXISTS idx_events_run_id ON events(run_id);
129
+
130
+ CREATE TABLE IF NOT EXISTS github_projects (
131
+ id INTEGER PRIMARY KEY,
132
+ run_id INTEGER REFERENCES runs(id),
133
+ repo_id INTEGER NOT NULL,
134
+ repo_name TEXT NOT NULL,
135
+ description TEXT,
136
+ language TEXT,
137
+ stars INTEGER DEFAULT 0,
138
+ forks INTEGER DEFAULT 0,
139
+ pull_requests INTEGER DEFAULT 0,
140
+ total_score REAL DEFAULT 0,
141
+ collection_names TEXT,
142
+ topics TEXT DEFAULT '[]',
143
+ url TEXT NOT NULL,
144
+ domain TEXT,
145
+ fetched_at TEXT NOT NULL,
146
+ UNIQUE(repo_name, run_id)
147
+ );
148
+
149
+ CREATE INDEX IF NOT EXISTS idx_gh_run ON github_projects(run_id);
150
+ CREATE INDEX IF NOT EXISTS idx_gh_domain ON github_projects(domain, total_score DESC);
151
+ CREATE INDEX IF NOT EXISTS idx_gh_repo ON github_projects(repo_name);
152
+
153
+ CREATE TABLE IF NOT EXISTS user_signals (
154
+ id INTEGER PRIMARY KEY,
155
+ paper_id INTEGER NOT NULL REFERENCES papers(id),
156
+ action TEXT NOT NULL CHECK(action IN ('save','view','upvote','downvote','dismiss')),
157
+ created_at TEXT NOT NULL,
158
+ metadata TEXT DEFAULT '{}'
159
+ );
160
+
161
+ CREATE UNIQUE INDEX IF NOT EXISTS idx_signals_paper_action
162
+ ON user_signals(paper_id, action) WHERE action != 'view';
163
+ CREATE INDEX IF NOT EXISTS idx_signals_created ON user_signals(created_at);
164
+ CREATE INDEX IF NOT EXISTS idx_signals_paper ON user_signals(paper_id);
165
+
166
+ CREATE TABLE IF NOT EXISTS user_preferences (
167
+ id INTEGER PRIMARY KEY,
168
+ pref_key TEXT NOT NULL UNIQUE,
169
+ pref_value REAL NOT NULL DEFAULT 0.0,
170
+ signal_count INTEGER NOT NULL DEFAULT 0,
171
+ updated_at TEXT NOT NULL
172
+ );
173
+
174
+ CREATE INDEX IF NOT EXISTS idx_prefs_key ON user_preferences(pref_key);
175
+ """
176
+
177
+ # Columns added after initial schema — idempotent via try/except
178
+ _MIGRATIONS = [
179
+ "ALTER TABLE papers ADD COLUMN s2_tldr TEXT",
180
+ "ALTER TABLE papers ADD COLUMN s2_paper_id TEXT",
181
+ "ALTER TABLE papers ADD COLUMN topics TEXT DEFAULT '[]'",
182
+ "CREATE UNIQUE INDEX IF NOT EXISTS idx_events_unique ON events(title, category)",
183
+ ]
184
+
185
+
186
+ # ---------------------------------------------------------------------------
187
+ # Run helpers
188
+ # ---------------------------------------------------------------------------
189
+
190
+ def create_run(domain: str, date_start: str, date_end: str) -> int:
191
+ """Insert a new pipeline run, return its ID."""
192
+ now = datetime.now(timezone.utc).isoformat()
193
+ with get_conn() as conn:
194
+ cur = conn.execute(
195
+ "INSERT INTO runs (domain, started_at, date_start, date_end, status) "
196
+ "VALUES (?, ?, ?, ?, 'running')",
197
+ (domain, now, date_start, date_end),
198
+ )
199
+ return cur.lastrowid
200
+
201
+
202
+ def finish_run(run_id: int, paper_count: int, status: str = "completed"):
203
+ now = datetime.now(timezone.utc).isoformat()
204
+ with get_conn() as conn:
205
+ conn.execute(
206
+ "UPDATE runs SET finished_at=?, paper_count=?, status=? WHERE id=?",
207
+ (now, paper_count, status, run_id),
208
+ )
209
+
210
+
211
+ def get_latest_run(domain: str) -> dict | None:
212
+ with get_conn() as conn:
213
+ row = conn.execute(
214
+ "SELECT * FROM runs WHERE domain=? ORDER BY id DESC LIMIT 1",
215
+ (domain,),
216
+ ).fetchone()
217
+ return dict(row) if row else None
218
+
219
+
220
+ def get_run(run_id: int) -> dict | None:
221
+ with get_conn() as conn:
222
+ row = conn.execute("SELECT * FROM runs WHERE id=?", (run_id,)).fetchone()
223
+ return dict(row) if row else None
224
+
225
+
226
+ # ---------------------------------------------------------------------------
227
+ # Paper helpers
228
+ # ---------------------------------------------------------------------------
229
+
230
+ def _serialize_json(val):
231
+ """JSON-encode lists/dicts for storage."""
232
+ if isinstance(val, (list, dict)):
233
+ return json.dumps(val)
234
+ return val
235
+
236
+
237
+ def insert_papers(papers: list[dict], run_id: int, domain: str):
238
+ """Bulk-insert papers into the DB."""
239
+ with get_conn() as conn:
240
+ for p in papers:
241
+ conn.execute(
242
+ """INSERT OR IGNORE INTO papers
243
+ (run_id, domain, arxiv_id, entry_id, title, authors, abstract,
244
+ published, categories, pdf_url, arxiv_url, comment, source,
245
+ github_repo, github_stars, hf_upvotes, hf_models, hf_datasets, hf_spaces)
246
+ VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)""",
247
+ (
248
+ run_id, domain,
249
+ p.get("arxiv_id", ""),
250
+ p.get("entry_id", ""),
251
+ p.get("title", ""),
252
+ _serialize_json(p.get("authors", [])),
253
+ p.get("abstract", ""),
254
+ p.get("published", ""),
255
+ _serialize_json(p.get("categories", [])),
256
+ p.get("pdf_url", ""),
257
+ p.get("arxiv_url", ""),
258
+ p.get("comment", ""),
259
+ p.get("source", ""),
260
+ p.get("github_repo", ""),
261
+ p.get("github_stars"),
262
+ p.get("hf_upvotes", 0),
263
+ _serialize_json(p.get("hf_models", [])),
264
+ _serialize_json(p.get("hf_datasets", [])),
265
+ _serialize_json(p.get("hf_spaces", [])),
266
+ ),
267
+ )
268
+
269
+
270
+ def update_paper_scores(paper_id: int, scores: dict):
271
+ """Update a paper's scores after Claude scoring."""
272
+ with get_conn() as conn:
273
+ conn.execute(
274
+ """UPDATE papers SET
275
+ score_axis_1=?, score_axis_2=?, score_axis_3=?,
276
+ composite=?, summary=?, reasoning=?, code_url=?
277
+ WHERE id=?""",
278
+ (
279
+ scores.get("score_axis_1"),
280
+ scores.get("score_axis_2"),
281
+ scores.get("score_axis_3"),
282
+ scores.get("composite"),
283
+ scores.get("summary", ""),
284
+ scores.get("reasoning", ""),
285
+ scores.get("code_url"),
286
+ paper_id,
287
+ ),
288
+ )
289
+
290
+
291
+ def get_unscored_papers(run_id: int) -> list[dict]:
292
+ """Get papers from a run that haven't been scored yet."""
293
+ with get_conn() as conn:
294
+ rows = conn.execute(
295
+ "SELECT * FROM papers WHERE run_id=? AND composite IS NULL",
296
+ (run_id,),
297
+ ).fetchall()
298
+ return [_deserialize_paper(row) for row in rows]
299
+
300
+
301
+ def get_top_papers(domain: str, run_id: int | None = None, limit: int = 20) -> list[dict]:
302
+ """Get top-scored papers for a domain, optionally from a specific run."""
303
+ with get_conn() as conn:
304
+ if run_id:
305
+ rows = conn.execute(
306
+ "SELECT * FROM papers WHERE domain=? AND run_id=? AND composite IS NOT NULL "
307
+ "ORDER BY composite DESC LIMIT ?",
308
+ (domain, run_id, limit),
309
+ ).fetchall()
310
+ else:
311
+ # Latest run
312
+ latest = get_latest_run(domain)
313
+ if not latest:
314
+ return []
315
+ rows = conn.execute(
316
+ "SELECT * FROM papers WHERE domain=? AND run_id=? AND composite IS NOT NULL "
317
+ "ORDER BY composite DESC LIMIT ?",
318
+ (domain, latest["id"], limit),
319
+ ).fetchall()
320
+ return [_deserialize_paper(row) for row in rows]
321
+
322
+
323
+ def get_paper(paper_id: int) -> dict | None:
324
+ with get_conn() as conn:
325
+ row = conn.execute("SELECT * FROM papers WHERE id=?", (paper_id,)).fetchone()
326
+ return _deserialize_paper(row) if row else None
327
+
328
+
329
+ SORT_OPTIONS = {
330
+ "score": "composite DESC",
331
+ "date": "published DESC",
332
+ "axis1": "score_axis_1 DESC",
333
+ "axis2": "score_axis_2 DESC",
334
+ "axis3": "score_axis_3 DESC",
335
+ "title": "title ASC",
336
+ }
337
+
338
+
339
+ def get_papers_page(domain: str, run_id: int | None = None,
340
+ offset: int = 0, limit: int = 50,
341
+ min_score: float | None = None,
342
+ has_code: bool | None = None,
343
+ search: str | None = None,
344
+ topic: str | None = None,
345
+ sort: str | None = None) -> tuple[list[dict], int]:
346
+ """Paginated, filterable paper list. Returns (papers, total_count)."""
347
+ with get_conn() as conn:
348
+ if not run_id:
349
+ latest = get_latest_run(domain)
350
+ if not latest:
351
+ return [], 0
352
+ run_id = latest["id"]
353
+
354
+ conditions = ["domain=?", "run_id=?", "composite IS NOT NULL"]
355
+ params: list = [domain, run_id]
356
+
357
+ if min_score is not None:
358
+ conditions.append("composite >= ?")
359
+ params.append(min_score)
360
+
361
+ if has_code:
362
+ conditions.append("(code_url IS NOT NULL AND code_url != '')")
363
+
364
+ if search:
365
+ conditions.append("(title LIKE ? OR abstract LIKE ?)")
366
+ params.extend([f"%{search}%", f"%{search}%"])
367
+
368
+ if topic:
369
+ conditions.append("topics LIKE ?")
370
+ params.append(f'%"{topic}"%')
371
+
372
+ where = " AND ".join(conditions)
373
+ order = SORT_OPTIONS.get(sort, "composite DESC")
374
+
375
+ total = conn.execute(
376
+ f"SELECT COUNT(*) FROM papers WHERE {where}", params
377
+ ).fetchone()[0]
378
+
379
+ rows = conn.execute(
380
+ f"SELECT * FROM papers WHERE {where} ORDER BY {order} LIMIT ? OFFSET ?",
381
+ params + [limit, offset],
382
+ ).fetchall()
383
+
384
+ return [_deserialize_paper(row) for row in rows], total
385
+
386
+
387
+ def count_papers(domain: str, run_id: int | None = None, scored_only: bool = False) -> int:
388
+ with get_conn() as conn:
389
+ if not run_id:
390
+ latest = get_latest_run(domain)
391
+ if not latest:
392
+ return 0
393
+ run_id = latest["id"]
394
+ sql = "SELECT COUNT(*) FROM papers WHERE domain=? AND run_id=?"
395
+ if scored_only:
396
+ sql += " AND composite IS NOT NULL"
397
+ row = conn.execute(sql, (domain, run_id)).fetchone()
398
+ return row[0] if row else 0
399
+
400
+
401
+ def _deserialize_paper(row) -> dict:
402
+ """Convert a sqlite3.Row to a dict, parsing JSON fields."""
403
+ d = dict(row)
404
+ for key in ("authors", "categories", "hf_models", "hf_datasets", "hf_spaces", "topics"):
405
+ val = d.get(key)
406
+ if isinstance(val, str):
407
+ try:
408
+ d[key] = json.loads(val)
409
+ except (json.JSONDecodeError, TypeError):
410
+ d[key] = []
411
+ return d
412
+
413
+
414
+ # ---------------------------------------------------------------------------
415
+ # Event helpers
416
+ # ---------------------------------------------------------------------------
417
+
418
+ def insert_events(events: list[dict], run_id: int | None = None):
419
+ now = datetime.now(timezone.utc).isoformat()
420
+ with get_conn() as conn:
421
+ for e in events:
422
+ conn.execute(
423
+ """INSERT OR IGNORE INTO events
424
+ (run_id, category, title, description, url, event_date,
425
+ source, relevance_score, fetched_at)
426
+ VALUES (?,?,?,?,?,?,?,?,?)""",
427
+ (
428
+ run_id,
429
+ e.get("category", ""),
430
+ e.get("title", ""),
431
+ e.get("description", ""),
432
+ e.get("url", ""),
433
+ e.get("event_date", ""),
434
+ e.get("source", ""),
435
+ e.get("relevance_score"),
436
+ now,
437
+ ),
438
+ )
439
+
440
+
441
+ def get_events(category: str | None = None, limit: int = 50) -> list[dict]:
442
+ with get_conn() as conn:
443
+ if category:
444
+ rows = conn.execute(
445
+ "SELECT * FROM events WHERE category=? ORDER BY event_date DESC LIMIT ?",
446
+ (category, limit),
447
+ ).fetchall()
448
+ else:
449
+ rows = conn.execute(
450
+ "SELECT * FROM events ORDER BY fetched_at DESC LIMIT ?",
451
+ (limit,),
452
+ ).fetchall()
453
+ return [dict(row) for row in rows]
454
+
455
+
456
+ def count_events() -> int:
457
+ with get_conn() as conn:
458
+ return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
459
+
460
+
461
+ # ---------------------------------------------------------------------------
462
+ # Dashboard helpers
463
+ # ---------------------------------------------------------------------------
464
+
465
+ def get_all_runs(limit: int = 20) -> list[dict]:
466
+ with get_conn() as conn:
467
+ rows = conn.execute(
468
+ "SELECT * FROM runs ORDER BY id DESC LIMIT ?", (limit,)
469
+ ).fetchall()
470
+ return [dict(row) for row in rows]
471
+
472
+
473
+ # ---------------------------------------------------------------------------
474
+ # Paper connections (Semantic Scholar)
475
+ # ---------------------------------------------------------------------------
476
+
477
+ def insert_connections(connections: list[dict]):
478
+ """Bulk-insert paper connections."""
479
+ now = datetime.now(timezone.utc).isoformat()
480
+ with get_conn() as conn:
481
+ for c in connections:
482
+ conn.execute(
483
+ """INSERT INTO paper_connections
484
+ (paper_id, connected_arxiv_id, connected_s2_id,
485
+ connected_title, connected_year, connection_type,
486
+ in_db_paper_id, fetched_at)
487
+ VALUES (?,?,?,?,?,?,?,?)""",
488
+ (
489
+ c["paper_id"],
490
+ c.get("connected_arxiv_id", ""),
491
+ c.get("connected_s2_id", ""),
492
+ c.get("connected_title", ""),
493
+ c.get("connected_year"),
494
+ c["connection_type"],
495
+ c.get("in_db_paper_id"),
496
+ now,
497
+ ),
498
+ )
499
+
500
+
501
+ def get_paper_connections(paper_id: int) -> dict:
502
+ """Get connected papers grouped by type."""
503
+ with get_conn() as conn:
504
+ rows = conn.execute(
505
+ "SELECT * FROM paper_connections WHERE paper_id=? "
506
+ "ORDER BY connection_type, connected_year DESC",
507
+ (paper_id,),
508
+ ).fetchall()
509
+
510
+ result = {"references": [], "recommendations": []}
511
+ for row in rows:
512
+ d = dict(row)
513
+ ctype = d["connection_type"]
514
+ if ctype in result:
515
+ result[ctype].append(d)
516
+ return result
517
+
518
+
519
+ def clear_connections(paper_id: int):
520
+ """Remove existing connections for a paper (before re-enrichment)."""
521
+ with get_conn() as conn:
522
+ conn.execute("DELETE FROM paper_connections WHERE paper_id=?", (paper_id,))
523
+
524
+
525
+ def update_paper_s2(paper_id: int, s2_paper_id: str, s2_tldr: str):
526
+ """Update S2 metadata on a paper."""
527
+ with get_conn() as conn:
528
+ conn.execute(
529
+ "UPDATE papers SET s2_paper_id=?, s2_tldr=? WHERE id=?",
530
+ (s2_paper_id, s2_tldr, paper_id),
531
+ )
532
+
533
+
534
+ def update_paper_topics(paper_id: int, topics: list[str]):
535
+ """Update topic tags on a paper."""
536
+ with get_conn() as conn:
537
+ conn.execute(
538
+ "UPDATE papers SET topics=? WHERE id=?",
539
+ (json.dumps(topics), paper_id),
540
+ )
541
+
542
+
543
+ def get_arxiv_id_map(run_id: int) -> dict[str, int]:
544
+ """Return {arxiv_id: paper_db_id} for all papers in a run."""
545
+ with get_conn() as conn:
546
+ rows = conn.execute(
547
+ "SELECT id, arxiv_id FROM papers WHERE run_id=?", (run_id,)
548
+ ).fetchall()
549
+ return {row["arxiv_id"]: row["id"] for row in rows}
550
+
551
+
552
+ def get_available_topics(domain: str, run_id: int) -> list[str]:
553
+ """Get distinct topic tags used in a run."""
554
+ with get_conn() as conn:
555
+ rows = conn.execute(
556
+ "SELECT DISTINCT topics FROM papers "
557
+ "WHERE domain=? AND run_id=? AND topics IS NOT NULL AND topics != '[]'",
558
+ (domain, run_id),
559
+ ).fetchall()
560
+
561
+ all_topics: set[str] = set()
562
+ for row in rows:
563
+ try:
564
+ all_topics.update(json.loads(row["topics"]))
565
+ except (json.JSONDecodeError, TypeError):
566
+ pass
567
+ return sorted(all_topics)
568
+
569
+
570
+ # ---------------------------------------------------------------------------
571
+ # GitHub project helpers
572
+ # ---------------------------------------------------------------------------
573
+
574
+ def insert_github_projects(projects: list[dict], run_id: int):
575
+ """Bulk-insert GitHub projects into the DB."""
576
+ now = datetime.now(timezone.utc).isoformat()
577
+ with get_conn() as conn:
578
+ for p in projects:
579
+ conn.execute(
580
+ """INSERT OR IGNORE INTO github_projects
581
+ (run_id, repo_id, repo_name, description, language,
582
+ stars, forks, pull_requests, total_score,
583
+ collection_names, topics, url, domain, fetched_at)
584
+ VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?)""",
585
+ (
586
+ run_id,
587
+ p.get("repo_id", 0),
588
+ p.get("repo_name", ""),
589
+ p.get("description", ""),
590
+ p.get("language", ""),
591
+ p.get("stars", 0),
592
+ p.get("forks", 0),
593
+ p.get("pull_requests", 0),
594
+ p.get("total_score", 0),
595
+ p.get("collection_names", ""),
596
+ _serialize_json(p.get("topics", [])),
597
+ p.get("url", ""),
598
+ p.get("domain", ""),
599
+ now,
600
+ ),
601
+ )
602
+
603
+
604
+ GH_SORT_OPTIONS = {
605
+ "score": "total_score DESC",
606
+ "stars": "stars DESC",
607
+ "forks": "forks DESC",
608
+ "name": "repo_name ASC",
609
+ }
610
+
611
+
612
+ def get_github_projects_page(
613
+ run_id: int | None = None,
614
+ offset: int = 0,
615
+ limit: int = 50,
616
+ search: str | None = None,
617
+ language: str | None = None,
618
+ domain: str | None = None,
619
+ sort: str | None = None,
620
+ ) -> tuple[list[dict], int]:
621
+ """Paginated, filterable GitHub project list."""
622
+ with get_conn() as conn:
623
+ if not run_id:
624
+ latest = get_latest_run("github")
625
+ if not latest:
626
+ return [], 0
627
+ run_id = latest["id"]
628
+
629
+ conditions = ["run_id=?"]
630
+ params: list = [run_id]
631
+
632
+ if search:
633
+ conditions.append("(repo_name LIKE ? OR description LIKE ?)")
634
+ params.extend([f"%{search}%", f"%{search}%"])
635
+
636
+ if language:
637
+ conditions.append("language=?")
638
+ params.append(language)
639
+
640
+ if domain:
641
+ conditions.append("domain=?")
642
+ params.append(domain)
643
+
644
+ where = " AND ".join(conditions)
645
+ order = GH_SORT_OPTIONS.get(sort, "total_score DESC")
646
+
647
+ total = conn.execute(
648
+ f"SELECT COUNT(*) FROM github_projects WHERE {where}", params
649
+ ).fetchone()[0]
650
+
651
+ rows = conn.execute(
652
+ f"SELECT * FROM github_projects WHERE {where} ORDER BY {order} LIMIT ? OFFSET ?",
653
+ params + [limit, offset],
654
+ ).fetchall()
655
+
656
+ return [_deserialize_gh_project(row) for row in rows], total
657
+
658
+
659
+ def get_top_github_projects(run_id: int | None = None, limit: int = 10) -> list[dict]:
660
+ """Get top GitHub projects by score."""
661
+ with get_conn() as conn:
662
+ if not run_id:
663
+ latest = get_latest_run("github")
664
+ if not latest:
665
+ return []
666
+ run_id = latest["id"]
667
+ rows = conn.execute(
668
+ "SELECT * FROM github_projects WHERE run_id=? ORDER BY total_score DESC LIMIT ?",
669
+ (run_id, limit),
670
+ ).fetchall()
671
+ return [_deserialize_gh_project(row) for row in rows]
672
+
673
+
674
+ def count_github_projects(run_id: int | None = None) -> int:
675
+ with get_conn() as conn:
676
+ if not run_id:
677
+ latest = get_latest_run("github")
678
+ if not latest:
679
+ return 0
680
+ run_id = latest["id"]
681
+ return conn.execute(
682
+ "SELECT COUNT(*) FROM github_projects WHERE run_id=?", (run_id,)
683
+ ).fetchone()[0]
684
+
685
+
686
+ def get_github_languages(run_id: int) -> list[str]:
687
+ """Get distinct languages in a GitHub run."""
688
+ with get_conn() as conn:
689
+ rows = conn.execute(
690
+ "SELECT DISTINCT language FROM github_projects "
691
+ "WHERE run_id=? AND language IS NOT NULL AND language != '' "
692
+ "ORDER BY language",
693
+ (run_id,),
694
+ ).fetchall()
695
+ return [row["language"] for row in rows]
696
+
697
+
698
+ def _deserialize_gh_project(row) -> dict:
699
+ d = dict(row)
700
+ for key in ("topics",):
701
+ val = d.get(key)
702
+ if isinstance(val, str):
703
+ try:
704
+ d[key] = json.loads(val)
705
+ except (json.JSONDecodeError, TypeError):
706
+ d[key] = []
707
+ return d
708
+
709
+
710
+ # ---------------------------------------------------------------------------
711
+ # User signal helpers (preference learning)
712
+ # ---------------------------------------------------------------------------
713
+
714
+ def insert_signal(paper_id: int, action: str, metadata: dict | None = None) -> bool:
715
+ """Record a user signal. Returns True if inserted, False if duplicate.
716
+
717
+ Views are deduped by 5-minute window. Other actions use UNIQUE constraint.
718
+ """
719
+ now = datetime.now(timezone.utc).isoformat()
720
+ meta_json = json.dumps(metadata or {})
721
+ with get_conn() as conn:
722
+ if action == "view":
723
+ # Dedup views within 5-minute window
724
+ recent = conn.execute(
725
+ "SELECT 1 FROM user_signals "
726
+ "WHERE paper_id=? AND action='view' "
727
+ "AND datetime(created_at) > datetime(?, '-5 minutes')",  # normalize ISO 'T' timestamps so the comparison is not a raw string compare
728
+ (paper_id, now),
729
+ ).fetchone()
730
+ if recent:
731
+ return False
732
+ conn.execute(
733
+ "INSERT INTO user_signals (paper_id, action, created_at, metadata) "
734
+ "VALUES (?, ?, ?, ?)",
735
+ (paper_id, action, now, meta_json),
736
+ )
737
+ return True
738
+ else:
739
+ try:
740
+ conn.execute(
741
+ "INSERT INTO user_signals (paper_id, action, created_at, metadata) "
742
+ "VALUES (?, ?, ?, ?)",
743
+ (paper_id, action, now, meta_json),
744
+ )
745
+ return True
746
+ except sqlite3.IntegrityError:
747
+ return False
748
+
749
+
750
+ def delete_signal(paper_id: int, action: str) -> bool:
751
+ """Remove a signal (for toggling off). Returns True if deleted."""
752
+ with get_conn() as conn:
753
+ cur = conn.execute(
754
+ "DELETE FROM user_signals WHERE paper_id=? AND action=?",
755
+ (paper_id, action),
756
+ )
757
+ return cur.rowcount > 0
758
+
759
+
760
+ def get_paper_signal(paper_id: int) -> str | None:
761
+ """Return the user's latest non-view signal for a paper, or None."""
762
+ with get_conn() as conn:
763
+ row = conn.execute(
764
+ "SELECT action FROM user_signals "
765
+ "WHERE paper_id=? AND action != 'view' "
766
+ "ORDER BY created_at DESC LIMIT 1",
767
+ (paper_id,),
768
+ ).fetchone()
769
+ return row["action"] if row else None
770
+
771
+
772
+ def get_paper_signals_batch(paper_ids: list[int]) -> dict[int, str]:
773
+ """Batch fetch latest non-view signal per paper. Returns {paper_id: action}."""
774
+ if not paper_ids:
775
+ return {}
776
+ with get_conn() as conn:
777
+ placeholders = ",".join("?" for _ in paper_ids)
778
+ rows = conn.execute(
779
+ f"SELECT paper_id, action FROM user_signals "
780
+ f"WHERE paper_id IN ({placeholders}) AND action != 'view' "
781
+ f"ORDER BY created_at DESC",
782
+ paper_ids,
783
+ ).fetchall()
784
+ result: dict[int, str] = {}
785
+ for row in rows:
786
+ pid = row["paper_id"]
787
+ if pid not in result:
788
+ result[pid] = row["action"]
789
+ return result
790
+
791
+
792
+ def get_all_signals_with_papers() -> list[dict]:
793
+ """Join signals with paper data for preference computation."""
794
+ with get_conn() as conn:
795
+ rows = conn.execute(
796
+ """SELECT s.id as signal_id, s.paper_id, s.action, s.created_at,
797
+ p.title, p.categories, p.topics, p.authors, p.domain,
798
+ p.score_axis_1, p.score_axis_2, p.score_axis_3, p.composite
799
+ FROM user_signals s
800
+ JOIN papers p ON s.paper_id = p.id
801
+ ORDER BY s.created_at DESC"""
802
+ ).fetchall()
803
+ results = []
804
+ for row in rows:
805
+ d = dict(row)
806
+ for key in ("categories", "topics", "authors"):
807
+ val = d.get(key)
808
+ if isinstance(val, str):
809
+ try:
810
+ d[key] = json.loads(val)
811
+ except (json.JSONDecodeError, TypeError):
812
+ d[key] = []
813
+ results.append(d)
814
+ return results
815
+
816
+
817
+ def get_signal_counts() -> dict[str, int]:
818
+ """Summary stats: count per action type."""
819
+ with get_conn() as conn:
820
+ rows = conn.execute(
821
+ "SELECT action, COUNT(*) as cnt FROM user_signals GROUP BY action"
822
+ ).fetchall()
823
+ return {row["action"]: row["cnt"] for row in rows}
824
+
825
+
826
+ def save_preferences(prefs: dict[str, tuple[float, int]]):
827
+ """Bulk write preferences. prefs = {key: (value, signal_count)}."""
828
+ now = datetime.now(timezone.utc).isoformat()
829
+ with get_conn() as conn:
830
+ conn.execute("DELETE FROM user_preferences")
831
+ for key, (value, count) in prefs.items():
832
+ conn.execute(
833
+ "INSERT INTO user_preferences (pref_key, pref_value, signal_count, updated_at) "
834
+ "VALUES (?, ?, ?, ?)",
835
+ (key, value, count, now),
836
+ )
837
+
838
+
839
+ def load_preferences() -> dict[str, float]:
840
+ """Load preference profile. Returns {pref_key: pref_value}."""
841
+ with get_conn() as conn:
842
+ rows = conn.execute(
843
+ "SELECT pref_key, pref_value FROM user_preferences"
844
+ ).fetchall()
845
+ return {row["pref_key"]: row["pref_value"] for row in rows}
846
+
847
+
848
+ def get_preferences_detail() -> list[dict]:
849
+ """Load full preference details for the preferences page."""
850
+ with get_conn() as conn:
851
+ rows = conn.execute(
852
+ "SELECT * FROM user_preferences ORDER BY ABS(pref_value) DESC"
853
+ ).fetchall()
854
+ return [dict(row) for row in rows]
855
+
856
+
857
+ def get_preferences_updated_at() -> str | None:
858
+ """Return when preferences were last computed."""
859
+ with get_conn() as conn:
860
+ row = conn.execute(
861
+ "SELECT updated_at FROM user_preferences ORDER BY updated_at DESC LIMIT 1"
862
+ ).fetchone()
863
+ return row["updated_at"] if row else None
864
+
865
+
866
+ def clear_preferences():
867
+ """Reset all preferences and signals."""
868
+ with get_conn() as conn:
869
+ conn.execute("DELETE FROM user_preferences")
870
+ conn.execute("DELETE FROM user_signals")
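The 5-minute view dedup in `insert_signal` can be tried in isolation. The following is only a minimal sketch, assuming an in-memory SQLite table reduced to three columns; `record_view` is a hypothetical helper name and not part of `src/db.py`. It wraps both sides of the comparison in SQLite's `datetime()` so that the ISO 8601 `T` separator written by `isoformat()` does not affect the result.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def record_view(conn: sqlite3.Connection, paper_id: int, now: datetime) -> bool:
    """Insert a 'view' signal unless one exists in the last 5 minutes.

    Hypothetical helper for illustration; mirrors the idea in insert_signal.
    """
    ts = now.isoformat()
    recent = conn.execute(
        "SELECT 1 FROM user_signals "
        "WHERE paper_id=? AND action='view' "
        # datetime() on both sides converts to 'YYYY-MM-DD HH:MM:SS' in UTC
        "AND datetime(created_at) > datetime(?, '-5 minutes')",
        (paper_id, ts),
    ).fetchone()
    if recent:
        return False
    conn.execute(
        "INSERT INTO user_signals (paper_id, action, created_at) VALUES (?, 'view', ?)",
        (paper_id, ts),
    )
    return True


conn = sqlite3.connect(":memory:")
# Assumed, simplified schema: only the columns this sketch needs.
conn.execute(
    "CREATE TABLE user_signals (paper_id INTEGER, action TEXT, created_at TEXT)"
)

t0 = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
first = record_view(conn, 1, t0)                           # True: nothing recorded yet
second = record_view(conn, 1, t0 + timedelta(minutes=2))   # False: inside the window
third = record_view(conn, 1, t0 + timedelta(minutes=10))   # True: window has passed
```

Passing `now` in as a parameter is a choice made for the sketch so the window can be exercised without waiting; the real function reads the clock itself.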
src/pipelines/__init__.py ADDED
File without changes
src/pipelines/aiml.py ADDED
@@ -0,0 +1,327 @@
1
+ """AI/ML paper pipeline.
2
+
3
+ Fetches papers from HuggingFace Daily Papers + arXiv, enriches with
4
+ HF ecosystem metadata, and writes to the database.
5
+ """
6
+
7
+ import logging
8
+ import re
9
+ import time
10
+ from datetime import datetime, timedelta, timezone
11
+
12
+ import arxiv
13
+ import requests
14
+
15
+ from src.config import (
16
+ ARXIV_LARGE_CATS,
17
+ ARXIV_SMALL_CATS,
18
+ EXCLUDE_RE,
19
+ GITHUB_URL_RE,
20
+ HF_API,
21
+ HF_MAX_AGE_DAYS,
22
+ INCLUDE_RE,
23
+ MAX_ABSTRACT_CHARS_AIML,
24
+ )
25
+ from src.db import create_run, finish_run, insert_papers
26
+
27
+ log = logging.getLogger(__name__)
28
+
29
+
30
+ # ---------------------------------------------------------------------------
31
+ # HuggingFace API
32
+ # ---------------------------------------------------------------------------
33
+
34
+
35
+ def fetch_hf_daily(date_str: str) -> list[dict]:
36
+ """Fetch HF Daily Papers for a given date."""
37
+ url = f"{HF_API}/daily_papers?date={date_str}"
38
+ try:
39
+ resp = requests.get(url, timeout=30)
40
+ resp.raise_for_status()
41
+ return resp.json()
42
+ except (requests.RequestException, ValueError):
43
+ return []
44
+
45
+
46
+ def fetch_hf_trending(limit: int = 50) -> list[dict]:
47
+ """Fetch HF trending papers."""
48
+ url = f"{HF_API}/daily_papers?sort=trending&limit={limit}"
49
+ try:
50
+ resp = requests.get(url, timeout=30)
51
+ resp.raise_for_status()
52
+ return resp.json()
53
+ except (requests.RequestException, ValueError):
54
+ return []
55
+
56
+
57
+ def arxiv_id_to_date(arxiv_id: str) -> datetime | None:
58
+ """Extract approximate publication date from arXiv ID (YYMM.NNNNN)."""
59
+ match = re.match(r"(\d{2})(\d{2})\.\d+", arxiv_id)
60
+ if not match:
61
+ return None
62
+ year = 2000 + int(match.group(1))
63
+ month = int(match.group(2))
64
+ if not (1 <= month <= 12):
65
+ return None
66
+ return datetime(year, month, 1, tzinfo=timezone.utc)
67
+
68
+
69
+ def normalize_hf_paper(hf_entry: dict) -> dict | None:
70
+ """Convert an HF daily_papers entry to our normalized format.
71
+
72
+ Returns None if the paper is too old.
73
+ """
74
+ paper = hf_entry.get("paper", hf_entry)
75
+ arxiv_id = paper.get("id", "")
76
+
77
+ authors_raw = paper.get("authors", [])
78
+ authors = []
79
+ for a in authors_raw:
80
+ if isinstance(a, dict):
81
+ name = a.get("name", a.get("user", {}).get("fullname", ""))
82
+ if name:
83
+ authors.append(name)
84
+ elif isinstance(a, str):
85
+ authors.append(a)
86
+
87
+ github_repo = hf_entry.get("githubRepo") or paper.get("githubRepo") or ""
88
+
89
+ pub_date = arxiv_id_to_date(arxiv_id)
90
+ if pub_date and (datetime.now(timezone.utc) - pub_date).days > HF_MAX_AGE_DAYS:
91
+ return None
92
+
93
+ return {
94
+ "arxiv_id": arxiv_id,
95
+ "title": paper.get("title", "").replace("\n", " ").strip(),
96
+ "authors": authors[:10],
97
+ "abstract": paper.get("summary", paper.get("abstract", "")).replace("\n", " ").strip(),
98
+ "published": paper.get("publishedAt", paper.get("published", "")),
99
+ "categories": paper.get("categories", []),
100
+ "pdf_url": f"https://arxiv.org/pdf/{arxiv_id}" if arxiv_id else "",
101
+ "arxiv_url": f"https://arxiv.org/abs/{arxiv_id}" if arxiv_id else "",
102
+ "comment": "",
103
+ "source": "hf",
104
+ "hf_upvotes": hf_entry.get("paper", {}).get("upvotes", hf_entry.get("upvotes", 0)),
105
+ "github_repo": github_repo,
106
+ "github_stars": None,
107
+ "hf_models": [],
108
+ "hf_datasets": [],
109
+ "hf_spaces": [],
110
+ }
111
+
112
+
113
+ # ---------------------------------------------------------------------------
114
+ # arXiv fetching
115
+ # ---------------------------------------------------------------------------
116
+
117
+
118
+ def fetch_arxiv_category(
119
+ cat: str,
120
+ start: datetime,
121
+ end: datetime,
122
+ max_results: int,
123
+ filter_keywords: bool,
124
+ ) -> list[dict]:
125
+ """Fetch papers from a single arXiv category."""
126
+ client = arxiv.Client(page_size=200, delay_seconds=3.0, num_retries=3)
127
+
128
+ query = arxiv.Search(
129
+ query=f"cat:{cat}",
130
+ max_results=max_results,
131
+ sort_by=arxiv.SortCriterion.SubmittedDate,
132
+ sort_order=arxiv.SortOrder.Descending,
133
+ )
134
+
135
+ papers = []
136
+ for result in client.results(query):
137
+ pub = result.published.replace(tzinfo=timezone.utc)
138
+ if pub < start:
139
+ break
140
+ if pub > end:
141
+ continue
142
+
143
+ if filter_keywords:
144
+ text = f"{result.title} {result.summary}"
145
+ if not INCLUDE_RE.search(text):
146
+ continue
147
+ if EXCLUDE_RE.search(text):
148
+ continue
149
+
150
+ papers.append(_arxiv_result_to_dict(result))
151
+
152
+ return papers
153
+
154
+
155
+ def _arxiv_result_to_dict(result: arxiv.Result) -> dict:
156
+ """Convert an arxiv.Result to our normalized format."""
157
+ arxiv_id = result.entry_id.split("/abs/")[-1]
158
+ base_id = re.sub(r"v\d+$", "", arxiv_id)
159
+
160
+ github_urls = GITHUB_URL_RE.findall(f"{result.summary} {result.comment or ''}")
161
+ github_repo = github_urls[0].rstrip(".") if github_urls else ""
162
+
163
+ return {
164
+ "arxiv_id": base_id,
165
+ "title": result.title.replace("\n", " ").strip(),
166
+ "authors": [a.name for a in result.authors[:10]],
167
+ "abstract": result.summary.replace("\n", " ").strip(),
168
+ "published": result.published.isoformat(),
169
+ "categories": list(result.categories),
170
+ "pdf_url": result.pdf_url,
171
+ "arxiv_url": result.entry_id,
172
+ "comment": (result.comment or "").replace("\n", " ").strip(),
173
+ "source": "arxiv",
174
+ "hf_upvotes": 0,
175
+ "github_repo": github_repo,
176
+ "github_stars": None,
177
+ "hf_models": [],
178
+ "hf_datasets": [],
179
+ "hf_spaces": [],
180
+ }
181
+
182
+
183
+ # ---------------------------------------------------------------------------
184
+ # Enrichment
185
+ # ---------------------------------------------------------------------------
186
+
187
+
188
+ def enrich_paper(paper: dict) -> dict:
189
+ """Query HF API for linked models, datasets, and spaces."""
190
+ arxiv_id = paper["arxiv_id"]
191
+ if not arxiv_id:
192
+ return paper
193
+
194
+ base_id = re.sub(r"v\d+$", "", arxiv_id)
195
+
196
+ for resource, key, limit in [
197
+ ("models", "hf_models", 5),
198
+ ("datasets", "hf_datasets", 3),
199
+ ("spaces", "hf_spaces", 3),
200
+ ]:
201
+ url = f"{HF_API}/{resource}?filter=arxiv:{base_id}&limit={limit}&sort=likes"
202
+ try:
203
+ resp = requests.get(url, timeout=15)
204
+ if resp.ok:
205
+ items = resp.json()
206
+ paper[key] = [
207
+ {"id": item.get("id", item.get("_id", "")), "likes": item.get("likes", 0)}
208
+ for item in items
209
+ ]
210
+ except (requests.RequestException, ValueError):
211
+ pass
212
+
213
+ time.sleep(0.2)
214
+ return paper
215
+
216
+
217
+ # ---------------------------------------------------------------------------
218
+ # Merge
219
+ # ---------------------------------------------------------------------------
220
+
221
+
222
+ def merge_papers(hf_papers: list[dict], arxiv_papers: list[dict]) -> list[dict]:
223
+ """Deduplicate by arXiv ID. When both sources have a paper, merge."""
224
+ by_id: dict[str, dict] = {}
225
+
226
+ for p in arxiv_papers:
227
+ aid = re.sub(r"v\d+$", "", p["arxiv_id"])
228
+ if aid:
229
+ by_id[aid] = p
230
+
231
+ for p in hf_papers:
232
+ aid = re.sub(r"v\d+$", "", p["arxiv_id"])
233
+ if not aid:
234
+ continue
235
+ if aid in by_id:
236
+ existing = by_id[aid]
237
+ existing["source"] = "both"
238
+ existing["hf_upvotes"] = max(existing.get("hf_upvotes", 0), p.get("hf_upvotes", 0))
239
+ if p.get("github_repo") and not existing.get("github_repo"):
240
+ existing["github_repo"] = p["github_repo"]
241
+ if not existing.get("categories") and p.get("categories"):
242
+ existing["categories"] = p["categories"]
243
+ else:
244
+ by_id[aid] = p
245
+
246
+ return list(by_id.values())
247
+
248
+
249
+ # ---------------------------------------------------------------------------
250
+ # Pipeline entry point
251
+ # ---------------------------------------------------------------------------
252
+
253
+
254
+ def run_aiml_pipeline(
255
+ start: datetime | None = None,
256
+ end: datetime | None = None,
257
+ max_papers: int = 300,
258
+ skip_enrich: bool = False,
259
+ ) -> int:
260
+ """Run the full AI/ML pipeline. Returns the run ID."""
261
+ if end is None:
262
+ end = datetime.now(timezone.utc)
263
+ if start is None:
264
+ start = end - timedelta(days=7)
265
+
266
+ # Ensure timezone-aware
267
+ if start.tzinfo is None:
268
+ start = start.replace(tzinfo=timezone.utc)
269
+ if end.tzinfo is None:
270
+ end = end.replace(tzinfo=timezone.utc, hour=23, minute=59, second=59)
271
+
272
+ run_id = create_run("aiml", start.date().isoformat(), end.date().isoformat())
273
+ log.info("Run %d: %s to %s", run_id, start.date(), end.date())
274
+
275
+ try:
276
+ # Step 1: Fetch HF papers
277
+ log.info("Fetching HuggingFace Daily Papers ...")
278
+ hf_papers_raw = []
279
+ current = start
280
+ while current <= end:
281
+ date_str = current.strftime("%Y-%m-%d")
282
+ daily = fetch_hf_daily(date_str)
283
+ hf_papers_raw.extend(daily)
284
+ current += timedelta(days=1)
285
+
286
+ trending = fetch_hf_trending(limit=50)
287
+ hf_papers_raw.extend(trending)
288
+
289
+ hf_papers = [p for p in (normalize_hf_paper(e) for e in hf_papers_raw) if p is not None]
290
+ log.info("HF papers: %d", len(hf_papers))
291
+
292
+ # Step 2: Fetch arXiv papers
293
+ log.info("Fetching arXiv papers ...")
294
+ arxiv_papers = []
295
+ for cat in ARXIV_LARGE_CATS:
296
+ papers = fetch_arxiv_category(cat, start, end, max_papers, filter_keywords=True)
297
+ arxiv_papers.extend(papers)
298
+ log.info(" %s: %d papers (keyword-filtered)", cat, len(papers))
299
+
300
+ for cat in ARXIV_SMALL_CATS:
301
+ papers = fetch_arxiv_category(cat, start, end, max_papers, filter_keywords=False)
302
+ arxiv_papers.extend(papers)
303
+ log.info(" %s: %d papers", cat, len(papers))
304
+
305
+ # Step 3: Merge
306
+ all_papers = merge_papers(hf_papers, arxiv_papers)
307
+ log.info("Merged: %d unique papers", len(all_papers))
308
+
309
+ # Step 4: Enrich
310
+ if not skip_enrich:
311
+ log.info("Enriching with HF ecosystem links ...")
312
+ for i, paper in enumerate(all_papers):
313
+ all_papers[i] = enrich_paper(paper)
314
+ if (i + 1) % 25 == 0:
315
+ log.info(" Enriched %d/%d ...", i + 1, len(all_papers))
316
+ log.info("Enrichment complete")
317
+
318
+ # Step 5: Insert into DB
319
+ insert_papers(all_papers, run_id, "aiml")
320
+ finish_run(run_id, len(all_papers))
321
+ log.info("Done — %d papers inserted", len(all_papers))
322
+ return run_id
323
+
324
+ except Exception:
325
+ finish_run(run_id, 0, status="failed")
326
+ log.exception("Pipeline failed")
327
+ raise
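The dedup step in `merge_papers` can be illustrated on its own. This is a simplified sketch under stated assumptions, not the pipeline's code: `base_arxiv_id` and `merge_by_arxiv_id` are hypothetical names, the paper dicts carry only three fields, and inputs are copied rather than mutated in place.

```python
import re


def base_arxiv_id(arxiv_id: str) -> str:
    """Drop a trailing version suffix such as 'v2' from an arXiv ID."""
    return re.sub(r"v\d+$", "", arxiv_id)


def merge_by_arxiv_id(hf_papers: list[dict], arxiv_papers: list[dict]) -> list[dict]:
    """Deduplicate by base arXiv ID; arXiv entries win, HF adds upvotes."""
    by_id: dict[str, dict] = {}

    for p in arxiv_papers:
        aid = base_arxiv_id(p["arxiv_id"])
        if aid:
            by_id[aid] = dict(p)  # copy so callers' dicts stay untouched

    for p in hf_papers:
        aid = base_arxiv_id(p["arxiv_id"])
        if not aid:
            continue
        if aid in by_id:
            existing = by_id[aid]
            existing["source"] = "both"
            existing["hf_upvotes"] = max(
                existing.get("hf_upvotes", 0), p.get("hf_upvotes", 0)
            )
        else:
            by_id[aid] = dict(p)

    return list(by_id.values())


merged = merge_by_arxiv_id(
    hf_papers=[
        {"arxiv_id": "2501.00001", "source": "hf", "hf_upvotes": 12},
        {"arxiv_id": "2501.00002", "source": "hf", "hf_upvotes": 3},
    ],
    arxiv_papers=[
        {"arxiv_id": "2501.00001v2", "source": "arxiv", "hf_upvotes": 0},
    ],
)
# Two unique papers remain; the first is marked "both" with 12 upvotes.
```

The IDs and upvote counts above are made-up sample values, chosen only to show one overlapping and one HF-only paper.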
src/pipelines/events.py ADDED
@@ -0,0 +1,196 @@
1
+ """Events pipeline — conferences, releases, and news.
2
+
3
+ Three sub-collectors:
4
+ 1. Conferences: curated list + aideadlin.es scrape
5
+ 2. Releases: HF trending models/spaces
6
+ 3. News: RSS feeds from key AI/security blogs
7
+ """
8
+
9
+ import logging
10
+ import time
11
+ from datetime import datetime, timezone
12
+
13
+ import feedparser
14
+ import requests
15
+
16
+ from src.config import CONFERENCES, HF_API, RSS_FEEDS
17
+ from src.db import insert_events
18
+
19
+ log = logging.getLogger(__name__)
20
+
21
+
22
+ def run_events_pipeline() -> int:
23
+ """Run all event sub-collectors. Returns total events collected."""
24
+ log.info("Starting events pipeline ...")
25
+ all_events = []
26
+
27
+ # 1. Conference deadlines
28
+ conf_events = fetch_conference_deadlines()
29
+ all_events.extend(conf_events)
30
+ log.info("Conferences: %d", len(conf_events))
31
+
32
+ # 2. HF trending releases
33
+ release_events = fetch_hf_releases()
34
+ all_events.extend(release_events)
35
+ log.info("Releases: %d", len(release_events))
36
+
37
+ # 3. RSS news
38
+ news_events = fetch_rss_news()
39
+ all_events.extend(news_events)
40
+ log.info("News: %d", len(news_events))
41
+
42
+ if all_events:
43
+ insert_events(all_events)
44
+
45
+ log.info("Done — %d total events", len(all_events))
46
+ return len(all_events)
47
+
48
+
49
+ # ---------------------------------------------------------------------------
50
+ # Conferences
51
+ # ---------------------------------------------------------------------------
52
+
53
+
54
+ def fetch_conference_deadlines() -> list[dict]:
55
+ """Return curated conference list as events + try aideadlin.es."""
56
+ events = []
57
+
58
+ # Static curated list
59
+ for conf in CONFERENCES:
60
+ deadline = conf.get("deadline", "")
61
+ conf_date = conf.get("date", "")
62
+ desc = conf.get("description", "")
63
+ if deadline and conf_date:
64
+ desc = f"{desc} Deadline: {deadline}. Conference: {conf_date}."
65
+ elif deadline:
66
+ desc = f"{desc} Deadline: {deadline}."
67
+ elif conf_date:
68
+ desc = f"{desc} Conference: {conf_date}."
69
+ events.append({
70
+ "category": "conference",
71
+ "title": conf["name"],
72
+ "description": desc,
73
+ "url": conf["url"],
74
+ "event_date": deadline or conf_date or "",
75
+ "source": "curated",
76
+ })
77
+
78
+ # Try aideadlin.es for dynamic deadlines
79
+ try:
80
+ resp = requests.get("https://aideadlin.es/ai-deadlines.json", timeout=15)
81
+ if resp.ok:
82
+ deadlines = resp.json()
83
+ for d in deadlines:
84
+ if d.get("deadline", "TBA") == "TBA":
85
+ continue
86
+ events.append({
87
+ "category": "conference",
88
+ "title": d.get("title", d.get("name", "")),
89
+ "description": d.get("full_name", ""),
90
+ "url": d.get("link", ""),
91
+ "event_date": d.get("deadline", ""),
92
+ "source": "aideadlin.es",
93
+ })
94
+ except (requests.RequestException, ValueError) as e:
95
+ log.warning("aideadlin.es fetch failed: %s", e)
96
+
97
+ return events
98
+
99
+
100
+ # ---------------------------------------------------------------------------
101
+ # HF/GitHub releases
102
+ # ---------------------------------------------------------------------------
103
+
104
+
105
+ def fetch_hf_releases() -> list[dict]:
106
+ """Fetch trending models and spaces from HuggingFace."""
107
+ events = []
108
+
109
+ # Trending models
110
+ try:
111
+ resp = requests.get(
112
+ f"{HF_API}/models",
113
+ params={"sort": "trending", "limit": 15},
114
+ timeout=15,
115
+ )
116
+ if resp.ok:
117
+ for model in resp.json():
118
+ events.append({
119
+ "category": "release",
120
+ "title": model.get("id", ""),
121
+ "description": f"Trending model — {model.get('likes', 0)} likes, "
122
+ f"{model.get('downloads', 0)} downloads",
123
+ "url": f"https://huggingface.co/{model.get('id', '')}",
124
+ "event_date": model.get("lastModified", ""),
125
+ "source": "huggingface",
126
+ "relevance_score": None,
127
+ })
128
+ except (requests.RequestException, ValueError):
129
+ pass
130
+
131
+ time.sleep(0.5)
132
+
133
+ # Trending spaces
134
+ try:
135
+ resp = requests.get(
136
+ f"{HF_API}/spaces",
137
+ params={"sort": "trending", "limit": 10},
138
+ timeout=15,
139
+ )
140
+ if resp.ok:
141
+ for space in resp.json():
142
+ events.append({
143
+ "category": "release",
144
+ "title": f"Space: {space.get('id', '')}",
145
+ "description": f"Trending space — {space.get('likes', 0)} likes",
146
+ "url": f"https://huggingface.co/spaces/{space.get('id', '')}",
147
+ "event_date": space.get("lastModified", ""),
148
+ "source": "huggingface",
149
+ "relevance_score": None,
150
+ })
151
+ except (requests.RequestException, ValueError):
152
+ pass
153
+
154
+ return events
155
+
156
+
157
+ # ---------------------------------------------------------------------------
158
+ # RSS news
159
+ # ---------------------------------------------------------------------------
160
+
161
+
162
+ def fetch_rss_news() -> list[dict]:
163
+ """Fetch recent entries from configured RSS feeds."""
164
+ events = []
165
+
166
+ for feed_config in RSS_FEEDS:
167
+ try:
168
+ feed = feedparser.parse(feed_config["url"])
169
+ for entry in feed.entries[:5]:
170
+ published = ""
171
+ if hasattr(entry, "published"):
172
+ published = entry.published
173
+ elif hasattr(entry, "updated"):
174
+ published = entry.updated
175
+
176
+ events.append({
177
+ "category": "news",
178
+ "title": entry.get("title", ""),
179
+ "description": _clean_html(entry.get("summary", ""))[:300],
180
+ "url": entry.get("link", ""),
181
+ "event_date": published,
182
+ "source": feed_config["name"],
183
+ "relevance_score": None,
184
+ })
185
+ except Exception as e:
186
+ log.warning("RSS fetch failed for %s: %s", feed_config['name'], e)
187
+ time.sleep(0.3)
188
+
189
+ return events
190
+
191
+
192
+ def _clean_html(text: str) -> str:
193
+ """Strip HTML tags from text."""
194
+ import re
195
+ clean = re.sub(r"<[^>]+>", "", text)
196
+ return clean.replace("\n", " ").strip()
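The RSS collector strips tags from each entry summary and truncates it to 300 characters. A possible standalone version of that cleanup is sketched below, using only the standard library. `summarize` is a hypothetical name, and the entity decoding via `html.unescape` and the whitespace collapsing are additions that the original `_clean_html` does not perform.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")


def summarize(raw: str, max_chars: int = 300) -> str:
    """Strip HTML tags, decode entities, collapse whitespace, then truncate."""
    text = _TAG_RE.sub("", raw)      # same regex approach as _clean_html
    text = html.unescape(text)       # assumption: feeds may contain &amp; etc.
    text = " ".join(text.split())    # collapse newlines and runs of spaces
    return text[:max_chars]


example = summarize(
    "<p>New <b>model</b> released &amp; benchmarked.</p>\n<p>Details inside.</p>"
)
# example == "New model released & benchmarked. Details inside."
```

A regex is enough for short feed summaries like these; it is not a general HTML sanitizer and should not be relied on for untrusted markup that will be rendered.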
src/pipelines/github.py ADDED
@@ -0,0 +1,194 @@
1
+ """GitHub projects pipeline — discover trending repos via OSSInsight.io API.
2
+
3
+ Two strategies:
4
+ 1. Trending repos — weekly trending filtered by AI/ML and security keywords
5
+ 2. Collection rankings — curated collections ranked by star growth
6
+ """
7
+
8
+ import logging
9
+ import time
10
+ from datetime import datetime, timedelta, timezone
11
+
12
+ import requests
13
+
14
+ from src.config import (
15
+ GITHUB_AIML_KEYWORDS,
16
+ GITHUB_SECURITY_KEYWORDS,
17
+ OSSINSIGHT_API,
18
+ OSSINSIGHT_COLLECTIONS,
19
+ OSSINSIGHT_TRENDING_LANGUAGES,
20
+ )
21
+ from src.db import create_run, finish_run, insert_github_projects
22
+
23
+ log = logging.getLogger(__name__)
24
+
25
+ _SESSION = requests.Session()
26
+ _SESSION.headers["Accept"] = "application/json"
27
+
28
+
29
+ def _safe_int(val, default=0) -> int:
30
+ """Parse an int from a value that may be empty string or None."""
31
+ if not val and val != 0:
32
+ return default
33
+ try:
34
+ return int(val)
35
+ except (ValueError, TypeError):
36
+ return default
37
+
38
+
39
+ def _safe_float(val, default=0.0) -> float:
40
+ if not val and val != 0:
41
+ return default
42
+ try:
43
+ return float(val)
44
+ except (ValueError, TypeError):
45
+ return default
46
+
47
+
48
+ def _api_get(path: str, params: dict | None = None) -> list[dict]:
49
+ """Make an OSSInsight API request and return the rows."""
50
+ url = f"{OSSINSIGHT_API}{path}"
51
+ try:
52
+ resp = _SESSION.get(url, params=params, timeout=30)
53
+ resp.raise_for_status()
54
+ data = resp.json().get("data", {})
55
+ return data.get("rows", [])
56
+ except (requests.RequestException, ValueError, KeyError) as e:
57
+ log.warning("OSSInsight API error for %s: %s", path, e)
58
+ return []
59
+
60
+
61
+ def _classify_domain(repo_name: str, description: str, collection_names: str = "") -> str | None:
62
+ """Classify a repo into aiml, security, or None based on keywords."""
63
+ text = f"{repo_name} {description} {collection_names}"
64
+ if GITHUB_SECURITY_KEYWORDS.search(text):
65
+ return "security"
66
+ if GITHUB_AIML_KEYWORDS.search(text):
67
+ return "aiml"
68
+ return None
69
+
70
+
71
+ def fetch_trending_repos() -> list[dict]:
72
+ """Fetch trending repos across configured languages for the past week."""
73
+ seen: set[str] = set()
74
+ projects: list[dict] = []
75
+
76
+ # Also fetch "All" to catch cross-language breakouts
77
+ languages = ["All"] + OSSINSIGHT_TRENDING_LANGUAGES
78
+
79
+ for lang in languages:
80
+ lang_param = lang  # requests percent-encodes query params itself; pre-escaping "C++" would double-encode it
81
+ rows = _api_get("/trends/repos", {"language": lang_param, "period": "past_week"})
82
+ log.info("Trending %s: %d repos", lang, len(rows))
83
+
84
+ for row in rows:
85
+ repo_name = row.get("repo_name", "")
86
+ if not repo_name or repo_name in seen:
87
+ continue
88
+ seen.add(repo_name)
89
+
90
+ description = row.get("description", "") or ""
91
+ collection_names = row.get("collection_names", "") or ""
92
+ domain = _classify_domain(repo_name, description, collection_names)
93
+
94
+ if domain is None:
95
+ continue
96
+
97
+ projects.append({
98
+ "repo_id": _safe_int(row.get("repo_id")),
99
+ "repo_name": repo_name,
100
+ "description": description,
101
+ "language": row.get("primary_language", "") or "",
102
+ "stars": _safe_int(row.get("stars")),
103
+ "forks": _safe_int(row.get("forks")),
104
+ "pull_requests": _safe_int(row.get("pull_requests")),
105
+ "total_score": _safe_float(row.get("total_score")),
106
+ "collection_names": collection_names,
107
+ "topics": [],
108
+ "url": f"https://github.com/{repo_name}",
109
+ "domain": domain,
110
+ })
111
+
112
+ time.sleep(0.5)
113
+
114
+ return projects
115
+
116
+
117
+ def fetch_collection_rankings() -> list[dict]:
118
+ """Fetch top repos from curated AI/ML and security collections."""
119
+ seen: set[str] = set()
120
+ projects: list[dict] = []
121
+
122
+ for cid, (cname, domain) in OSSINSIGHT_COLLECTIONS.items():
123
+ rows = _api_get(f"/collections/{cid}/ranking_by_stars", {"period": "past_28_days"})
124
+ log.info("Collection '%s' (%d): %d repos", cname, cid, len(rows))
125
+
126
+ for row in rows:
127
+ repo_name = row.get("repo_name", "")
128
+ if not repo_name or repo_name in seen:
129
+ continue
130
+ seen.add(repo_name)
131
+
132
+ growth = _safe_int(row.get("current_period_growth"))
133
+ if growth <= 0:
134
+ continue
135
+
136
+ projects.append({
137
+ "repo_id": _safe_int(row.get("repo_id")),
138
+ "repo_name": repo_name,
139
+ "description": "",
140
+ "language": "",
141
+ "stars": growth,
142
+ "forks": 0,
143
+ "pull_requests": 0,
144
+ "total_score": _safe_float(growth),
145
+ "collection_names": cname,
146
+ "topics": [],
147
+ "url": f"https://github.com/{repo_name}",
148
+ "domain": domain,
149
+ })
150
+
151
+ time.sleep(0.5)
152
+
153
+ return projects
154
+
155
+
156
+ def run_github_pipeline() -> int:
157
+ """Run the full GitHub projects pipeline. Returns run_id."""
158
+ now = datetime.now(timezone.utc)
159
+ start = (now - timedelta(days=7)).date().isoformat()
160
+ end = now.date().isoformat()
161
+
162
+ run_id = create_run("github", start, end)
163
+ log.info("GitHub pipeline started — run %d (%s to %s)", run_id, start, end)
164
+
165
+ try:
166
+ # Strategy 1: Trending repos
167
+ trending = fetch_trending_repos()
168
+ log.info("Trending repos (filtered): %d", len(trending))
169
+
170
+ # Strategy 2: Collection rankings
171
+ collections = fetch_collection_rankings()
172
+ log.info("Collection repos: %d", len(collections))
173
+
174
+ # Merge — trending takes priority (has richer data)
175
+ seen = {p["repo_name"] for p in trending}
176
+ merged = list(trending)
177
+ for p in collections:
178
+ if p["repo_name"] not in seen:
179
+ seen.add(p["repo_name"])
180
+ merged.append(p)
181
+
182
+ log.info("Total unique projects: %d", len(merged))
183
+
184
+ if merged:
185
+ insert_github_projects(merged, run_id)
186
+
187
+ finish_run(run_id, len(merged))
188
+ log.info("GitHub pipeline complete — %d projects stored", len(merged))
189
+ return run_id
190
+
191
+ except Exception:
192
+ finish_run(run_id, 0, status="failed")
193
+ log.exception("GitHub pipeline failed")
194
+ raise
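A note on the `language` query parameter used by the trending fetch in this file: values handed to an HTTP client through a `params` dict are percent-encoded by the client, so escaping them by hand risks encoding them twice. The sketch below is only an illustration; it uses the standard library's `urllib.parse.urlencode` as a stand-in for the client's encoder (an assumption, though `requests` is understood to delegate to the same function) and shows what happens to a raw and a pre-escaped value.

```python
from urllib.parse import urlencode

# Raw value: the encoder turns each "+" into %2B exactly once.
raw = urlencode({"language": "C++", "period": "past_week"})

# Pre-escaped value: the "%" signs themselves get escaped to %25,
# so the server would receive the literal text "C%2B%2B".
pre_escaped = urlencode({"language": "C%2B%2B", "period": "past_week"})

print(raw)
print(pre_escaped)
```

Passing the unescaped language name and letting the client encode it keeps the request identical for every language, with no special case needed.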
src/pipelines/security.py ADDED
@@ -0,0 +1,252 @@
1
+ """Security paper pipeline.
2
+
3
+ Fetches security papers from arXiv (cs.CR + adjacent categories),
4
+ finds code URLs, and writes to the database.
5
+ """
6
+
7
+ import logging
8
+ import re
9
+ import time
10
+ from datetime import datetime, timedelta, timezone
11
+
12
+ import arxiv
13
+ import requests
14
+
15
+ from src.config import (
16
+ ADJACENT_CATEGORIES,
17
+ GITHUB_TOKEN,
18
+ GITHUB_URL_RE,
19
+ SECURITY_EXCLUDE_RE,
20
+ SECURITY_KEYWORDS,
21
+ SECURITY_LLM_RE,
22
+ )
23
+ from src.db import create_run, finish_run, insert_papers
24
+
25
+ log = logging.getLogger(__name__)
26
+
27
+
28
+ # ---------------------------------------------------------------------------
29
+ # arXiv fetching
30
+ # ---------------------------------------------------------------------------
31
+
32
+
33
+ def fetch_arxiv_papers(start: datetime, end: datetime, max_papers: int) -> list[dict]:
34
+ """Fetch papers from arXiv: all cs.CR + security-filtered adjacent categories."""
35
+ client = arxiv.Client(page_size=500, delay_seconds=3.0, num_retries=3)
36
+ papers: dict[str, dict] = {}
37
+
38
+ # Primary: all cs.CR papers
39
+ log.info("Fetching cs.CR papers ...")
40
+ cr_query = arxiv.Search(
41
+ query="cat:cs.CR",
42
+ max_results=max_papers,
43
+ sort_by=arxiv.SortCriterion.SubmittedDate,
44
+ sort_order=arxiv.SortOrder.Descending,
45
+ )
46
+
47
+ for result in client.results(cr_query):
48
+ pub = result.published.replace(tzinfo=timezone.utc)
49
+ if pub < start:
50
+ break
51
+ if pub > end:
52
+ continue
53
+ paper = _result_to_dict(result)
54
+ papers[paper["entry_id"]] = paper
55
+
56
+ log.info("cs.CR: %d papers", len(papers))
57
+
58
+ # Adjacent categories with security keyword filter
59
+ for cat in ADJACENT_CATEGORIES:
60
+ adj_query = arxiv.Search(
61
+ query=f"cat:{cat}",
62
+ max_results=max_papers // len(ADJACENT_CATEGORIES),
63
+ sort_by=arxiv.SortCriterion.SubmittedDate,
64
+ sort_order=arxiv.SortOrder.Descending,
65
+ )
66
+ count = 0
67
+ for result in client.results(adj_query):
68
+ pub = result.published.replace(tzinfo=timezone.utc)
69
+ if pub < start:
70
+ break
71
+ if pub > end:
72
+ continue
73
+ text = f"{result.title} {result.summary}"
74
+ if SECURITY_KEYWORDS.search(text):
75
+ paper = _result_to_dict(result)
76
+ if paper["entry_id"] not in papers:
77
+ papers[paper["entry_id"]] = paper
78
+ count += 1
79
+ log.info(" %s: %d security-relevant papers", cat, count)
80
+
81
+ # Pre-filter: remove excluded topics (blockchain, surveys, etc.)
82
+ before = len(papers)
83
+ papers = {
84
+ eid: p for eid, p in papers.items()
85
+ if not SECURITY_EXCLUDE_RE.search(f"{p['title']} {p['abstract']}")
86
+ }
87
+ excluded = before - len(papers)
88
+ if excluded:
89
+ log.info("Excluded %d papers (blockchain/survey/off-topic)", excluded)
90
+
91
+ # Tag LLM-adjacent papers so the scoring prompt can apply hard caps
92
+ for p in papers.values():
93
+ text = f"{p['title']} {p['abstract']}"
94
+ p["llm_adjacent"] = bool(SECURITY_LLM_RE.search(text))
95
+
96
+ llm_count = sum(1 for p in papers.values() if p["llm_adjacent"])
97
+ if llm_count:
98
+ log.info("Tagged %d papers as LLM-adjacent", llm_count)
99
+
100
+ all_papers = list(papers.values())
101
+ log.info("Total unique papers: %d", len(all_papers))
102
+ return all_papers
103
+
104
+
105
+ def _result_to_dict(result: arxiv.Result) -> dict:
106
+ """Convert an arxiv.Result to a plain dict."""
107
+ arxiv_id = result.entry_id.split("/abs/")[-1]
108
+ base_id = re.sub(r"v\d+$", "", arxiv_id)
109
+
110
+ return {
111
+ "arxiv_id": base_id,
112
+ "entry_id": result.entry_id,
113
+ "title": result.title.replace("\n", " ").strip(),
114
+ "authors": [a.name for a in result.authors[:10]],
115
+ "abstract": result.summary.replace("\n", " ").strip(),
116
+ "published": result.published.isoformat(),
117
+ "categories": list(result.categories),
118
+ "pdf_url": result.pdf_url,
119
+ "arxiv_url": result.entry_id,
120
+ "comment": (result.comment or "").replace("\n", " ").strip(),
121
+ "source": "arxiv",
122
+ "github_repo": "",
123
+ "github_stars": None,
124
+ "hf_upvotes": 0,
125
+ "hf_models": [],
126
+ "hf_datasets": [],
127
+ "hf_spaces": [],
128
+ }
129
+
130
+
131
+ # ---------------------------------------------------------------------------
132
+ # Code URL finding
133
+ # ---------------------------------------------------------------------------
134
+
135
+
136
+ def extract_github_urls(paper: dict) -> list[str]:
137
+ """Extract GitHub URLs from abstract and comments."""
138
+ text = f"{paper['abstract']} {paper.get('comment', '')}"
139
+ return list(set(GITHUB_URL_RE.findall(text)))
140
+
141
+
142
+ def search_github_for_paper(title: str, token: str | None) -> str | None:
143
+ """Search GitHub for a repo matching the paper title."""
144
+ headers = {"Accept": "application/vnd.github.v3+json"}
145
+ if token:
146
+ headers["Authorization"] = f"token {token}"
147
+
148
+ if token:
149
+ try:
150
+ resp = requests.get("https://api.github.com/rate_limit", headers=headers, timeout=10)
151
+ if resp.ok:
152
+ remaining = resp.json().get("resources", {}).get("search", {}).get("remaining", 0)
153
+ if remaining < 5:
154
+ return None
155
+ except requests.RequestException:
156
+ pass
157
+
158
+ clean = re.sub(r"[^\w\s]", " ", title)
159
+ words = clean.split()[:8]
160
+ query = " ".join(words)
161
+
162
+ try:
163
+ resp = requests.get(
164
+ "https://api.github.com/search/repositories",
165
+ params={"q": query, "sort": "updated", "per_page": 3},
166
+ headers=headers,
167
+ timeout=10,
168
+ )
169
+ if not resp.ok:
170
+ return None
171
+ items = resp.json().get("items", [])
172
+ if items:
173
+ return items[0]["html_url"]
174
+ except requests.RequestException:
175
+ pass
176
+ return None
177
+
178
+
179
+ def find_code_urls(papers: list[dict]) -> dict[str, str | None]:
180
+ """Find code/repo URLs for each paper."""
181
+ token = GITHUB_TOKEN or None
182
+ code_urls: dict[str, str | None] = {}
183
+
184
+ for paper in papers:
185
+ urls = extract_github_urls(paper)
186
+ if urls:
187
+ code_urls[paper["entry_id"]] = urls[0]
188
+ continue
189
+
190
+ url = search_github_for_paper(paper["title"], token)
191
+ code_urls[paper["entry_id"]] = url
192
+ if not token:
193
+ time.sleep(2)
194
+
195
+ return code_urls
196
+
197
+
198
+ # ---------------------------------------------------------------------------
199
+ # Pipeline entry point
200
+ # ---------------------------------------------------------------------------
201
+
202
+
203
+ def run_security_pipeline(
204
+ start: datetime | None = None,
205
+ end: datetime | None = None,
206
+ max_papers: int = 300,
207
+ ) -> int:
208
+ """Run the full security pipeline. Returns the run ID."""
209
+ if end is None:
210
+ end = datetime.now(timezone.utc)
211
+ if start is None:
212
+ start = end - timedelta(days=7)
213
+
214
+ if start.tzinfo is None:
215
+ start = start.replace(tzinfo=timezone.utc)
216
+ if end.tzinfo is None:
217
+ end = end.replace(tzinfo=timezone.utc, hour=23, minute=59, second=59)
218
+
219
+ run_id = create_run("security", start.date().isoformat(), end.date().isoformat())
220
+ log.info("Run %d: %s to %s", run_id, start.date(), end.date())
221
+
222
+ try:
223
+ # Step 1: Fetch papers
224
+ papers = fetch_arxiv_papers(start, end, max_papers)
225
+
226
+ if not papers:
227
+ log.info("No papers found")
228
+ finish_run(run_id, 0)
229
+ return run_id
230
+
231
+ # Step 2: Find code URLs
232
+ log.info("Searching for code repositories ...")
233
+ code_urls = find_code_urls(papers)
234
+ with_code = sum(1 for v in code_urls.values() if v)
235
+ log.info("Found code for %d/%d papers", with_code, len(papers))
236
+
237
+ # Attach code URLs to papers as github_repo
238
+ for paper in papers:
239
+ url = code_urls.get(paper["entry_id"])
240
+ if url:
241
+ paper["github_repo"] = url
242
+
243
+ # Step 3: Insert into DB
244
+ insert_papers(papers, run_id, "security")
245
+ finish_run(run_id, len(papers))
246
+ log.info("Done — %d papers inserted", len(papers))
247
+ return run_id
248
+
249
+ except Exception as e:
250
+ finish_run(run_id, 0, status="failed")
251
+ log.exception("Pipeline failed")
252
+ raise
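The arXiv fetch in this file relies on results arriving newest first: entries newer than `end` are skipped, and the loop stops at the first entry older than `start`. A minimal sketch of that windowing logic follows, using plain `datetime` values instead of `arxiv.Result` objects; `filter_window` and `day` are hypothetical helper names introduced only for this illustration.

```python
from datetime import datetime, timezone


def filter_window(published_desc, start, end):
    """Keep timestamps within [start, end] from a newest-first sequence."""
    kept = []
    for pub in published_desc:
        if pub < start:
            break  # sorted descending, so everything after this is older still
        if pub > end:
            continue  # too new for the window, keep scanning
        kept.append(pub)
    return kept


def day(d):
    """Hypothetical helper: a UTC timestamp in January 2024."""
    return datetime(2024, 1, d, tzinfo=timezone.utc)


stamps = [day(10), day(8), day(5), day(3), day(1)]
window = filter_window(stamps, start=day(4), end=day(9))
print(window)
```

The early `break` is only safe because of the descending sort; with unsorted input it would silently drop in-window entries that appear after an old one.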
src/pipelines/semantic_scholar.py ADDED
@@ -0,0 +1,294 @@
1
+ """Semantic Scholar enrichment — connected papers, TL;DR, and topic extraction.
2
+
3
+ Uses the free S2 Academic Graph API. No API key is required, but keyless
4
+ requests share a rate-limited pool. With a key (x-api-key header), 1 req/sec is guaranteed.
5
+
6
+ Enrichment strategy:
7
+ 1. Batch lookup all papers → TL;DR + S2 paper ID (1 API call per 500 papers)
8
+ 2. Top N papers by score → references + recommendations (2 calls each)
9
+ 3. Topic extraction from title/abstract (local, no API)
10
+ """
11
+
12
+ import json
13
+ import logging
14
+ import re
15
+ import time
16
+
17
+ import requests
18
+
19
+ from src.db import (
20
+ clear_connections,
21
+ get_arxiv_id_map,
22
+ get_conn,
23
+ get_top_papers,
24
+ insert_connections,
25
+ update_paper_s2,
26
+ update_paper_topics,
27
+ )
28
+
29
+ log = logging.getLogger(__name__)
30
+
31
+ S2_GRAPH = "https://api.semanticscholar.org/graph/v1"
32
+ S2_RECO = "https://api.semanticscholar.org/recommendations/v1"
33
+ S2_HEADERS: dict[str, str] = {} # Add {"x-api-key": "..."} if you have one
34
+
35
+ # How many top papers get full connection enrichment
36
+ TOP_N_CONNECTIONS = 30
37
+ # Rate limit pause between requests (seconds)
38
+ RATE_LIMIT = 1.1
39
+
40
+
41
+ # ---------------------------------------------------------------------------
42
+ # Main entry point
43
+ # ---------------------------------------------------------------------------
44
+
45
+
46
+ def enrich_run(run_id: int, domain: str):
47
+ """Enrich all scored papers in a run with S2 data + topics."""
48
+ with get_conn() as conn:
49
+ rows = conn.execute(
50
+ "SELECT id, arxiv_id, title, abstract, composite FROM papers "
51
+ "WHERE run_id=? AND composite IS NOT NULL "
52
+ "ORDER BY composite DESC",
53
+ (run_id,),
54
+ ).fetchall()
55
+ papers = [dict(r) for r in rows]
56
+
57
+ if not papers:
58
+ log.info("No scored papers in run %d, skipping", run_id)
59
+ return
60
+
61
+ arxiv_map = get_arxiv_id_map(run_id)
62
+ log.info("Enriching %d papers from run %d (%s)...", len(papers), run_id, domain)
63
+
64
+ # Step 1: Batch TL;DR + S2 ID
65
+ _batch_tldr(papers)
66
+
67
+ # Step 2: Connected papers for top N
68
+ top_papers = papers[:TOP_N_CONNECTIONS]
69
+ for i, p in enumerate(top_papers):
70
+ try:
71
+ _fetch_connections(p, arxiv_map)
72
+ except Exception as e:
73
+ log.warning("Error fetching connections for %s: %s", p['arxiv_id'], e)
74
+ if (i + 1) % 10 == 0:
75
+ log.info("Connections: %d/%d", i + 1, len(top_papers))
76
+
77
+ # Step 3: Topic extraction (local, instant)
78
+ for p in papers:
79
+ topics = extract_topics(p["title"], p.get("abstract", ""), domain)
80
+ if topics:
81
+ update_paper_topics(p["id"], topics)
82
+
83
+ log.info("Done enriching run %d", run_id)
84
+
85
+
86
+ # ---------------------------------------------------------------------------
87
+ # Step 1: Batch TL;DR
88
+ # ---------------------------------------------------------------------------
89
+
90
+
91
+ def _batch_tldr(papers: list[dict]):
92
+ """Batch fetch TL;DR and S2 paper IDs."""
93
+ chunk_size = 500
94
+ for start in range(0, len(papers), chunk_size):
95
+ chunk = papers[start : start + chunk_size]
96
+ ids = [f"arXiv:{p['arxiv_id']}" for p in chunk]
97
+
98
+ try:
99
+ resp = requests.post(
100
+ f"{S2_GRAPH}/paper/batch",
101
+ params={"fields": "externalIds,tldr"},
102
+ json={"ids": ids},
103
+ headers=S2_HEADERS,
104
+ timeout=30,
105
+ )
106
+ resp.raise_for_status()
107
+ results = resp.json()
108
+ except Exception as e:
109
+ log.warning("Batch TL;DR failed: %s", e)
110
+ time.sleep(RATE_LIMIT)
111
+ continue
112
+
113
+ for paper, s2_data in zip(chunk, results):
114
+ if s2_data is None:
115
+ continue
116
+ s2_id = s2_data.get("paperId", "")
117
+ tldr_obj = s2_data.get("tldr")
118
+ tldr_text = tldr_obj.get("text", "") if tldr_obj else ""
119
+ update_paper_s2(paper["id"], s2_id, tldr_text)
120
+ paper["s2_paper_id"] = s2_id
121
+
122
+ found = sum(1 for r in results if r is not None)
123
+ log.info("Batch TL;DR: %d/%d papers found in S2", found, len(chunk))
124
+ time.sleep(RATE_LIMIT)
125
+
126
+
127
+ # ---------------------------------------------------------------------------
128
+ # Step 2: Connected papers (references + recommendations)
129
+ # ---------------------------------------------------------------------------
130
+
131
+
132
+ def _fetch_connections(paper: dict, arxiv_map: dict[str, int]):
133
+ """Fetch references and recommendations for a single paper."""
134
+ arxiv_id = paper["arxiv_id"]
135
+ paper_id = paper["id"]
136
+
137
+ # Clear old connections before re-fetching
138
+ clear_connections(paper_id)
139
+
140
+ connections: list[dict] = []
141
+
142
+ # References
143
+ time.sleep(RATE_LIMIT)
144
+ try:
145
+ resp = requests.get(
146
+ f"{S2_GRAPH}/paper/arXiv:{arxiv_id}/references",
147
+ params={"fields": "title,year,externalIds", "limit": 30},
148
+ headers=S2_HEADERS,
149
+ timeout=15,
150
+ )
151
+ if resp.ok:
152
+ for item in resp.json().get("data", []):
153
+ cited = item.get("citedPaper")
154
+ if not cited or not cited.get("title"):
155
+ continue
156
+ ext = cited.get("externalIds") or {}
157
+ c_arxiv = ext.get("ArXiv", "")
158
+ connections.append({
159
+ "paper_id": paper_id,
160
+ "connected_arxiv_id": c_arxiv,
161
+ "connected_s2_id": cited.get("paperId", ""),
162
+ "connected_title": cited.get("title", ""),
163
+ "connected_year": cited.get("year"),
164
+ "connection_type": "reference",
165
+ "in_db_paper_id": arxiv_map.get(c_arxiv),
166
+ })
167
+ except requests.RequestException as e:
168
+ log.warning("References failed for %s: %s", arxiv_id, e)
169
+
170
+ # Recommendations
171
+ time.sleep(RATE_LIMIT)
172
+ try:
173
+ resp = requests.get(
174
+ f"{S2_RECO}/papers/forpaper/arXiv:{arxiv_id}",
175
+ params={"fields": "title,year,externalIds", "limit": 15},
176
+ headers=S2_HEADERS,
177
+ timeout=15,
178
+ )
179
+ if resp.ok:
180
+ for rec in resp.json().get("recommendedPapers", []):
181
+ if not rec or not rec.get("title"):
182
+ continue
183
+ ext = rec.get("externalIds") or {}
184
+ c_arxiv = ext.get("ArXiv", "")
185
+ connections.append({
186
+ "paper_id": paper_id,
187
+ "connected_arxiv_id": c_arxiv,
188
+ "connected_s2_id": rec.get("paperId", ""),
189
+ "connected_title": rec.get("title", ""),
190
+ "connected_year": rec.get("year"),
191
+ "connection_type": "recommendation",
192
+ "in_db_paper_id": arxiv_map.get(c_arxiv),
193
+ })
194
+ except requests.RequestException as e:
195
+ log.warning("Recommendations failed for %s: %s", arxiv_id, e)
196
+
197
+ if connections:
198
+ insert_connections(connections)
199
+
200
+
201
+ # ---------------------------------------------------------------------------
202
+ # Step 3: Topic extraction (local, no API)
203
+ # ---------------------------------------------------------------------------
204
+
205
+ AIML_TOPICS = {
206
+ "Video Generation": re.compile(
207
+ r"video.generat|text.to.video|video.diffusion|video.synth|video.edit", re.I),
208
+ "Image Generation": re.compile(
209
+ r"image.generat|text.to.image|(?:stable|latent).diffusion|image.synth|image.edit", re.I),
210
+ "Language Models": re.compile(
211
+ r"language.model|(?:large|foundation).model|\bllm\b|\bgpt\b|instruction.tun|fine.tun", re.I),
212
+ "Code": re.compile(
213
+ r"code.generat|code.complet|program.synth|vibe.cod|software.engineer", re.I),
214
+ "Multimodal": re.compile(
215
+ r"multimodal|vision.language|\bvlm\b|visual.question|image.text", re.I),
216
+ "Efficiency": re.compile(
217
+ r"quantiz|distillat|pruning|efficient|scaling.law|compress|accelerat", re.I),
218
+ "Agents": re.compile(
219
+ r"\bagent\b|tool.use|function.call|planning|agentic", re.I),
220
+ "Speech / Audio": re.compile(
221
+ r"text.to.speech|\btts\b|speech|audio.generat|voice|music.generat", re.I),
222
+ "3D / Vision": re.compile(
223
+ r"\b3d\b|nerf|gaussian.splat|point.cloud|depth.estim|object.detect|segmentat", re.I),
224
+ "Retrieval / RAG": re.compile(
225
+ r"retriev|\brag\b|knowledge.(?:base|graph)|in.context.learn|embedding", re.I),
226
+ "Robotics": re.compile(
227
+ r"robot|embodied|manipulat|locomotion|navigation", re.I),
228
+ "Reasoning": re.compile(
229
+ r"reasoning|chain.of.thought|mathemat|logic|theorem", re.I),
230
+ "Training": re.compile(
231
+ r"reinforcement.learn|\brlhf\b|\bdpo\b|preference|reward.model|alignment", re.I),
232
+ "Architecture": re.compile(
233
+ r"attention.mechanism|state.space|\bmamba\b|mixture.of.expert|\bmoe\b|transformer", re.I),
234
+ "Benchmark": re.compile(
235
+ r"benchmark|evaluat|leaderboard|dataset|scaling.law", re.I),
236
+ "World Models": re.compile(
237
+ r"world.model|environment.model|predictive.model|dynamics.model", re.I),
238
+ "Optimization": re.compile(
239
+ r"optimi[zs]|gradient|convergence|learning.rate|loss.function|multi.objective|adversarial.train", re.I),
240
+ "RL": re.compile(
241
+ r"reinforcement.learn|\brl\b|reward|policy.gradient|q.learning|bandit", re.I),
242
+ }
243
+
244
+ SECURITY_TOPICS = {
245
+ "Web Security": re.compile(
246
+ r"web.(?:secur|app|vuln)|xss|injection|csrf|waf|\bbrowser.secur", re.I),
247
+ "Network": re.compile(
248
+ r"network.secur|intrusion|\bids\b|firewall|traffic|\bdns\b|\bbgp\b|\bddos\b|fingerprint|scanning|packet", re.I),
249
+ "Malware": re.compile(
250
+ r"malware|ransomware|trojan|botnet|rootkit|worm|backdoor", re.I),
251
+ "Vulnerabilities": re.compile(
252
+ r"vulnerab|\bcve\b|exploit|fuzzing|fuzz|buffer.overflow|zero.day|attack.surface|security.bench", re.I),
253
+ "Cryptography": re.compile(
254
+ r"cryptograph|encryption|decrypt|protocol|\btls\b|\bssl\b|cipher", re.I),
255
+ "Hardware": re.compile(
256
+ r"side.channel|timing.attack|spectre|meltdown|hardware|firmware|microarch|fault.inject|emfi|embedded.secur", re.I),
257
+ "Reverse Engineering": re.compile(
258
+ r"reverse.engineer|binary|decompil|obfuscat|disassembl", re.I),
259
+ "Mobile": re.compile(
260
+ r"\bandroid\b|\bios.secur|mobile.secur", re.I),
261
+ "Cloud": re.compile(
262
+ r"cloud.secur|container.secur|docker|kubernetes|serverless|devsecops", re.I),
263
+ "Authentication": re.compile(
264
+ r"authentica|identity|credential|phishing|password|oauth|passkey|webauthn", re.I),
265
+ "Privacy": re.compile(
266
+ r"privacy|anonymi|differential.privacy|data.leak|tracking|membership.inference", re.I),
267
+ "LLM Security": re.compile(
268
+ r"(?:llm|language.model).*(secur|attack|jailbreak|safety|risk|unsafe|inject|adversar)|prompt.inject|red.team|rubric.attack|preference.drift", re.I),
269
+ "Forensics": re.compile(
270
+ r"forensic|incident.response|audit|log.analy|carver|tamper|evidence", re.I),
271
+ "Blockchain": re.compile(
272
+ r"blockchain|smart.contract|solana|ethereum|memecoin|mev|defi|token|cryptocurrency", re.I),
273
+ "Supply Chain": re.compile(
274
+ r"supply.chain|dependency|package.secur|software.comp|sbom", re.I),
275
+ }
276
+
277
+
278
+ def extract_topics(title: str, abstract: str, domain: str) -> list[str]:
279
+ """Extract up to 3 topic tags from title and abstract."""
280
+ patterns = AIML_TOPICS if domain == "aiml" else SECURITY_TOPICS
281
+ abstract_head = (abstract or "")[:500]
282
+
283
+ scored: dict[str, int] = {}
284
+ for topic, pattern in patterns.items():
285
+ score = 0
286
+ if pattern.search(title):
287
+ score += 3 # Title match is strong signal
288
+ if pattern.search(abstract_head):
289
+ score += 1
290
+ if score > 0:
291
+ scored[topic] = score
292
+
293
+ ranked = sorted(scored.items(), key=lambda x: -x[1])
294
+ return [t for t, _ in ranked[:3]]
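The topic extraction at the end of this file scores each topic 3 points for a title match and 1 point for a match in the first 500 characters of the abstract, then keeps the three highest. The sketch below reproduces that ranking on a deliberately tiny pattern table; `TOPICS` and `rank_topics` are hypothetical stand-ins for illustration, not the module's real tables.

```python
import re

# Hypothetical miniature pattern table (the real module defines far more).
TOPICS = {
    "Malware": re.compile(r"malware|ransomware", re.I),
    "Privacy": re.compile(r"privacy|anonymi", re.I),
    "Network": re.compile(r"intrusion|firewall", re.I),
    "Cloud": re.compile(r"docker|kubernetes", re.I),
}


def rank_topics(title, abstract, patterns, limit=3):
    """Score topics (title hit = 3, abstract-head hit = 1) and keep the top few."""
    head = (abstract or "")[:500]
    scored = {}
    for topic, pattern in patterns.items():
        score = (3 if pattern.search(title) else 0) + (1 if pattern.search(head) else 0)
        if score:
            scored[topic] = score
    # sorted() is stable, so ties keep the table's insertion order.
    ranked = sorted(scored.items(), key=lambda kv: -kv[1])
    return [topic for topic, _ in ranked[:limit]]


tags = rank_topics(
    "Ransomware detection in Kubernetes clusters",
    "We study privacy leaks and intrusion alerts alongside ransomware behaviour.",
    TOPICS,
)
print(tags)
```

Because ties fall back to dictionary order, the order in which topics are declared acts as a quiet tie-breaker, which is worth keeping in mind when adding new patterns.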
src/preferences.py ADDED
@@ -0,0 +1,343 @@
1
+ """Preference engine — learns from user signals to personalize paper rankings.
2
+
3
+ Adds a preference_boost (max +1.5 / min -1.0) on top of stored composite scores.
4
+ Never re-scores papers. Papers with composite >= 8 are never penalized.
5
+ """
6
+
7
+ import logging
8
+ import math
9
+ import re
10
+ from collections import defaultdict
11
+ from datetime import datetime, timezone
12
+
13
+ from src.db import (
14
+ get_all_signals_with_papers,
15
+ load_preferences,
16
+ save_preferences,
17
+ get_paper_signal,
18
+ get_paper_signals_batch,
19
+ )
20
+
21
+ log = logging.getLogger(__name__)
22
+
23
+ # ---------------------------------------------------------------------------
24
+ # Signal weights
25
+ # ---------------------------------------------------------------------------
26
+
27
+ SIGNAL_WEIGHTS = {
28
+ "save": 3.0,
29
+ "upvote": 2.0,
30
+ "view": 0.5,
31
+ "downvote": -2.0,
32
+ "dismiss": -1.5,
33
+ }
34
+
35
+ HALF_LIFE_DAYS = 60.0
36
+
37
+ # Dimension weights for combining into final boost
38
+ DIMENSION_WEIGHTS = {
39
+ "topic": 0.35,
40
+ "axis": 0.25,
41
+ "keyword": 0.15,
42
+ "category": 0.15,
43
+ "author": 0.10,
44
+ }
45
+
46
+ # Scaling factors for tanh normalization (tuned per dimension)
47
+ SCALING_FACTORS = {
48
+ "topic": 5.0,
49
+ "axis": 4.0,
50
+ "keyword": 8.0,
51
+ "category": 5.0,
52
+ "author": 6.0,
53
+ }
54
+
55
+ # Stopwords for keyword extraction from titles
56
+ _STOPWORDS = frozenset(
57
+ "a an the and or but in on of for to with from by at is are was were "
58
+ "be been being have has had do does did will would shall should may might "
59
+ "can could this that these those it its we our their".split()
60
+ )
61
+
62
+ _WORD_RE = re.compile(r"[a-z]{3,}", re.IGNORECASE)
63
+
64
+
65
+ def _extract_keywords(title: str) -> list[str]:
66
+ """Extract meaningful keywords from a paper title."""
67
+ words = _WORD_RE.findall(title.lower())
68
+ return [w for w in words if w not in _STOPWORDS]
69
+
70
+
71
+ def _time_decay(created_at: str) -> float:
72
+ """Compute time decay factor: 2^(-age_days / half_life)."""
73
+ try:
74
+ signal_dt = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
75
+ except (ValueError, AttributeError):
76
+ return 0.5
77
+ now = datetime.now(timezone.utc)
78
+ age_days = max(0, (now - signal_dt.replace(tzinfo=signal_dt.tzinfo or timezone.utc)).total_seconds() / 86400)  # treat naive timestamps as UTC
79
+ return math.pow(2, -age_days / HALF_LIFE_DAYS)
80
+
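The decay factor above is a plain exponential half-life: a signal loses half its weight every `HALF_LIFE_DAYS` days. A small worked sketch, assuming the same 60-day half-life as the module constant (`decay` is a hypothetical name used only here):

```python
import math

HALF_LIFE_DAYS = 60.0  # same value as the module constant


def decay(age_days, half_life=HALF_LIFE_DAYS):
    """Return 2^(-age / half_life), treating negative ages as zero."""
    return math.pow(2, -max(0.0, age_days) / half_life)


# Weight multipliers for a signal that is fresh, 60 days old, and 120 days old.
weights = {age: decay(age) for age in (0, 60, 120)}
print(weights)

# A "save" (base weight 3.0) recorded one half-life ago.
print(3.0 * decay(60))
```

Under this scheme a two-month-old "save" counts about as much as a fresh signal of weight 1.5, which sits between a fresh "view" (0.5) and a fresh "upvote" (2.0).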
81
+
82
+ # ---------------------------------------------------------------------------
83
+ # Preference computation
84
+ # ---------------------------------------------------------------------------
85
+
86
+ def compute_preferences() -> dict[str, float]:
87
+ """Compute user preference profile from all signals.
88
+
89
+ Returns the preference dict (also saved to DB).
90
+ """
91
+ signals = get_all_signals_with_papers()
92
+ if not signals:
93
+ save_preferences({})
94
+ return {}
95
+
96
+ # Accumulate raw scores per preference key
97
+ raw: dict[str, float] = defaultdict(float)
98
+ counts: dict[str, int] = defaultdict(int)
99
+
100
+ # For axis preferences: track domain means
101
+ axis_sums: dict[str, list[float]] = defaultdict(list)
102
+
103
+ for sig in signals:
104
+ base_weight = SIGNAL_WEIGHTS.get(sig["action"], 0)
105
+ decay = _time_decay(sig["created_at"])
106
+ weight = base_weight * decay
107
+
108
+ # Topics
109
+ topics = sig.get("topics") or []
110
+ if topics:
111
+ per_topic = weight / len(topics)
112
+ for t in topics:
113
+ key = f"topic:{t}"
114
+ raw[key] += per_topic
115
+ counts[key] += 1
116
+
117
+ # Categories
118
+ categories = sig.get("categories") or []
119
+ if categories:
120
+ per_cat = weight / len(categories)
121
+ for c in categories:
122
+ key = f"category:{c}"
123
+ raw[key] += per_cat
124
+ counts[key] += 1
125
+
126
+ # Keywords from title
127
+ keywords = _extract_keywords(sig.get("title", ""))
128
+ if keywords:
129
+ per_kw = weight / len(keywords)
130
+ for kw in keywords:
131
+ key = f"keyword:{kw}"
132
+ raw[key] += per_kw
133
+ counts[key] += 1
134
+
135
+ # Authors (first 3 only)
136
+ authors = sig.get("authors") or []
137
+ if isinstance(authors, str):
138
+ authors = [authors]
139
+ for author in authors[:3]:
140
+ name = author if isinstance(author, str) else str(author)
141
+ key = f"author:{name}"
142
+ raw[key] += weight * 0.5 # reduced weight for authors
143
+ counts[key] += 1
144
+
145
+ # Axis preferences (track which axes are high on liked papers)
146
+ domain = sig.get("domain", "")
147
+ for i in range(1, 4):
148
+ axis_val = sig.get(f"score_axis_{i}")
149
+ if axis_val is not None:
150
+ axis_sums[f"{domain}:axis{i}"].append(axis_val)
151
+
152
+ # Compute axis preferences relative to domain mean
153
+ for sig in signals:
154
+ base_weight = SIGNAL_WEIGHTS.get(sig["action"], 0)
155
+ if base_weight <= 0:
156
+ continue # Only positive signals inform axis preferences
157
+ decay = _time_decay(sig["created_at"])
158
+ weight = base_weight * decay
159
+ domain = sig.get("domain", "")
160
+
161
+ for i in range(1, 4):
162
+ axis_val = sig.get(f"score_axis_{i}")
163
+ mean_key = f"{domain}:axis{i}"
164
+ if axis_val is not None and axis_sums.get(mean_key):
165
+ mean = sum(axis_sums[mean_key]) / len(axis_sums[mean_key])
166
+ deviation = axis_val - mean
167
+ key = f"axis_pref:{domain}:axis{i}"
168
+ raw[key] += deviation * weight * 0.1
169
+ counts[key] += 1
170
+
171
+ # Normalize via tanh
172
+ prefs: dict[str, tuple[float, int]] = {}
173
+ for key, value in raw.items():
174
+ prefix = key.split(":")[0]
175
+ scale = SCALING_FACTORS.get(prefix, 5.0)
176
+ normalized = math.tanh(value / scale)
177
+ # Clamp to [-1, 1]
178
+ normalized = max(-1.0, min(1.0, normalized))
179
+ prefs[key] = (round(normalized, 4), counts[key])
180
+
181
+ save_preferences(prefs)
182
+ return {k: v for k, (v, _) in prefs.items()}
183
+
184
+
185
+ # ---------------------------------------------------------------------------
186
+ # Paper boost computation
187
+ # ---------------------------------------------------------------------------
188
+
189
+ def compute_paper_boost(paper: dict, preferences: dict[str, float]) -> tuple[float, list[str]]:
190
+ """Compute preference boost for a single paper.
191
+
192
+ Returns (boost_value, list_of_reasons).
193
+ Boost is clamped to [-1.0, +1.5].
194
+ Papers with composite >= 8 are never penalized (boost >= 0).
195
+ """
196
+ if not preferences:
197
+ return 0.0, []
198
+
199
+ scores: dict[str, float] = {}
200
+ reasons: list[str] = []
201
+
202
+ # Topic match
203
+ topics = paper.get("topics") or []
204
+ if topics:
205
+ topic_scores = []
206
+ for t in topics:
207
+ key = f"topic:{t}"
208
+ if key in preferences:
209
+ topic_scores.append((t, preferences[key]))
210
+ if topic_scores:
211
+ scores["topic"] = sum(v for _, v in topic_scores) / len(topic_scores)
212
+ for name, val in sorted(topic_scores, key=lambda x: abs(x[1]), reverse=True)[:2]:
213
+ if abs(val) > 0.05:
214
+ reasons.append(f"Topic: {name} {val:+.2f}")
215
+
216
+ # Category match
217
+ categories = paper.get("categories") or []
218
+ if categories:
219
+ cat_scores = []
220
+ for c in categories:
221
+ key = f"category:{c}"
222
+ if key in preferences:
223
+ cat_scores.append((c, preferences[key]))
224
+ if cat_scores:
225
+ scores["category"] = sum(v for _, v in cat_scores) / len(cat_scores)
226
+ for name, val in sorted(cat_scores, key=lambda x: abs(x[1]), reverse=True)[:1]:
227
+ if abs(val) > 0.05:
228
+ reasons.append(f"Category: {name} {val:+.2f}")
229
+
230
+ # Keyword match
231
+ keywords = _extract_keywords(paper.get("title", ""))
232
+ if keywords:
233
+ kw_scores = []
234
+ for kw in keywords:
235
+ key = f"keyword:{kw}"
236
+ if key in preferences:
237
+ kw_scores.append((kw, preferences[key]))
238
+ if kw_scores:
239
+ scores["keyword"] = sum(v for _, v in kw_scores) / len(kw_scores)
240
+ for name, val in sorted(kw_scores, key=lambda x: abs(x[1]), reverse=True)[:1]:
241
+ if abs(val) > 0.1:
242
+ reasons.append(f"Keyword: {name} {val:+.2f}")
243
+
244
+ # Axis alignment
245
+ domain = paper.get("domain", "")
246
+ axis_scores = []
247
+ for i in range(1, 4):
248
+ key = f"axis_pref:{domain}:axis{i}"
249
+ if key in preferences:
250
+ axis_val = paper.get(f"score_axis_{i}")
251
+ if axis_val is not None:
252
+ # Higher axis value * positive preference = boost
253
+ axis_scores.append(preferences[key] * (axis_val / 10.0))
254
+ if axis_scores:
255
+ scores["axis"] = sum(axis_scores) / len(axis_scores)
256
+
257
+ # Author match
258
+ authors = paper.get("authors") or []
259
+ if isinstance(authors, str):
260
+ authors = [authors]
261
+ author_scores = []
262
+ for author in authors[:5]:
263
+ name = author if isinstance(author, str) else str(author)
264
+ key = f"author:{name}"
265
+ if key in preferences:
266
+ author_scores.append((name.split()[-1] if " " in name else name, preferences[key]))
267
+ if author_scores:
268
+ scores["author"] = max(v for _, v in author_scores) # Best author match
269
+ for name, val in sorted(author_scores, key=lambda x: abs(x[1]), reverse=True)[:1]:
270
+ if abs(val) > 0.1:
271
+ reasons.append(f"Author: {name} {val:+.2f}")
272
+
273
+ # Weighted combine
274
+ if not scores:
275
+ return 0.0, []
276
+
277
+ boost = 0.0
278
+ total_weight = 0.0
279
+ for dim, dim_score in scores.items():
280
+ w = DIMENSION_WEIGHTS.get(dim, 0.1)
281
+ boost += dim_score * w
282
+ total_weight += w
283
+
284
+ if total_weight > 0:
285
+ boost = boost / total_weight # Normalize by actual weight used
286
+
287
+ # Scale by 1.5: preferences in [-1, 1] become [-1.5, 1.5]; the clamp below trims this to [-1, 1.5]
288
+ boost = boost * 1.5
289
+
290
+ # Clamp
291
+ boost = max(-1.0, min(1.5, boost))
292
+
293
+ # Safety net: high-scoring papers never penalized
294
+ composite = paper.get("composite") or 0
295
+ if composite >= 8 and boost < 0:
296
+ boost = 0.0
297
+
298
+ return round(boost, 2), reasons
299
+
300
+
301
+ def is_discovery(paper: dict, boost: float) -> bool:
302
+ """Paper is 'discovery' if composite >= 6 AND boost <= 0."""
303
+ composite = paper.get("composite") or 0
304
+ return composite >= 6 and boost <= 0
305
+
306
+
307
+ def enrich_papers_with_preferences(
308
+ papers: list[dict],
309
+ preferences: dict[str, float] | None = None,
310
+ sort_adjusted: bool = False,
311
+ ) -> list[dict]:
312
+ """Add preference fields to each paper dict.
313
+
314
+ Adds: adjusted_score, preference_boost, boost_reasons, is_discovery, user_signal.
315
+ """
316
+ if preferences is None:
317
+ preferences = load_preferences()
318
+
319
+ # Batch fetch user signals
320
+ paper_ids = [p["id"] for p in papers if "id" in p]
321
+ signals_map = get_paper_signals_batch(paper_ids) if paper_ids else {}
322
+
323
+ has_prefs = bool(preferences)
324
+
325
+ for p in papers:
326
+ pid = p.get("id")
327
+ composite = p.get("composite") or 0
328
+
329
+ if has_prefs:
330
+ boost, reasons = compute_paper_boost(p, preferences)
331
+ else:
332
+ boost, reasons = 0.0, []
333
+
334
+ p["preference_boost"] = boost
335
+ p["adjusted_score"] = round(composite + boost, 2)
336
+ p["boost_reasons"] = reasons
337
+ p["is_discovery"] = is_discovery(p, boost) if has_prefs else False
338
+ p["user_signal"] = signals_map.get(pid)
339
+
340
+ if sort_adjusted and has_prefs:
341
+ papers.sort(key=lambda p: p.get("adjusted_score", 0), reverse=True)
342
+
343
+ return papers
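The normalization and clamping arithmetic in src/preferences.py above can be tried in isolation. The following is a minimal sketch, not the module's own code: the function names `normalize_preference` and `finalize_boost` are hypothetical, and the default scale of 5.0 is simply the fallback value passed to `SCALING_FACTORS.get` in the source.

```python
import math


def normalize_preference(raw_value: float, scale: float = 5.0) -> float:
    """Squash an accumulated raw signal weight into [-1, 1] with tanh.

    `scale` controls how quickly the value saturates; 5.0 mirrors the
    fallback used in the source (an assumption for other prefixes).
    """
    return round(math.tanh(raw_value / scale), 4)


def finalize_boost(weighted_mean: float, composite: float) -> float:
    """Turn a weighted mean of dimension scores into a final boost.

    Steps (as read from compute_paper_boost): scale by 1.5, clamp to
    [-1.0, 1.5], then never penalize papers with composite >= 8.
    """
    boost = weighted_mean * 1.5
    boost = max(-1.0, min(1.5, boost))
    if composite >= 8 and boost < 0:
        boost = 0.0
    return round(boost, 2)
```

Because the scaling is symmetric but the clamp is not, a strongly disliked paper loses at most 1.0 point while a strongly liked one can gain up to 1.5.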
src/scheduler.py ADDED
@@ -0,0 +1,66 @@
1
+ """APScheduler — weekly pipeline trigger running inside the web process."""
2
+
3
+ import logging
4
+
5
+ from apscheduler.schedulers.background import BackgroundScheduler
6
+ from apscheduler.triggers.cron import CronTrigger
7
+
8
+ log = logging.getLogger(__name__)
9
+ scheduler = BackgroundScheduler()
10
+
11
+
12
+ def weekly_run():
13
+ """Run all pipelines: aiml → security → github → events (aiml and security are also scored and reported)."""
14
+ log.info("Starting weekly run ...")
15
+
16
+ try:
17
+ from src.pipelines.aiml import run_aiml_pipeline
18
+ from src.scoring import score_run
19
+
20
+ aiml_run_id = run_aiml_pipeline()
21
+ score_run(aiml_run_id, "aiml")
22
+
23
+ from src.web.app import _generate_report
24
+ _generate_report(aiml_run_id, "aiml")
25
+ except Exception:
26
+ log.exception("AI/ML pipeline failed")
27
+
28
+ try:
29
+ from src.pipelines.security import run_security_pipeline
30
+ from src.scoring import score_run
31
+
32
+ sec_run_id = run_security_pipeline()
33
+ score_run(sec_run_id, "security")
34
+
35
+ from src.web.app import _generate_report
36
+ _generate_report(sec_run_id, "security")
37
+ except Exception:
38
+ log.exception("Security pipeline failed")
39
+
40
+ try:
41
+ from src.pipelines.github import run_github_pipeline
42
+ run_github_pipeline()
43
+ except Exception:
44
+ log.exception("GitHub pipeline failed")
45
+
46
+ try:
47
+ from src.pipelines.events import run_events_pipeline
48
+ run_events_pipeline()
49
+ except Exception:
50
+ log.exception("Events pipeline failed")
51
+
52
+ log.info("Weekly run complete")
53
+
54
+
55
+ def start_scheduler():
56
+ """Start the background scheduler with weekly job."""
57
+ # Sunday 22:00 UTC — dashboard ready Monday morning
58
+ scheduler.add_job(
59
+ weekly_run,
60
+ trigger=CronTrigger(day_of_week="sun", hour=22, minute=0),
61
+ id="weekly_run",
62
+ name="Weekly research pipeline",
63
+ replace_existing=True,
64
+ )
65
+ scheduler.start()
66
+ log.info("Started — weekly run scheduled for Sunday 22:00 UTC")
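The error-isolation pattern used by `weekly_run` above (one try/except per pipeline so a failure in one stage does not stop the rest) can be sketched with the standard library alone. This is an illustrative sketch: `run_stages` and the stage names in the usage note are hypothetical and are not part of the repository.

```python
import logging
from typing import Callable

log = logging.getLogger(__name__)


def run_stages(stages: list[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run each stage in order and return the names of the stages that failed.

    Every stage gets its own try/except, so an exception is logged with a
    traceback and the remaining stages still run.
    """
    failed: list[str] = []
    for name, stage in stages:
        try:
            stage()
        except Exception:
            log.exception("%s pipeline failed", name)
            failed.append(name)
    return failed
```

A caller might pass `[("AI/ML", run_aiml), ("Security", run_security)]`, where both callables are placeholders, and inspect the returned list to decide whether to alert.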
src/scoring.py ADDED
@@ -0,0 +1,186 @@
1
+ """Unified Claude API scoring for both AI/ML and security domains."""
2
+
3
+ import json
4
+ import logging
5
+ import re
6
+ import time
7
+
8
+ import anthropic
9
+
10
+ log = logging.getLogger(__name__)
11
+
12
+ from src.config import (
13
+ ANTHROPIC_API_KEY,
14
+ BATCH_SIZE,
15
+ CLAUDE_MODEL,
16
+ MAX_ABSTRACT_CHARS_AIML,
17
+ MAX_ABSTRACT_CHARS_SECURITY,
18
+ SCORING_CONFIGS,
19
+ SECURITY_LLM_RE,
20
+ )
21
+ from src.db import get_unscored_papers, update_paper_scores
22
+
23
+
24
+ def score_run(run_id: int, domain: str) -> int:
25
+ """Score all unscored papers in a run. Returns count of scored papers."""
26
+ if not ANTHROPIC_API_KEY:
27
+ log.warning("ANTHROPIC_API_KEY not set — skipping scoring")
28
+ return 0
29
+
30
+ config = SCORING_CONFIGS[domain]
31
+ papers = get_unscored_papers(run_id)
32
+
33
+ if not papers:
34
+ log.info("No unscored papers for run %d", run_id)
35
+ return 0
36
+
37
+ log.info("Scoring %d %s papers ...", len(papers), domain)
38
+
39
+ client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY, timeout=120.0)
40
+ max_chars = MAX_ABSTRACT_CHARS_AIML if domain == "aiml" else MAX_ABSTRACT_CHARS_SECURITY
41
+ scored_count = 0
42
+
43
+ for i in range(0, len(papers), BATCH_SIZE):
44
+ batch = papers[i : i + BATCH_SIZE]
45
+ batch_num = i // BATCH_SIZE + 1
46
+ total_batches = (len(papers) + BATCH_SIZE - 1) // BATCH_SIZE
47
+ log.info("Batch %d/%d (%d papers) ...", batch_num, total_batches, len(batch))
48
+
49
+ # Build user content
50
+ user_content = _build_batch_content(batch, domain, max_chars)
51
+
52
+ # Call Claude
53
+ scores = _call_claude(client, config["prompt"], user_content)
54
+ if not scores:
55
+ continue
56
+
57
+ # Map scores back to papers and update DB
58
+ scored_count += _apply_scores(batch, scores, domain, config)
59
+
60
+ log.info("Scored %d/%d papers", scored_count, len(papers))
61
+ return scored_count
62
+
63
+
64
+ def _build_batch_content(papers: list[dict], domain: str, max_chars: int) -> str:
65
+ """Build the user content string for a batch of papers."""
66
+ lines = []
67
+ for p in papers:
68
+ abstract = (p.get("abstract") or "")[:max_chars]
69
+ id_field = p.get("entry_id") or p.get("arxiv_url") or p.get("arxiv_id", "")
70
+
71
+ lines.append("---")
72
+
73
+ if domain == "security":
74
+ lines.append(f"entry_id: {id_field}")
75
+ else:
76
+ lines.append(f"arxiv_id: {p.get('arxiv_id', '')}")
77
+
78
+ authors_list = p.get("authors", [])
79
+ if isinstance(authors_list, str):
80
+ authors_str = authors_list
81
+ else:
82
+ authors_str = ", ".join(authors_list[:5])
83
+
84
+ cats = p.get("categories", [])
85
+ if isinstance(cats, str):
86
+ cats_str = cats
87
+ else:
88
+ cats_str = ", ".join(cats)
89
+
90
+ lines.append(f"title: {p.get('title', '')}")
91
+ lines.append(f"authors: {authors_str}")
92
+ lines.append(f"categories: {cats_str}")
93
+
94
+ code_url = p.get("github_repo") or p.get("code_url") or "none found"
95
+ lines.append(f"code_url_found: {code_url}")
96
+
97
+ if domain == "security":
98
+ if "llm_adjacent" not in p:
99
+ text = f"{p.get('title', '')} {p.get('abstract', '')}"
100
+ p["llm_adjacent"] = bool(SECURITY_LLM_RE.search(text))
101
+ lines.append(f"llm_adjacent: {str(p['llm_adjacent']).lower()}")
102
+
103
+ if domain == "aiml":
104
+ lines.append(f"hf_upvotes: {p.get('hf_upvotes', 0)}")
105
+ hf_models = p.get("hf_models", [])
106
+ if hf_models:
107
+ model_ids = [m["id"] if isinstance(m, dict) else str(m) for m in hf_models[:3]]
108
+ lines.append(f"hf_models: {', '.join(model_ids)}")
109
+ hf_spaces = p.get("hf_spaces", [])
110
+ if hf_spaces:
111
+ space_ids = [s["id"] if isinstance(s, dict) else str(s) for s in hf_spaces[:3]]
112
+ lines.append(f"hf_spaces: {', '.join(space_ids)}")
113
+ lines.append(f"source: {p.get('source', 'unknown')}")
114
+
115
+ lines.append(f"abstract: {abstract}")
116
+ lines.append(f"comment: {p.get('comment', 'N/A')}")
117
+ lines.append("")
118
+
119
+ return "\n".join(lines)
120
+
121
+
122
+ def _call_claude(client: anthropic.Anthropic, system_prompt: str, user_content: str) -> list[dict]:
123
+ """Call Claude API and extract JSON response."""
124
+ for attempt in range(3):
125
+ try:
126
+ response = client.messages.create(
127
+ model=CLAUDE_MODEL,
128
+ max_tokens=4096,
129
+ system=system_prompt,
130
+ messages=[{"role": "user", "content": user_content}],
131
+ )
132
+ text = response.content[0].text
133
+ json_match = re.search(r"\[.*\]", text, re.DOTALL)
134
+ if json_match:
135
+ return json.loads(json_match.group())
136
+ log.warning("No JSON array in response (attempt %d)", attempt + 1)
137
+ except (anthropic.APIError, json.JSONDecodeError) as e:
138
+ log.error("Scoring API error (attempt %d): %s", attempt + 1, e)
139
+ if attempt < 2:
140
+ time.sleep(2 ** (attempt + 1))
141
+ else:
142
+ log.error("Skipping batch after 3 failures")
143
+ return []
144
+
145
+
146
+ def _apply_scores(papers: list[dict], scores: list[dict], domain: str, config: dict) -> int:
147
+ """Apply scores from Claude response to papers in DB. Returns count applied."""
148
+ axes = config["axes"]
149
+ weights = config["weights"]
150
+ weight_values = list(weights.values())
151
+
152
+ # Build lookup by ID
153
+ if domain == "security":
154
+ score_map = {s.get("entry_id", ""): s for s in scores}
155
+ else:
156
+ score_map = {s.get("arxiv_id", ""): s for s in scores}
157
+
158
+ applied = 0
159
+ for paper in papers:
160
+ if domain == "security":
161
+ key = paper.get("entry_id") or paper.get("arxiv_url") or paper.get("arxiv_id", "")
162
+ else:
163
+ key = paper.get("arxiv_id", "")
164
+
165
+ score = score_map.get(key)
166
+ if not score:
167
+ continue
168
+
169
+ # Extract axis scores
170
+ axis_scores = [score.get(ax, 0) for ax in axes]
171
+
172
+ # Compute composite
173
+ composite = sum(s * w for s, w in zip(axis_scores, weight_values))
174
+
175
+ update_paper_scores(paper["id"], {
176
+ "score_axis_1": axis_scores[0] if len(axis_scores) > 0 else None,
177
+ "score_axis_2": axis_scores[1] if len(axis_scores) > 1 else None,
178
+ "score_axis_3": axis_scores[2] if len(axis_scores) > 2 else None,
179
+ "composite": round(composite, 2),
180
+ "summary": score.get("summary", ""),
181
+ "reasoning": score.get("reasoning", ""),
182
+ "code_url": score.get("code_url"),
183
+ })
184
+ applied += 1
185
+
186
+ return applied
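Two steps in src/scoring.py above lend themselves to a small standalone sketch: pulling a JSON array out of free-form model output, and computing the weighted composite. The version below is an assumption-laden illustration, not the module's code: `extract_json_array` and `composite_score` are hypothetical names, and the composite pairs weights with scores by axis name instead of by position, which avoids depending on dictionary ordering.

```python
import json
import re


def extract_json_array(text: str) -> list:
    """Return the first-to-last bracketed JSON array found in `text`.

    Returns an empty list when no array is present or it fails to parse.
    """
    match = re.search(r"\[.*\]", text, re.DOTALL)
    if not match:
        return []
    try:
        parsed = json.loads(match.group())
    except json.JSONDecodeError:
        return []
    return parsed if isinstance(parsed, list) else []


def composite_score(score: dict, weights: dict[str, float]) -> float:
    """Weighted sum of axis scores, looked up by axis name.

    Missing or non-numeric axis values count as 0 (an assumption made
    here for robustness; the source defaults missing axes to 0 as well).
    """
    total = 0.0
    for axis, weight in weights.items():
        value = score.get(axis)
        if isinstance(value, (int, float)):
            total += float(value) * weight
    return round(total, 2)
```

The axis names used when calling `composite_score` (for example `"novelty"`) are placeholders; the real names come from `SCORING_CONFIGS`, which is not shown in this part of the diff.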
src/web/__init__.py ADDED
File without changes
src/web/app.py ADDED
@@ -0,0 +1,983 @@
1
+ """FastAPI web application — Research Intelligence Dashboard."""
2
+
3
+ import json
4
+ import logging
5
+ import os
6
+ import threading
7
+ from collections import defaultdict
8
+ from datetime import datetime, timezone
9
+ from pathlib import Path
10
+
11
+ from fastapi import FastAPI, Request
12
+ from fastapi.responses import HTMLResponse, JSONResponse, RedirectResponse
13
+ from fastapi.staticfiles import StaticFiles
14
+ from fastapi.templating import Jinja2Templates
15
+
16
+ log = logging.getLogger(__name__)
17
+
18
+ from starlette.middleware.base import BaseHTTPMiddleware
19
+
20
+ from src.config import SCORING_CONFIGS
21
+ from src.db import (
22
+ clear_preferences,
23
+ count_events,
24
+ count_github_projects,
25
+ count_papers,
26
+ delete_signal,
27
+ get_all_runs,
28
+ get_available_topics,
29
+ get_events,
30
+ get_github_languages,
31
+ get_github_projects_page,
32
+ get_latest_run,
33
+ get_paper,
34
+ get_paper_connections,
35
+ get_paper_signal,
36
+ get_papers_page,
37
+ get_preferences_detail,
38
+ get_preferences_updated_at,
39
+ get_signal_counts,
40
+ get_top_github_projects,
41
+ get_top_papers,
42
+ init_db,
43
+ insert_signal,
44
+ load_preferences,
45
+ )
46
+ from src.preferences import compute_preferences, enrich_papers_with_preferences
47
+
48
+ app = FastAPI(title="Research Intelligence")
49
+
50
+ # Static files & templates
51
+ STATIC_DIR = Path(__file__).parent / "static"
52
+ TEMPLATE_DIR = Path(__file__).parent / "templates"
53
+ DATA_DIR = Path("data")
54
+
55
+ app.mount("/static", StaticFiles(directory=str(STATIC_DIR)), name="static")
56
+ templates = Jinja2Templates(directory=str(TEMPLATE_DIR))
57
+
58
+
59
+ # ---------------------------------------------------------------------------
60
+ # First-run redirect middleware
61
+ # ---------------------------------------------------------------------------
62
+
63
+
64
+ class FirstRunMiddleware(BaseHTTPMiddleware):
65
+ """Redirect all non-setup requests to /setup when config.yaml is missing."""
66
+
67
+ _ALLOWED_PREFIXES = ("/setup", "/static", "/api/setup", "/sw.js")
68
+
69
+ async def dispatch(self, request: Request, call_next):
70
+ from src.config import FIRST_RUN
71
+ if FIRST_RUN:
72
+ path = request.url.path
73
+ if not any(path.startswith(p) for p in self._ALLOWED_PREFIXES):
74
+ return RedirectResponse("/setup", status_code=302)
75
+ return await call_next(request)
76
+
77
+
78
+ app.add_middleware(FirstRunMiddleware)
79
+
80
+
81
+ @app.get("/sw.js")
82
+ async def service_worker():
83
+ """Serve SW from root scope for PWA."""
84
+ from fastapi.responses import FileResponse
85
+ return FileResponse(
86
+ STATIC_DIR / "sw.js",
87
+ media_type="application/javascript",
88
+ headers={"Service-Worker-Allowed": "/"},
89
+ )
90
+
91
+
92
+ def score_bar(value, max_val=10):
93
+ """Render a visual score bar."""
94
+ if value is None or max_val == 0:
95
+ return "░" * 10
96
+ filled = round(float(value) * 10 / max_val)
97
+ filled = max(0, min(10, filled))
98
+ return "█" * filled + "░" * (10 - filled)
99
+
100
+
101
+ def format_date(value, fmt="short"):
102
+ """Format dates from various input formats (ISO, RFC 2822, etc.)."""
103
+ if not value:
104
+ return ""
105
+ from email.utils import parsedate_to_datetime
106
+ dt = None
107
+ # Try ISO format first
108
+ for pattern in ("%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d"):
109
+ try:
110
+ dt = datetime.strptime(value[:26], pattern)
111
+ break
112
+ except (ValueError, TypeError):
113
+ continue
114
+ # Try RFC 2822 (RSS dates like "Wed, 18 Feb 2026 21:00:00 GMT")
115
+ if dt is None:
116
+ try:
117
+ dt = parsedate_to_datetime(value)
118
+ except (ValueError, TypeError):
119
+ return value[:10] if len(value) >= 10 else value
120
+ if fmt == "short":
121
+ return dt.strftime("%Y-%m-%d")
122
+ elif fmt == "medium":
123
+ return dt.strftime("%b %d, %Y")
124
+ elif fmt == "long":
125
+ return dt.strftime("%a, %b %d %Y")
126
+ return dt.strftime("%Y-%m-%d")
127
+
128
+
129
+ def abbreviate_label(label):
130
+ """Abbreviate axis labels for table headers."""
131
+ abbrevs = {
132
+ "Code & Weights": "Code/Wt",
133
+ "Novelty": "Novel",
134
+ "Practical Applicability": "Practical",
135
+ "Has Code/PoC": "Code/PoC",
136
+ "Novel Attack Surface": "Attack",
137
+ "Real-World Impact": "Impact",
138
+ }
139
+ return abbrevs.get(label, label[:10])
140
+
141
+
142
+ # Register as Jinja2 globals/filters
143
+ templates.env.globals["score_bar"] = score_bar
144
+ templates.env.globals["abbreviate_label"] = abbreviate_label
145
+ templates.env.filters["format_date"] = format_date
146
+
147
+
148
+ @app.on_event("startup")
149
+ def startup():
150
+ from src.config import validate_env
151
+ validate_env()
152
+ init_db()
153
+ from src.scheduler import start_scheduler
154
+ start_scheduler()
155
+ log.info("Research Intelligence started")
156
+
157
+
158
+ @app.on_event("shutdown")
159
+ def shutdown():
160
+ from src.scheduler import scheduler
161
+ scheduler.shutdown(wait=False)
162
+ # Wait for running pipeline threads (up to 30s each)
163
+ for t in _pipeline_threads:
164
+ if t.is_alive():
165
+ log.info("Waiting for %s to finish ...", t.name)
166
+ t.join(timeout=30)
167
+ log.info("Research Intelligence stopped")
168
+
169
+
170
+ # ---------------------------------------------------------------------------
171
+ # Dashboard
172
+ # ---------------------------------------------------------------------------
173
+
174
+
175
+ @app.get("/", response_class=HTMLResponse)
176
+ async def dashboard(request: Request):
177
+ now = datetime.now(timezone.utc)
178
+ week_label = now.strftime("%b %d, %Y")
179
+
180
+ aiml_top = get_top_papers("aiml", limit=5)
181
+ security_top = get_top_papers("security", limit=5)
182
+
183
+ # Enrich dashboard cards with preference data
184
+ preferences = load_preferences()
185
+ if preferences:
186
+ aiml_top = enrich_papers_with_preferences(aiml_top, preferences)
187
+ security_top = enrich_papers_with_preferences(security_top, preferences)
188
+
189
+ aiml_run = get_latest_run("aiml")
190
+ security_run = get_latest_run("security")
191
+
192
+ last_run = None
193
+ for r in [aiml_run, security_run]:
194
+ if r and r.get("finished_at"):
195
+ ts = r["finished_at"][:16]
196
+ if last_run is None or ts > last_run:
197
+ last_run = ts
198
+
199
+ events = get_events(limit=50)
200
+ today = now.strftime("%Y-%m-%d")
201
+ # Deduplicate + filter past conference deadlines
202
+ events_grouped = defaultdict(list)
203
+ seen: dict[str, set] = defaultdict(set)
204
+ for e in events:
205
+ cat = e.get("category", "other")
206
+ title = e.get("title", "")
207
+ if title in seen[cat]:
208
+ continue
209
+ # Skip past conference deadlines
210
+ if cat == "conference" and (e.get("event_date") or "") < today:
211
+ continue
212
+ seen[cat].add(title)
213
+ events_grouped[cat].append(e)
214
+
215
+ with _pipeline_lock:
216
+ running = list(_running_pipelines)
217
+
218
+ # Show seed banner if few signals exist
219
+ signal_counts = get_signal_counts()
220
+ total_signals = sum(v for k, v in signal_counts.items() if k != "view")
221
+ show_seed_banner = total_signals < 5
222
+
223
+ return templates.TemplateResponse("dashboard.html", {
224
+ "request": request,
225
+ "active": "dashboard",
226
+ "week_label": week_label,
227
+ "aiml_count": count_papers("aiml", scored_only=True),
228
+ "security_count": count_papers("security", scored_only=True),
229
+ "github_count": count_github_projects(),
230
+ "event_count": count_events(),
231
+ "last_run": last_run,
232
+ "aiml_top": aiml_top,
233
+ "security_top": security_top,
234
+ "events": events,
235
+ "events_grouped": dict(events_grouped),
236
+ "running_pipelines": running,
237
+ "show_seed_banner": show_seed_banner,
238
+ })
239
+
240
+
241
+ # ---------------------------------------------------------------------------
242
+ # Papers list
243
+ # ---------------------------------------------------------------------------
244
+
245
+
246
+ @app.get("/papers/{domain}", response_class=HTMLResponse)
247
+ async def papers_list(
248
+ request: Request,
249
+ domain: str,
250
+ offset: int = 0,
251
+ limit: int = 50,
252
+ search: str | None = None,
253
+ min_score: float | None = None,
254
+ has_code: bool = False,
255
+ topic: str | None = None,
256
+ sort: str | None = None,
257
+ ):
258
+ if domain not in ("aiml", "security"):
259
+ return RedirectResponse("/")
260
+
261
+ config = SCORING_CONFIGS[domain]
262
+ run = get_latest_run(domain) or {}
263
+
264
+ # Load preferences to determine if personalized sort is available
265
+ preferences = load_preferences()
266
+ has_preferences = bool(preferences)
267
+
268
+ # Fall back to plain score sort when personalized sort is requested but no preferences exist
269
+ effective_sort = sort
270
+ if sort == "adjusted" and not has_preferences:
271
+ effective_sort = "score"
272
+
273
+ papers, total = get_papers_page(
274
+ domain, run_id=run.get("id"),
275
+ offset=offset, limit=limit,
276
+ min_score=min_score,
277
+ has_code=has_code if has_code else None,
278
+ search=search,
279
+ topic=topic,
280
+ sort=effective_sort if effective_sort != "adjusted" else "score",
281
+ )
282
+
283
+ # Enrich with preferences
284
+ sort_adjusted = (sort == "adjusted") and has_preferences
285
+ papers = enrich_papers_with_preferences(papers, preferences, sort_adjusted=sort_adjusted)
286
+
287
+ # Get available topics for the filter dropdown
288
+ available_topics = get_available_topics(domain, run.get("id", 0)) if run else []
289
+
290
+ domain_label = "AI/ML" if domain == "aiml" else "Security"
291
+
292
+ context = {
293
+ "request": request,
294
+ "active": domain,
295
+ "domain": domain,
296
+ "domain_label": domain_label,
297
+ "papers": papers,
298
+ "total": total,
299
+ "offset": offset,
300
+ "limit": limit,
301
+ "search": search,
302
+ "min_score": min_score,
303
+ "has_code": has_code,
304
+ "topic": topic,
305
+ "sort": sort,
306
+ "available_topics": available_topics,
307
+ "run": run,
308
+ "axis_labels": config["axis_labels"],
309
+ "has_preferences": has_preferences,
310
+ }
311
+
312
+ # Return partial for HTMX requests (filter / pagination)
313
+ if request.headers.get("HX-Request"):
314
+ return templates.TemplateResponse("partials/papers_results.html", context)
315
+
316
+ return templates.TemplateResponse("papers.html", context)
317
+
318
+
319
+ # ---------------------------------------------------------------------------
320
+ # Paper detail
321
+ # ---------------------------------------------------------------------------
322
+
323
+
324
+ @app.get("/papers/{domain}/{paper_id}", response_class=HTMLResponse)
325
+ async def paper_detail(request: Request, domain: str, paper_id: int):
326
+ paper = get_paper(paper_id)
327
+ if not paper:
328
+ return RedirectResponse(f"/papers/{domain}")
329
+
330
+ config = SCORING_CONFIGS.get(domain, SCORING_CONFIGS["aiml"])
331
+ domain_label = "AI/ML" if domain == "aiml" else "Security"
332
+
333
+ connections = get_paper_connections(paper_id)
334
+
335
+ # Record view signal (deduped by 5-min window)
336
+ insert_signal(paper_id, "view")
337
+
338
+ # Preference boost info
339
+ preferences = load_preferences()
340
+ papers_enriched = enrich_papers_with_preferences([paper], preferences)
341
+ paper = papers_enriched[0]
342
+
343
+ return templates.TemplateResponse("paper_detail.html", {
344
+ "request": request,
345
+ "active": domain,
346
+ "domain": domain,
347
+ "domain_label": domain_label,
348
+ "paper": paper,
349
+ "axis_labels": config["axis_labels"],
350
+ "score_bar": score_bar,
351
+ "connections": connections,
352
+ })
353
+
354
+
355
+ # ---------------------------------------------------------------------------
356
+ # Events
357
+ # ---------------------------------------------------------------------------
358
+
359
+
360
+ @app.get("/events", response_class=HTMLResponse)
361
+ async def events_page(request: Request):
362
+ deadlines_raw = get_events(category="conference", limit=50)
363
+ releases = get_events(category="release", limit=20)
364
+ news_raw = get_events(category="news", limit=40)
365
+
366
+ # Filter out past deadlines
367
+ today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
368
+ deadlines = [d for d in deadlines_raw if (d.get("event_date") or "") >= today]
369
+
370
+ # Deduplicate news by title and sort by date (RFC 2822 dates don't sort lexicographically)
371
+ from email.utils import parsedate_to_datetime as _parse_rfc
372
+ seen_titles: set[str] = set()
373
+ news: list[dict] = []
374
+ for n in news_raw:
375
+ t = n.get("title", "")
376
+ if t not in seen_titles:
377
+ seen_titles.add(t)
378
+ news.append(n)
379
+
380
+ def _news_sort_key(item):
381
+ d = item.get("event_date", "")
382
+ try:
383
+ return _parse_rfc(d)
384
+ except (ValueError, TypeError):
385
+ try:
386
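+ return datetime.fromisoformat(d[:19]).replace(tzinfo=timezone.utc)  # keep keys tz-aware so they compare with RFC 2822 dates
387
+ except (ValueError, TypeError):
388
+ return datetime.min.replace(tzinfo=timezone.utc)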
389
+
390
+ news.sort(key=_news_sort_key, reverse=True)
391
+ news = news[:20]
392
+
393
+ return templates.TemplateResponse("events.html", {
394
+ "request": request,
395
+ "active": "events",
396
+ "total": count_events(),
397
+ "deadlines": deadlines,
398
+ "releases": releases,
399
+ "news": news,
400
+ })
401
+
402
+
403
+ # ---------------------------------------------------------------------------
404
+ # GitHub Projects
405
+ # ---------------------------------------------------------------------------
406
+
407
+
408
+ @app.get("/github", response_class=HTMLResponse)
409
+ async def github_page(
410
+ request: Request,
411
+ offset: int = 0,
412
+ limit: int = 50,
413
+ search: str | None = None,
414
+ language: str | None = None,
415
+ domain: str | None = None,
416
+ sort: str | None = None,
417
+ ):
418
+ run = get_latest_run("github") or {}
419
+
420
+ projects, total = get_github_projects_page(
421
+ run_id=run.get("id"),
422
+ offset=offset,
423
+ limit=limit,
424
+ search=search,
425
+ language=language,
426
+ domain=domain,
427
+ sort=sort,
428
+ )
429
+
430
+ available_languages = get_github_languages(run["id"]) if run else []
431
+
432
+ context = {
433
+ "request": request,
434
+ "active": "github",
435
+ "projects": projects,
436
+ "total": total,
437
+ "offset": offset,
438
+ "limit": limit,
439
+ "search": search,
440
+ "language": language,
441
+ "domain_filter": domain,
442
+ "sort": sort,
443
+ "available_languages": available_languages,
444
+ "run": run,
445
+ }
446
+
447
+ if request.headers.get("HX-Request"):
448
+ return templates.TemplateResponse("partials/github_results.html", context)
449
+
450
+ return templates.TemplateResponse("github.html", context)
451
+
452
+
453
+ # ---------------------------------------------------------------------------
454
+ # Archive
455
+ # ---------------------------------------------------------------------------
456
+
457
+
458
+ @app.get("/weeks", response_class=HTMLResponse)
459
+ async def weeks_page(request: Request):
460
+ weeks_dir = DATA_DIR / "weeks"
461
+ archives = []
462
+ if weeks_dir.exists():
463
+ for f in sorted(weeks_dir.glob("*.md"), reverse=True):
464
+ parts = f.stem.rsplit("-", 1)
465
+ domain = parts[-1] if len(parts) > 1 and parts[-1] in ("aiml", "security") else "unknown"
466
+ date = parts[0] if len(parts) > 1 else f.stem
467
+ archives.append({"filename": f.name, "date": date, "domain": domain})
468
+
469
+ runs = get_all_runs(limit=20)
470
+
471
+ return templates.TemplateResponse("weeks.html", {
472
+ "request": request,
473
+ "active": "weeks",
474
+ "archives": archives,
475
+ "runs": runs,
476
+ })
477
+
478
+
479
+ @app.get("/weeks/{filename}", response_class=HTMLResponse)
480
+ async def weeks_file(filename: str):
481
+ import html as html_mod
482
+ filepath = (DATA_DIR / "weeks" / filename).resolve()
483
+ weeks_root = (DATA_DIR / "weeks").resolve()
484
+ if not filepath.is_relative_to(weeks_root) or not filepath.is_file() or filepath.suffix != ".md":
485
+ return RedirectResponse("/weeks")
486
+ content = html_mod.escape(filepath.read_text())
487
+ safe_name = html_mod.escape(filename)
488
+ page = f"""<!DOCTYPE html><html><head><title>{safe_name}</title>
489
+ <link rel="stylesheet" href="/static/style.css">
490
+ <style>body {{ padding: 2rem; max-width: 900px; margin: 0 auto; }}
491
+ pre, code {{ font-family: var(--font-mono); }} table {{ border-collapse: collapse; width: 100%; }}
492
+ th, td {{ border: 1px solid var(--border); padding: 0.5rem; text-align: left; }}</style>
493
+ </head><body><a href="/weeks">&larr; Back to archive</a>
494
+ <pre style="white-space:pre-wrap; line-height:1.7">{content}</pre></body></html>"""
495
+ return HTMLResponse(content=page)
496
+
497
+
498
+ # ---------------------------------------------------------------------------
499
+ # Pipeline triggers
500
+ # ---------------------------------------------------------------------------
501
+
502
+
503
+ _running_pipelines: set[str] = set()
504
+ _pipeline_lock = threading.Lock()
505
+ _pipeline_threads: list[threading.Thread] = []
506
+
507
+
508
+ def _enrich_s2(run_id: int, domain: str):
509
+ """Run S2 enrichment (best-effort, failures don't break pipeline)."""
510
+ try:
511
+ from src.pipelines.semantic_scholar import enrich_run
512
+ enrich_run(run_id, domain)
513
+ except Exception as e:
514
+ log.warning("S2 enrichment for %s run %d failed: %s", domain, run_id, e)
515
+
516
+
517
+ def _run_pipeline_bg(domain: str):
518
+ """Run a pipeline in a background thread."""
519
+ try:
520
+ if domain == "aiml":
521
+ from src.pipelines.aiml import run_aiml_pipeline
522
+ from src.scoring import score_run
523
+ run_id = run_aiml_pipeline()
524
+ score_run(run_id, "aiml")
525
+ _enrich_s2(run_id, "aiml")
526
+ _generate_report(run_id, "aiml")
527
+ elif domain == "security":
528
+ from src.pipelines.security import run_security_pipeline
529
+ from src.scoring import score_run
530
+ run_id = run_security_pipeline()
531
+ score_run(run_id, "security")
532
+ _enrich_s2(run_id, "security")
533
+ _generate_report(run_id, "security")
534
+ elif domain == "github":
535
+ from src.pipelines.github import run_github_pipeline
536
+ run_github_pipeline()
537
+ elif domain == "events":
538
+ from src.pipelines.events import run_events_pipeline
539
+ run_events_pipeline()
540
+ except Exception:
541
+ log.exception("Background pipeline %s failed", domain)
542
+ finally:
543
+ with _pipeline_lock:
544
+ _running_pipelines.discard(domain)
545
+
546
+
547
+ @app.post("/run/{domain}")
548
+ async def trigger_run(domain: str):
549
+ if domain not in ("aiml", "security", "github", "events"):
550
+ return RedirectResponse("/", status_code=303)
551
+ with _pipeline_lock:
552
+ if domain in _running_pipelines:
553
+ return RedirectResponse("/", status_code=303)
554
+ _running_pipelines.add(domain)
555
+ thread = threading.Thread(target=_run_pipeline_bg, args=(domain,), name=f"pipeline-{domain}")
556
+ thread.start()
557
+ _pipeline_threads.append(thread)
558
+ return RedirectResponse("/", status_code=303)
559
+
560
+
561
+ # ---------------------------------------------------------------------------
562
+ # API status
563
+ # ---------------------------------------------------------------------------
564
+
565
+
566
+ @app.get("/api/status")
567
+ async def api_status():
568
+ aiml_run = get_latest_run("aiml")
569
+ security_run = get_latest_run("security")
570
+ github_run = get_latest_run("github")
571
+ with _pipeline_lock:
572
+ running = list(_running_pipelines)
573
+ return {
574
+ "aiml": aiml_run,
575
+ "security": security_run,
576
+ "github": github_run,
577
+ "github_count": count_github_projects(),
578
+ "events_count": count_events(),
579
+ "running_pipelines": running,
580
+ }
581
+
582
+
583
+ # ---------------------------------------------------------------------------
584
+ # Preference signals
585
+ # ---------------------------------------------------------------------------
586
+
587
+
588
+ def _maybe_recompute_preferences():
589
+ """Recompute preferences in the background if they are missing, unparsable, or stale (>1 hour since last update)."""
590
+ updated_at = get_preferences_updated_at()
591
+ if updated_at:
592
+ try:
593
+ last = datetime.fromisoformat(updated_at.replace("Z", "+00:00"))
594
+ age_hours = (datetime.now(timezone.utc) - last).total_seconds() / 3600
595
+ if age_hours < 1:
596
+ return
597
+ except (ValueError, AttributeError):
598
+ pass
599
+ # Recompute in background thread
600
+ thread = threading.Thread(target=compute_preferences, name="pref-recompute")
601
+ thread.start()
602
+
603
+
604
+ @app.post("/api/signal/{paper_id}/{action}", response_class=HTMLResponse)
605
+ async def record_signal(request: Request, paper_id: int, action: str):
606
+ """Record a user signal. Returns HTMX partial with updated button state."""
607
+ if action not in ("save", "upvote", "downvote", "dismiss"):
608
+ return HTMLResponse("Invalid action", status_code=400)
609
+
610
+ paper = get_paper(paper_id)
611
+ if not paper:
612
+ return HTMLResponse("Paper not found", status_code=404)
613
+
614
+ # Toggle: if same signal exists, remove it
615
+ current = get_paper_signal(paper_id)
616
+ if current == action:
617
+ delete_signal(paper_id, action)
618
+ _maybe_recompute_preferences()
619
+ return templates.TemplateResponse("partials/signal_buttons.html", {
620
+ "request": request,
621
+ "paper_id": paper_id,
622
+ "user_signal": None,
623
+ })
624
+
625
+ # Remove conflicting signals (e.g., remove upvote if downvoting)
626
+ for conflicting in ("upvote", "downvote", "dismiss"):
627
+ if conflicting != action:
628
+ delete_signal(paper_id, conflicting)
629
+
630
+ insert_signal(paper_id, action)
631
+ _maybe_recompute_preferences()
632
+
633
+ return templates.TemplateResponse("partials/signal_buttons.html", {
634
+ "request": request,
635
+ "paper_id": paper_id,
636
+ "user_signal": action,
637
+ })
638
+
639
+
640
+ @app.get("/api/preferences")
641
+ async def api_preferences():
642
+ """Return preference profile as JSON."""
643
+ prefs = load_preferences()
644
+ counts = get_signal_counts()
645
+ return {"preferences": prefs, "signal_counts": counts}
646
+
647
+
648
+ @app.post("/api/preferences/recompute")
649
+ async def api_recompute_preferences():
650
+ """Force recompute preferences."""
651
+ prefs = compute_preferences()
652
+ return {"status": "ok", "preference_count": len(prefs)}
653
+
654
+
655
+ @app.post("/api/preferences/reset")
656
+ async def api_reset_preferences():
657
+ """Clear all signals and preferences."""
658
+ clear_preferences()
659
+ return {"status": "ok"}
660
+
661
+
662
+ @app.get("/preferences", response_class=HTMLResponse)
663
+ async def preferences_page(request: Request):
664
+ """User preferences dashboard."""
665
+ prefs_detail = get_preferences_detail()
666
+ counts = get_signal_counts()
667
+ updated_at = get_preferences_updated_at()
668
+
669
+ # Group preferences by type
670
+ grouped: dict[str, list[dict]] = defaultdict(list)
671
+ for p in prefs_detail:
672
+ prefix = p["pref_key"].split(":")[0]
673
+ name = p["pref_key"].split(":", 1)[1] if ":" in p["pref_key"] else p["pref_key"]
674
+ grouped[prefix].append({
675
+ "name": name,
676
+ "value": p["pref_value"],
677
+ "count": p["signal_count"],
678
+ })
679
+
680
+ return templates.TemplateResponse("preferences.html", {
681
+ "request": request,
682
+ "active": "preferences",
683
+ "grouped": dict(grouped),
684
+ "signal_counts": counts,
685
+ "updated_at": updated_at,
686
+ "total_prefs": len(prefs_detail),
687
+ })
688
+
689
+
690
+ # ---------------------------------------------------------------------------
691
+ # S2 enrichment trigger
692
+ # ---------------------------------------------------------------------------
693
+
694
+
695
+ @app.post("/run/enrich/{domain}")
696
+ async def trigger_enrich(domain: str):
697
+ """Trigger Semantic Scholar enrichment for the latest run."""
698
+ if domain not in ("aiml", "security"):
699
+ return RedirectResponse("/", status_code=303)
700
+
701
+ run = get_latest_run(domain)
702
+ if not run:
703
+ return RedirectResponse(f"/papers/{domain}", status_code=303)
704
+
705
+ with _pipeline_lock:
706
+ key = f"enrich-{domain}"
707
+ if key in _running_pipelines:
708
+ return RedirectResponse(f"/papers/{domain}", status_code=303)
709
+ _running_pipelines.add(key)
710
+
711
+ def _run():
712
+ try:
713
+ from src.pipelines.semantic_scholar import enrich_run
714
+ enrich_run(run["id"], domain)
715
+ except Exception as e:
716
+ log.warning("S2 enrichment for %s failed: %s", domain, e)
717
+ finally:
718
+ with _pipeline_lock:
719
+ _running_pipelines.discard(key)
720
+
721
+ thread = threading.Thread(target=_run, name=f"enrich-{domain}")
722
+ thread.start()
723
+ return RedirectResponse(f"/papers/{domain}", status_code=303)
724
+
725
+
726
+ # ---------------------------------------------------------------------------
727
+ # Setup wizard
728
+ # ---------------------------------------------------------------------------
729
+
730
+
731
+ @app.get("/setup", response_class=HTMLResponse)
732
+ async def setup_page(request: Request):
733
+ """First-time setup wizard."""
734
+ return templates.TemplateResponse("setup.html", {"request": request})
735
+
736
+
737
+ @app.post("/api/setup/validate-key")
738
+ async def validate_api_key(request: Request):
739
+ """Validate an Anthropic API key with a test call."""
740
+ try:
741
+ body = await request.json()
742
+ key = body.get("api_key", "").strip()
743
+ if not key:
744
+ return JSONResponse({"valid": False, "error": "No key provided"})
745
+
746
+ import anthropic
747
+ client = anthropic.AsyncAnthropic(api_key=key, timeout=15.0)  # async client: avoids blocking the event loop
748
+ await client.messages.create(
749
+ model="claude-haiku-4-5-20251001",
750
+ max_tokens=10,
751
+ messages=[{"role": "user", "content": "Hi"}],
752
+ )
753
+ return JSONResponse({"valid": True})
754
+ except Exception as e:
755
+ return JSONResponse({"valid": False, "error": str(e)[:100]})
756
+
757
+
758
+ @app.post("/api/setup/save")
759
+ async def save_setup(request: Request):
760
+ """Save setup wizard config to config.yaml and .env."""
761
+ try:
762
+ body = await request.json()
763
+ api_key = body.get("api_key", "").strip()
764
+
765
+ # Write API key to .env (never in config.yaml)
766
+ if api_key:
767
+ env_path = Path(".env")
768
+ env_lines = []
769
+ if env_path.exists():
770
+ for line in env_path.read_text().splitlines():
771
+ if not line.startswith("ANTHROPIC_API_KEY="):
772
+ env_lines.append(line)
773
+ env_lines.append(f"ANTHROPIC_API_KEY={api_key}")
774
+ env_path.write_text("\n".join(env_lines) + "\n")
775
+ # Also set in current process
776
+ os.environ["ANTHROPIC_API_KEY"] = api_key
777
+ import src.config
778
+ src.config.ANTHROPIC_API_KEY = api_key
779
+
780
+ # Build config.yaml
781
+ domains_data = body.get("domains", {})
782
+ schedule_cron = body.get("schedule", "0 22 * * 0")
783
+
784
+ config_data = {
785
+ "domains": {
786
+ "aiml": {
787
+ "enabled": domains_data.get("aiml", {}).get("enabled", True),
788
+ "label": "AI / ML",
789
+ "sources": ["huggingface", "arxiv"],
790
+ "arxiv_categories": ["cs.CV", "cs.CL", "cs.LG"],
791
+ "scoring_axes": _build_axes_config("aiml", domains_data),
792
+ "include_patterns": [],
793
+ "exclude_patterns": [],
794
+ "preferences": {"boost_topics": [], "penalize_topics": []},
795
+ },
796
+ "security": {
797
+ "enabled": domains_data.get("security", {}).get("enabled", True),
798
+ "label": "Security",
799
+ "sources": ["arxiv"],
800
+ "arxiv_categories": ["cs.CR"],
801
+ "scoring_axes": _build_axes_config("security", domains_data),
802
+ "include_patterns": [],
803
+ "exclude_patterns": [],
804
+ "preferences": {"boost_topics": [], "penalize_topics": []},
805
+ },
806
+ },
807
+ "github": {"enabled": body.get("github", {}).get("enabled", True)},
808
+ "schedule": {"cron": schedule_cron or ""},
809
+ "database": {"path": "data/researcher.db"},
810
+ "web": {"host": "0.0.0.0", "port": 8888},
811
+ }
812
+
813
+ from src.config import save_config
814
+ save_config(config_data)
815
+
816
+ return JSONResponse({"status": "ok"})
817
+ except Exception as e:
818
+ log.exception("Setup save failed")
819
+ return JSONResponse({"status": "error", "error": str(e)[:200]})
820
+
821
+
822
+ def _build_axes_config(domain: str, domains_data: dict) -> list[dict]:
823
+ """Build scoring axes config from wizard form data."""
824
+ d = domains_data.get(domain, {})
825
+ weights = d.get("scoring_weights", [])
826
+
827
+ if domain == "aiml":
828
+ defaults = [
829
+ {"name": "Code & Weights", "weight": 0.30, "description": "Open weights on HF, code on GitHub"},
830
+ {"name": "Novelty", "weight": 0.35, "description": "Paradigm shifts over incremental"},
831
+ {"name": "Practical Applicability", "weight": 0.35, "description": "Usable by practitioners soon"},
832
+ ]
833
+ else:
834
+ defaults = [
835
+ {"name": "Has Code/PoC", "weight": 0.25, "description": "Working tools, repos, artifacts"},
836
+ {"name": "Novel Attack Surface", "weight": 0.40, "description": "First-of-kind research"},
837
+ {"name": "Real-World Impact", "weight": 0.35, "description": "Affects production systems"},
838
+ ]
839
+
840
+ for i, ax in enumerate(defaults):
841
+ if i < len(weights):
842
+ ax["weight"] = round(weights[i], 2)
843
+
844
+ return defaults
845
+
846
+
847
+ # ---------------------------------------------------------------------------
848
+ # Seed preferences
849
+ # ---------------------------------------------------------------------------
850
+
851
+
852
+ @app.get("/seed-preferences", response_class=HTMLResponse)
853
+ async def seed_preferences_page(request: Request):
854
+ """Show seed papers for preference bootstrapping."""
855
+ seed_path = Path("data/seed_papers.json")
856
+ papers = []
857
+ if seed_path.exists():
858
+ papers = json.loads(seed_path.read_text())
859
+ return templates.TemplateResponse("seed_preferences.html", {
860
+ "request": request,
861
+ "active": "preferences",
862
+ "papers": papers,
863
+ })
864
+
865
+
866
+ @app.post("/api/seed-preferences")
867
+ async def save_seed_preferences(request: Request):
868
+ """Bulk-insert seed preference signals."""
869
+ body = await request.json()
870
+ ratings = body.get("ratings", {})
871
+
872
+ # Find papers in DB by arxiv_id
873
+ from src.db import get_conn
874
+ inserted = 0
875
+ with get_conn() as conn:
876
+ for arxiv_id, action in ratings.items():
877
+ if action not in ("upvote", "downvote"):
878
+ continue
879
+ row = conn.execute(
880
+ "SELECT id FROM papers WHERE arxiv_id=? LIMIT 1",
881
+ (arxiv_id,),
882
+ ).fetchone()
883
+ if row:
884
+ insert_signal(row["id"], action)
885
+ inserted += 1
886
+
887
+ if inserted > 0:
888
+ compute_preferences()
889
+
890
+ return JSONResponse({"status": "ok", "count": inserted})
891
+
892
+
893
+ # ---------------------------------------------------------------------------
894
+ # Report generation
895
+ # ---------------------------------------------------------------------------
896
+
897
+
898
+ def _generate_report(run_id: int, domain: str):
899
+ """Generate a markdown report and save to data/weeks/."""
900
+ from src.db import get_run
901
+ run = get_run(run_id)
902
+ if not run:
903
+ return
904
+
905
+ papers = get_top_papers(domain, run_id=run_id, limit=20)
906
+ if not papers:
907
+ return
908
+
909
+ config = SCORING_CONFIGS[domain]
910
+ axis_labels = config["axis_labels"]
911
+ date_start = run["date_start"]
912
+ date_end = run["date_end"]
913
+
914
+ if domain == "aiml":
915
+ title = f"AI/ML Research Weekly: {date_start} – {date_end}"
916
+ else:
917
+ title = f"Security Research Weekly: {date_start} – {date_end}"
918
+
919
+ lines = [f"# {title}\n\n"]
920
+ lines.append(f"> **{run.get('paper_count', len(papers))}** papers analyzed and scored.\n\n")
921
+
922
+ # Top 5
923
+ top5 = papers[:5]
924
+ honorable = papers[5:20]
925
+
926
+ lines.append("## Top Papers\n\n")
927
+ for i, p in enumerate(top5, 1):
928
+ authors = p.get("authors", [])
929
+ if isinstance(authors, str):
930
+ authors_str = authors
931
+ elif len(authors) > 3:
932
+ authors_str = ", ".join(authors[:3]) + " et al."
933
+ else:
934
+ authors_str = ", ".join(authors)
935
+
936
+ lines.append(f"### {i}. {p['title']}\n\n")
937
+ lines.append(f"**Authors:** {authors_str}\n")
938
+ arxiv_id = p.get("arxiv_id", "")
939
+ lines.append(f"**arXiv:** [{arxiv_id}](https://arxiv.org/abs/{arxiv_id})\n")
940
+ if p.get("code_url"):
941
+ lines.append(f"**Code:** [{p['code_url']}]({p['code_url']})\n")
942
+ lines.append("\n")
943
+
944
+ if p.get("summary"):
945
+ lines.append(f"> {p['summary']}\n\n")
946
+
947
+ lines.append("| Metric | Score | |\n|--------|-------|-|\n")
948
+ for j, label in enumerate(axis_labels):
949
+ val = p.get(f"score_axis_{j+1}", 0) or 0
950
+ bar = score_bar(val)
951
+ lines.append(f"| {label} | {val}/10 | `{bar}` |\n")
952
+ comp = p.get("composite", 0) or 0
953
+ lines.append(f"| **Composite** | **{comp}/10** | `{score_bar(comp)}` |\n\n")
954
+
955
+ if p.get("reasoning"):
956
+ lines.append(f"*{p['reasoning']}*\n\n")
957
+ lines.append("---\n\n")
958
+
959
+ # Honorable mentions
960
+ if honorable:
961
+ lines.append("## Honorable Mentions\n\n")
962
+ lines.append("| # | Paper | Score | Summary |\n")
963
+ lines.append("|---|-------|-------|---------|\n")
964
+ for i, p in enumerate(honorable, 6):
965
+ t = p["title"][:80].replace("|", "\\|")
966
+ if len(p["title"]) > 80:
967
+ t += "..."
968
+ s = (p.get("summary") or "")[:120].replace("|", "\\|")
969
+ if len(p.get("summary") or "") > 120:
970
+ s += "..."
971
+ aid = p.get("arxiv_id", "")
972
+ lines.append(f"| {i} | [{t}](https://arxiv.org/abs/{aid}) | {p.get('composite', 0) or 0} | {s} |\n")
973
+ lines.append("\n")
974
+
975
+ lines.append("---\n*Generated by Research Intelligence*\n")
976
+
977
+ report = "".join(lines)
978
+
979
+ weeks_dir = DATA_DIR / "weeks"
980
+ weeks_dir.mkdir(parents=True, exist_ok=True)
981
+ filename = f"{date_start}-{domain}.md"
982
+ (weeks_dir / filename).write_text(report)
983
+ log.info("Report written to %s", weeks_dir / filename)
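Note on the `/weeks/{filename}` handler above: its path-containment check can be exercised in isolation. The following is a minimal sketch, not code from this commit. The helper name `resolve_archive` and its `root` parameter are hypothetical; it assumes Python 3.9+ for `Path.is_relative_to` and mirrors the handler's three conditions (stays inside the archive directory, has a `.md` suffix, exists as a regular file).

```python
from pathlib import Path
from typing import Optional


def resolve_archive(root: Path, filename: str) -> Optional[Path]:
    """Return the resolved archive path, or None if the request is not allowed.

    Hypothetical helper for illustration; `root` stands in for DATA_DIR / "weeks".
    """
    root = root.resolve()
    # Resolve first so that ".." segments and symlinks are collapsed
    # before the containment check runs.
    candidate = (root / filename).resolve()
    if not candidate.is_relative_to(root):
        return None  # escapes the archive directory
    if candidate.suffix != ".md":
        return None  # only markdown reports are served
    if not candidate.is_file():
        return None  # missing file, or a directory
    return candidate
```

A handler could call this helper and redirect to `/weeks` whenever it returns `None`, which keeps the security check in one unit-testable place.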
src/web/static/favicon-192.png ADDED
src/web/static/favicon-512.png ADDED
src/web/static/favicon.svg ADDED
src/web/static/htmx.min.js ADDED
@@ -0,0 +1 @@
 
 
1
+ var htmx=function(){"use strict";const Q={onLoad:null,process:null,on:null,off:null,trigger:null,ajax:null,find:null,findAll:null,closest:null,values:function(e,t){const n=cn(e,t||"post");return n.values},remove:null,addClass:null,removeClass:null,toggleClass:null,takeClass:null,swap:null,defineExtension:null,removeExtension:null,logAll:null,logNone:null,logger:null,config:{historyEnabled:true,historyCacheSize:10,refreshOnHistoryMiss:false,defaultSwapStyle:"innerHTML",defaultSwapDelay:0,defaultSettleDelay:20,includeIndicatorStyles:true,indicatorClass:"htmx-indicator",requestClass:"htmx-request",addedClass:"htmx-added",settlingClass:"htmx-settling",swappingClass:"htmx-swapping",allowEval:true,allowScriptTags:true,inlineScriptNonce:"",inlineStyleNonce:"",attributesToSettle:["class","style","width","height"],withCredentials:false,timeout:0,wsReconnectDelay:"full-jitter",wsBinaryType:"blob",disableSelector:"[hx-disable], [data-hx-disable]",scrollBehavior:"instant",defaultFocusScroll:false,getCacheBusterParam:false,globalViewTransitions:false,methodsThatUseUrlParams:["get","delete"],selfRequestsOnly:true,ignoreTitle:false,scrollIntoViewOnBoost:true,triggerSpecsCache:null,disableInheritance:false,responseHandling:[{code:"204",swap:false},{code:"[23]..",swap:true},{code:"[45]..",swap:false,error:true}],allowNestedOobSwaps:true},parseInterval:null,_:null,version:"2.0.4"};Q.onLoad=j;Q.process=kt;Q.on=ye;Q.off=be;Q.trigger=he;Q.ajax=Rn;Q.find=u;Q.findAll=x;Q.closest=g;Q.remove=z;Q.addClass=K;Q.removeClass=G;Q.toggleClass=W;Q.takeClass=Z;Q.swap=$e;Q.defineExtension=Fn;Q.removeExtension=Bn;Q.logAll=V;Q.logNone=_;Q.parseInterval=d;Q._=e;const 
n={addTriggerHandler:St,bodyContains:le,canAccessLocalStorage:B,findThisElement:Se,filterValues:hn,swap:$e,hasAttribute:s,getAttributeValue:te,getClosestAttributeValue:re,getClosestMatch:o,getExpressionVars:En,getHeaders:fn,getInputValues:cn,getInternalData:ie,getSwapSpecification:gn,getTriggerSpecs:st,getTarget:Ee,makeFragment:P,mergeObjects:ce,makeSettleInfo:xn,oobSwap:He,querySelectorExt:ae,settleImmediately:Kt,shouldCancel:ht,triggerEvent:he,triggerErrorEvent:fe,withExtensions:Ft};const r=["get","post","put","delete","patch"];const H=r.map(function(e){return"[hx-"+e+"], [data-hx-"+e+"]"}).join(", ");function d(e){if(e==undefined){return undefined}let t=NaN;if(e.slice(-2)=="ms"){t=parseFloat(e.slice(0,-2))}else if(e.slice(-1)=="s"){t=parseFloat(e.slice(0,-1))*1e3}else if(e.slice(-1)=="m"){t=parseFloat(e.slice(0,-1))*1e3*60}else{t=parseFloat(e)}return isNaN(t)?undefined:t}function ee(e,t){return e instanceof Element&&e.getAttribute(t)}function s(e,t){return!!e.hasAttribute&&(e.hasAttribute(t)||e.hasAttribute("data-"+t))}function te(e,t){return ee(e,t)||ee(e,"data-"+t)}function c(e){const t=e.parentElement;if(!t&&e.parentNode instanceof ShadowRoot)return e.parentNode;return t}function ne(){return document}function m(e,t){return e.getRootNode?e.getRootNode({composed:t}):ne()}function o(e,t){while(e&&!t(e)){e=c(e)}return e||null}function i(e,t,n){const r=te(t,n);const o=te(t,"hx-disinherit");var i=te(t,"hx-inherit");if(e!==t){if(Q.config.disableInheritance){if(i&&(i==="*"||i.split(" ").indexOf(n)>=0)){return r}else{return null}}if(o&&(o==="*"||o.split(" ").indexOf(n)>=0)){return"unset"}}return r}function re(t,n){let r=null;o(t,function(e){return!!(r=i(t,ue(e),n))});if(r!=="unset"){return r}}function h(e,t){const n=e instanceof Element&&(e.matches||e.matchesSelector||e.msMatchesSelector||e.mozMatchesSelector||e.webkitMatchesSelector||e.oMatchesSelector);return!!n&&n.call(e,t)}function T(e){const t=/<([a-z][^\/\0>\x20\t\r\n\f]*)/i;const n=t.exec(e);if(n){return 
n[1].toLowerCase()}else{return""}}function q(e){const t=new DOMParser;return t.parseFromString(e,"text/html")}function L(e,t){while(t.childNodes.length>0){e.append(t.childNodes[0])}}function A(e){const t=ne().createElement("script");se(e.attributes,function(e){t.setAttribute(e.name,e.value)});t.textContent=e.textContent;t.async=false;if(Q.config.inlineScriptNonce){t.nonce=Q.config.inlineScriptNonce}return t}function N(e){return e.matches("script")&&(e.type==="text/javascript"||e.type==="module"||e.type==="")}function I(e){Array.from(e.querySelectorAll("script")).forEach(e=>{if(N(e)){const t=A(e);const n=e.parentNode;try{n.insertBefore(t,e)}catch(e){O(e)}finally{e.remove()}}})}function P(e){const t=e.replace(/<head(\s[^>]*)?>[\s\S]*?<\/head>/i,"");const n=T(t);let r;if(n==="html"){r=new DocumentFragment;const i=q(e);L(r,i.body);r.title=i.title}else if(n==="body"){r=new DocumentFragment;const i=q(t);L(r,i.body);r.title=i.title}else{const i=q('<body><template class="internal-htmx-wrapper">'+t+"</template></body>");r=i.querySelector("template").content;r.title=i.title;var o=r.querySelector("title");if(o&&o.parentNode===r){o.remove();r.title=o.innerText}}if(r){if(Q.config.allowScriptTags){I(r)}else{r.querySelectorAll("script").forEach(e=>e.remove())}}return r}function oe(e){if(e){e()}}function t(e,t){return Object.prototype.toString.call(e)==="[object "+t+"]"}function k(e){return typeof e==="function"}function D(e){return t(e,"Object")}function ie(e){const t="htmx-internal-data";let n=e[t];if(!n){n=e[t]={}}return n}function M(t){const n=[];if(t){for(let e=0;e<t.length;e++){n.push(t[e])}}return n}function se(t,n){if(t){for(let e=0;e<t.length;e++){n(t[e])}}}function X(e){const t=e.getBoundingClientRect();const n=t.top;const r=t.bottom;return n<window.innerHeight&&r>=0}function le(e){return e.getRootNode({composed:true})===document}function F(e){return e.trim().split(/\s+/)}function ce(e,t){for(const n in t){if(t.hasOwnProperty(n)){e[n]=t[n]}}return e}function 
S(e){try{return JSON.parse(e)}catch(e){O(e);return null}}function B(){const e="htmx:localStorageTest";try{localStorage.setItem(e,e);localStorage.removeItem(e);return true}catch(e){return false}}function U(t){try{const e=new URL(t);if(e){t=e.pathname+e.search}if(!/^\/$/.test(t)){t=t.replace(/\/+$/,"")}return t}catch(e){return t}}function e(e){return vn(ne().body,function(){return eval(e)})}function j(t){const e=Q.on("htmx:load",function(e){t(e.detail.elt)});return e}function V(){Q.logger=function(e,t,n){if(console){console.log(t,e,n)}}}function _(){Q.logger=null}function u(e,t){if(typeof e!=="string"){return e.querySelector(t)}else{return u(ne(),e)}}function x(e,t){if(typeof e!=="string"){return e.querySelectorAll(t)}else{return x(ne(),e)}}function E(){return window}function z(e,t){e=y(e);if(t){E().setTimeout(function(){z(e);e=null},t)}else{c(e).removeChild(e)}}function ue(e){return e instanceof Element?e:null}function $(e){return e instanceof HTMLElement?e:null}function J(e){return typeof e==="string"?e:null}function f(e){return e instanceof Element||e instanceof Document||e instanceof DocumentFragment?e:null}function K(e,t,n){e=ue(y(e));if(!e){return}if(n){E().setTimeout(function(){K(e,t);e=null},n)}else{e.classList&&e.classList.add(t)}}function G(e,t,n){let r=ue(y(e));if(!r){return}if(n){E().setTimeout(function(){G(r,t);r=null},n)}else{if(r.classList){r.classList.remove(t);if(r.classList.length===0){r.removeAttribute("class")}}}}function W(e,t){e=y(e);e.classList.toggle(t)}function Z(e,t){e=y(e);se(e.parentElement.children,function(e){G(e,t)});K(ue(e),t)}function g(e,t){e=ue(y(e));if(e&&e.closest){return e.closest(t)}else{do{if(e==null||h(e,t)){return e}}while(e=e&&ue(c(e)));return null}}function l(e,t){return e.substring(0,t.length)===t}function Y(e,t){return e.substring(e.length-t.length)===t}function ge(e){const t=e.trim();if(l(t,"<")&&Y(t,"/>")){return t.substring(1,t.length-2)}else{return t}}function p(t,r,n){if(r.indexOf("global ")===0){return 
p(t,r.slice(7),true)}t=y(t);const o=[];{let t=0;let n=0;for(let e=0;e<r.length;e++){const l=r[e];if(l===","&&t===0){o.push(r.substring(n,e));n=e+1;continue}if(l==="<"){t++}else if(l==="/"&&e<r.length-1&&r[e+1]===">"){t--}}if(n<r.length){o.push(r.substring(n))}}const i=[];const s=[];while(o.length>0){const r=ge(o.shift());let e;if(r.indexOf("closest ")===0){e=g(ue(t),ge(r.substr(8)))}else if(r.indexOf("find ")===0){e=u(f(t),ge(r.substr(5)))}else if(r==="next"||r==="nextElementSibling"){e=ue(t).nextElementSibling}else if(r.indexOf("next ")===0){e=pe(t,ge(r.substr(5)),!!n)}else if(r==="previous"||r==="previousElementSibling"){e=ue(t).previousElementSibling}else if(r.indexOf("previous ")===0){e=me(t,ge(r.substr(9)),!!n)}else if(r==="document"){e=document}else if(r==="window"){e=window}else if(r==="body"){e=document.body}else if(r==="root"){e=m(t,!!n)}else if(r==="host"){e=t.getRootNode().host}else{s.push(r)}if(e){i.push(e)}}if(s.length>0){const e=s.join(",");const c=f(m(t,!!n));i.push(...M(c.querySelectorAll(e)))}return i}var pe=function(t,e,n){const r=f(m(t,n)).querySelectorAll(e);for(let e=0;e<r.length;e++){const o=r[e];if(o.compareDocumentPosition(t)===Node.DOCUMENT_POSITION_PRECEDING){return o}}};var me=function(t,e,n){const r=f(m(t,n)).querySelectorAll(e);for(let e=r.length-1;e>=0;e--){const o=r[e];if(o.compareDocumentPosition(t)===Node.DOCUMENT_POSITION_FOLLOWING){return o}}};function ae(e,t){if(typeof e!=="string"){return p(e,t)[0]}else{return p(ne().body,e)[0]}}function y(e,t){if(typeof e==="string"){return u(f(t)||document,e)}else{return e}}function xe(e,t,n,r){if(k(t)){return{target:ne().body,event:J(e),listener:t,options:n}}else{return{target:y(e),event:J(t),listener:n,options:r}}}function ye(t,n,r,o){Vn(function(){const e=xe(t,n,r,o);e.target.addEventListener(e.event,e.listener,e.options)});const e=k(n);return e?n:r}function be(t,n,r){Vn(function(){const e=xe(t,n,r);e.target.removeEventListener(e.event,e.listener)});return k(n)?n:r}const 
ve=ne().createElement("output");function we(e,t){const n=re(e,t);if(n){if(n==="this"){return[Se(e,t)]}else{const r=p(e,n);if(r.length===0){O('The selector "'+n+'" on '+t+" returned no matches!");return[ve]}else{return r}}}}function Se(e,t){return ue(o(e,function(e){return te(ue(e),t)!=null}))}function Ee(e){const t=re(e,"hx-target");if(t){if(t==="this"){return Se(e,"hx-target")}else{return ae(e,t)}}else{const n=ie(e);if(n.boosted){return ne().body}else{return e}}}function Ce(t){const n=Q.config.attributesToSettle;for(let e=0;e<n.length;e++){if(t===n[e]){return true}}return false}function Oe(t,n){se(t.attributes,function(e){if(!n.hasAttribute(e.name)&&Ce(e.name)){t.removeAttribute(e.name)}});se(n.attributes,function(e){if(Ce(e.name)){t.setAttribute(e.name,e.value)}})}function Re(t,e){const n=Un(e);for(let e=0;e<n.length;e++){const r=n[e];try{if(r.isInlineSwap(t)){return true}}catch(e){O(e)}}return t==="outerHTML"}function He(e,o,i,t){t=t||ne();let n="#"+ee(o,"id");let s="outerHTML";if(e==="true"){}else if(e.indexOf(":")>0){s=e.substring(0,e.indexOf(":"));n=e.substring(e.indexOf(":")+1)}else{s=e}o.removeAttribute("hx-swap-oob");o.removeAttribute("data-hx-swap-oob");const r=p(t,n,false);if(r){se(r,function(e){let t;const n=o.cloneNode(true);t=ne().createDocumentFragment();t.appendChild(n);if(!Re(s,e)){t=f(n)}const r={shouldSwap:true,target:e,fragment:t};if(!he(e,"htmx:oobBeforeSwap",r))return;e=r.target;if(r.shouldSwap){qe(t);_e(s,e,e,t,i);Te()}se(i.elts,function(e){he(e,"htmx:oobAfterSwap",r)})});o.parentNode.removeChild(o)}else{o.parentNode.removeChild(o);fe(ne().body,"htmx:oobErrorNoTarget",{content:o})}return e}function Te(){const e=u("#--htmx-preserve-pantry--");if(e){for(const t of[...e.children]){const n=u("#"+t.id);n.parentNode.moveBefore(t,n);n.remove()}e.remove()}}function qe(e){se(x(e,"[hx-preserve], [data-hx-preserve]"),function(e){const t=te(e,"id");const n=ne().getElementById(t);if(n!=null){if(e.moveBefore){let 
e=u("#--htmx-preserve-pantry--");if(e==null){ne().body.insertAdjacentHTML("afterend","<div id='--htmx-preserve-pantry--'></div>");e=u("#--htmx-preserve-pantry--")}e.moveBefore(n,null)}else{e.parentNode.replaceChild(n,e)}}})}function Le(l,e,c){se(e.querySelectorAll("[id]"),function(t){const n=ee(t,"id");if(n&&n.length>0){const r=n.replace("'","\\'");const o=t.tagName.replace(":","\\:");const e=f(l);const i=e&&e.querySelector(o+"[id='"+r+"']");if(i&&i!==e){const s=t.cloneNode();Oe(t,i);c.tasks.push(function(){Oe(t,s)})}}})}function Ae(e){return function(){G(e,Q.config.addedClass);kt(ue(e));Ne(f(e));he(e,"htmx:load")}}function Ne(e){const t="[autofocus]";const n=$(h(e,t)?e:e.querySelector(t));if(n!=null){n.focus()}}function a(e,t,n,r){Le(e,n,r);while(n.childNodes.length>0){const o=n.firstChild;K(ue(o),Q.config.addedClass);e.insertBefore(o,t);if(o.nodeType!==Node.TEXT_NODE&&o.nodeType!==Node.COMMENT_NODE){r.tasks.push(Ae(o))}}}function Ie(e,t){let n=0;while(n<e.length){t=(t<<5)-t+e.charCodeAt(n++)|0}return t}function Pe(t){let n=0;if(t.attributes){for(let e=0;e<t.attributes.length;e++){const r=t.attributes[e];if(r.value){n=Ie(r.name,n);n=Ie(r.value,n)}}}return n}function ke(t){const n=ie(t);if(n.onHandlers){for(let e=0;e<n.onHandlers.length;e++){const r=n.onHandlers[e];be(t,r.event,r.listener)}delete n.onHandlers}}function De(e){const t=ie(e);if(t.timeout){clearTimeout(t.timeout)}if(t.listenerInfos){se(t.listenerInfos,function(e){if(e.on){be(e.on,e.trigger,e.listener)}})}ke(e);se(Object.keys(t),function(e){if(e!=="firstInitCompleted")delete t[e]})}function b(e){he(e,"htmx:beforeCleanupElement");De(e);if(e.children){se(e.children,function(e){b(e)})}}function Me(t,e,n){if(t instanceof Element&&t.tagName==="BODY"){return Ve(t,e,n)}let r;const o=t.previousSibling;const i=c(t);if(!i){return}a(i,t,e,n);if(o==null){r=i.firstChild}else{r=o.nextSibling}n.elts=n.elts.filter(function(e){return e!==t});while(r&&r!==t){if(r instanceof 
Element){n.elts.push(r)}r=r.nextSibling}b(t);if(t instanceof Element){t.remove()}else{t.parentNode.removeChild(t)}}function Xe(e,t,n){return a(e,e.firstChild,t,n)}function Fe(e,t,n){return a(c(e),e,t,n)}function Be(e,t,n){return a(e,null,t,n)}function Ue(e,t,n){return a(c(e),e.nextSibling,t,n)}function je(e){b(e);const t=c(e);if(t){return t.removeChild(e)}}function Ve(e,t,n){const r=e.firstChild;a(e,r,t,n);if(r){while(r.nextSibling){b(r.nextSibling);e.removeChild(r.nextSibling)}b(r);e.removeChild(r)}}function _e(t,e,n,r,o){switch(t){case"none":return;case"outerHTML":Me(n,r,o);return;case"afterbegin":Xe(n,r,o);return;case"beforebegin":Fe(n,r,o);return;case"beforeend":Be(n,r,o);return;case"afterend":Ue(n,r,o);return;case"delete":je(n);return;default:var i=Un(e);for(let e=0;e<i.length;e++){const s=i[e];try{const l=s.handleSwap(t,n,r,o);if(l){if(Array.isArray(l)){for(let e=0;e<l.length;e++){const c=l[e];if(c.nodeType!==Node.TEXT_NODE&&c.nodeType!==Node.COMMENT_NODE){o.tasks.push(Ae(c))}}}return}}catch(e){O(e)}}if(t==="innerHTML"){Ve(n,r,o)}else{_e(Q.config.defaultSwapStyle,e,n,r,o)}}}function ze(e,n,r){var t=x(e,"[hx-swap-oob], [data-hx-swap-oob]");se(t,function(e){if(Q.config.allowNestedOobSwaps||e.parentElement===null){const t=te(e,"hx-swap-oob");if(t!=null){He(t,e,n,r)}}else{e.removeAttribute("hx-swap-oob");e.removeAttribute("data-hx-swap-oob")}});return t.length>0}function $e(e,t,r,o){if(!o){o={}}e=y(e);const i=o.contextElement?m(o.contextElement,false):ne();const n=document.activeElement;let s={};try{s={elt:n,start:n?n.selectionStart:null,end:n?n.selectionEnd:null}}catch(e){}const l=xn(e);if(r.swapStyle==="textContent"){e.textContent=t}else{let n=P(t);l.title=n.title;if(o.selectOOB){const u=o.selectOOB.split(",");for(let t=0;t<u.length;t++){const a=u[t].split(":",2);let e=a[0].trim();if(e.indexOf("#")===0){e=e.substring(1)}const f=a[1]||"true";const 
h=n.querySelector("#"+e);if(h){He(f,h,l,i)}}}ze(n,l,i);se(x(n,"template"),function(e){if(e.content&&ze(e.content,l,i)){e.remove()}});if(o.select){const d=ne().createDocumentFragment();se(n.querySelectorAll(o.select),function(e){d.appendChild(e)});n=d}qe(n);_e(r.swapStyle,o.contextElement,e,n,l);Te()}if(s.elt&&!le(s.elt)&&ee(s.elt,"id")){const g=document.getElementById(ee(s.elt,"id"));const p={preventScroll:r.focusScroll!==undefined?!r.focusScroll:!Q.config.defaultFocusScroll};if(g){if(s.start&&g.setSelectionRange){try{g.setSelectionRange(s.start,s.end)}catch(e){}}g.focus(p)}}e.classList.remove(Q.config.swappingClass);se(l.elts,function(e){if(e.classList){e.classList.add(Q.config.settlingClass)}he(e,"htmx:afterSwap",o.eventInfo)});if(o.afterSwapCallback){o.afterSwapCallback()}if(!r.ignoreTitle){kn(l.title)}const c=function(){se(l.tasks,function(e){e.call()});se(l.elts,function(e){if(e.classList){e.classList.remove(Q.config.settlingClass)}he(e,"htmx:afterSettle",o.eventInfo)});if(o.anchor){const e=ue(y("#"+o.anchor));if(e){e.scrollIntoView({block:"start",behavior:"auto"})}}yn(l.elts,r);if(o.afterSettleCallback){o.afterSettleCallback()}};if(r.settleDelay>0){E().setTimeout(c,r.settleDelay)}else{c()}}function Je(e,t,n){const r=e.getResponseHeader(t);if(r.indexOf("{")===0){const o=S(r);for(const i in o){if(o.hasOwnProperty(i)){let e=o[i];if(D(e)){n=e.target!==undefined?e.target:n}else{e={value:e}}he(n,i,e)}}}else{const s=r.split(",");for(let e=0;e<s.length;e++){he(n,s[e].trim(),[])}}}const Ke=/\s/;const v=/[\s,]/;const Ge=/[_$a-zA-Z]/;const We=/[_$a-zA-Z0-9]/;const Ze=['"',"'","/"];const w=/[^\s]/;const Ye=/[{(]/;const Qe=/[})]/;function et(e){const t=[];let n=0;while(n<e.length){if(Ge.exec(e.charAt(n))){var r=n;while(We.exec(e.charAt(n+1))){n++}t.push(e.substring(r,n+1))}else if(Ze.indexOf(e.charAt(n))!==-1){const o=e.charAt(n);var r=n;n++;while(n<e.length&&e.charAt(n)!==o){if(e.charAt(n)==="\\"){n++}n++}t.push(e.substring(r,n+1))}else{const 
i=e.charAt(n);t.push(i)}n++}return t}function tt(e,t,n){return Ge.exec(e.charAt(0))&&e!=="true"&&e!=="false"&&e!=="this"&&e!==n&&t!=="."}function nt(r,o,i){if(o[0]==="["){o.shift();let e=1;let t=" return (function("+i+"){ return (";let n=null;while(o.length>0){const s=o[0];if(s==="]"){e--;if(e===0){if(n===null){t=t+"true"}o.shift();t+=")})";try{const l=vn(r,function(){return Function(t)()},function(){return true});l.source=t;return l}catch(e){fe(ne().body,"htmx:syntax:error",{error:e,source:t});return null}}}else if(s==="["){e++}if(tt(s,n,i)){t+="(("+i+"."+s+") ? ("+i+"."+s+") : (window."+s+"))"}else{t=t+s}n=o.shift()}}}function C(e,t){let n="";while(e.length>0&&!t.test(e[0])){n+=e.shift()}return n}function rt(e){let t;if(e.length>0&&Ye.test(e[0])){e.shift();t=C(e,Qe).trim();e.shift()}else{t=C(e,v)}return t}const ot="input, textarea, select";function it(e,t,n){const r=[];const o=et(t);do{C(o,w);const l=o.length;const c=C(o,/[,\[\s]/);if(c!==""){if(c==="every"){const u={trigger:"every"};C(o,w);u.pollInterval=d(C(o,/[,\[\s]/));C(o,w);var i=nt(e,o,"event");if(i){u.eventFilter=i}r.push(u)}else{const a={trigger:c};var i=nt(e,o,"event");if(i){a.eventFilter=i}C(o,w);while(o.length>0&&o[0]!==","){const f=o.shift();if(f==="changed"){a.changed=true}else if(f==="once"){a.once=true}else if(f==="consume"){a.consume=true}else if(f==="delay"&&o[0]===":"){o.shift();a.delay=d(C(o,v))}else if(f==="from"&&o[0]===":"){o.shift();if(Ye.test(o[0])){var s=rt(o)}else{var s=C(o,v);if(s==="closest"||s==="find"||s==="next"||s==="previous"){o.shift();const h=rt(o);if(h.length>0){s+=" "+h}}}a.from=s}else if(f==="target"&&o[0]===":"){o.shift();a.target=rt(o)}else if(f==="throttle"&&o[0]===":"){o.shift();a.throttle=d(C(o,v))}else if(f==="queue"&&o[0]===":"){o.shift();a.queue=C(o,v)}else if(f==="root"&&o[0]===":"){o.shift();a[f]=rt(o)}else 
if(f==="threshold"&&o[0]===":"){o.shift();a[f]=C(o,v)}else{fe(e,"htmx:syntax:error",{token:o.shift()})}C(o,w)}r.push(a)}}if(o.length===l){fe(e,"htmx:syntax:error",{token:o.shift()})}C(o,w)}while(o[0]===","&&o.shift());if(n){n[t]=r}return r}function st(e){const t=te(e,"hx-trigger");let n=[];if(t){const r=Q.config.triggerSpecsCache;n=r&&r[t]||it(e,t,r)}if(n.length>0){return n}else if(h(e,"form")){return[{trigger:"submit"}]}else if(h(e,'input[type="button"], input[type="submit"]')){return[{trigger:"click"}]}else if(h(e,ot)){return[{trigger:"change"}]}else{return[{trigger:"click"}]}}function lt(e){ie(e).cancelled=true}function ct(e,t,n){const r=ie(e);r.timeout=E().setTimeout(function(){if(le(e)&&r.cancelled!==true){if(!gt(n,e,Mt("hx:poll:trigger",{triggerSpec:n,target:e}))){t(e)}ct(e,t,n)}},n.pollInterval)}function ut(e){return location.hostname===e.hostname&&ee(e,"href")&&ee(e,"href").indexOf("#")!==0}function at(e){return g(e,Q.config.disableSelector)}function ft(t,n,e){if(t instanceof HTMLAnchorElement&&ut(t)&&(t.target===""||t.target==="_self")||t.tagName==="FORM"&&String(ee(t,"method")).toLowerCase()!=="dialog"){n.boosted=true;let r,o;if(t.tagName==="A"){r="get";o=ee(t,"href")}else{const i=ee(t,"method");r=i?i.toLowerCase():"get";o=ee(t,"action");if(o==null||o===""){o=ne().location.href}if(r==="get"&&o.includes("?")){o=o.replace(/\?[^#]+/,"")}}e.forEach(function(e){pt(t,function(e,t){const n=ue(e);if(at(n)){b(n);return}de(r,o,n,t)},n,e,true)})}}function ht(e,t){const n=ue(t);if(!n){return false}if(e.type==="submit"||e.type==="click"){if(n.tagName==="FORM"){return true}if(h(n,'input[type="submit"], button')&&(h(n,"[form]")||g(n,"form")!==null)){return true}if(n instanceof HTMLAnchorElement&&n.href&&(n.getAttribute("href")==="#"||n.getAttribute("href").indexOf("#")!==0)){return true}}return false}function dt(e,t){return ie(e).boosted&&e instanceof HTMLAnchorElement&&t.type==="click"&&(t.ctrlKey||t.metaKey)}function gt(e,t,n){const r=e.eventFilter;if(r){try{return 
r.call(t,n)!==true}catch(e){const o=r.source;fe(ne().body,"htmx:eventFilter:error",{error:e,source:o});return true}}return false}function pt(l,c,e,u,a){const f=ie(l);let t;if(u.from){t=p(l,u.from)}else{t=[l]}if(u.changed){if(!("lastValue"in f)){f.lastValue=new WeakMap}t.forEach(function(e){if(!f.lastValue.has(u)){f.lastValue.set(u,new WeakMap)}f.lastValue.get(u).set(e,e.value)})}se(t,function(i){const s=function(e){if(!le(l)){i.removeEventListener(u.trigger,s);return}if(dt(l,e)){return}if(a||ht(e,l)){e.preventDefault()}if(gt(u,l,e)){return}const t=ie(e);t.triggerSpec=u;if(t.handledFor==null){t.handledFor=[]}if(t.handledFor.indexOf(l)<0){t.handledFor.push(l);if(u.consume){e.stopPropagation()}if(u.target&&e.target){if(!h(ue(e.target),u.target)){return}}if(u.once){if(f.triggeredOnce){return}else{f.triggeredOnce=true}}if(u.changed){const n=event.target;const r=n.value;const o=f.lastValue.get(u);if(o.has(n)&&o.get(n)===r){return}o.set(n,r)}if(f.delayed){clearTimeout(f.delayed)}if(f.throttle){return}if(u.throttle>0){if(!f.throttle){he(l,"htmx:trigger");c(l,e);f.throttle=E().setTimeout(function(){f.throttle=null},u.throttle)}}else if(u.delay>0){f.delayed=E().setTimeout(function(){he(l,"htmx:trigger");c(l,e)},u.delay)}else{he(l,"htmx:trigger");c(l,e)}}};if(e.listenerInfos==null){e.listenerInfos=[]}e.listenerInfos.push({trigger:u.trigger,listener:s,on:i});i.addEventListener(u.trigger,s)})}let mt=false;let xt=null;function yt(){if(!xt){xt=function(){mt=true};window.addEventListener("scroll",xt);window.addEventListener("resize",xt);setInterval(function(){if(mt){mt=false;se(ne().querySelectorAll("[hx-trigger*='revealed'],[data-hx-trigger*='revealed']"),function(e){bt(e)})}},200)}}function bt(e){if(!s(e,"data-hx-revealed")&&X(e)){e.setAttribute("data-hx-revealed","true");const t=ie(e);if(t.initHash){he(e,"revealed")}else{e.addEventListener("htmx:afterProcessNode",function(){he(e,"revealed")},{once:true})}}}function vt(e,t,n,r){const 
o=function(){if(!n.loaded){n.loaded=true;he(e,"htmx:trigger");t(e)}};if(r>0){E().setTimeout(o,r)}else{o()}}function wt(t,n,e){let i=false;se(r,function(r){if(s(t,"hx-"+r)){const o=te(t,"hx-"+r);i=true;n.path=o;n.verb=r;e.forEach(function(e){St(t,e,n,function(e,t){const n=ue(e);if(g(n,Q.config.disableSelector)){b(n);return}de(r,o,n,t)})})}});return i}function St(r,e,t,n){if(e.trigger==="revealed"){yt();pt(r,n,t,e);bt(ue(r))}else if(e.trigger==="intersect"){const o={};if(e.root){o.root=ae(r,e.root)}if(e.threshold){o.threshold=parseFloat(e.threshold)}const i=new IntersectionObserver(function(t){for(let e=0;e<t.length;e++){const n=t[e];if(n.isIntersecting){he(r,"intersect");break}}},o);i.observe(ue(r));pt(ue(r),n,t,e)}else if(!t.firstInitCompleted&&e.trigger==="load"){if(!gt(e,r,Mt("load",{elt:r}))){vt(ue(r),n,t,e.delay)}}else if(e.pollInterval>0){t.polling=true;ct(ue(r),n,e)}else{pt(r,n,t,e)}}function Et(e){const t=ue(e);if(!t){return false}const n=t.attributes;for(let e=0;e<n.length;e++){const r=n[e].name;if(l(r,"hx-on:")||l(r,"data-hx-on:")||l(r,"hx-on-")||l(r,"data-hx-on-")){return true}}return false}const Ct=(new XPathEvaluator).createExpression('.//*[@*[ starts-with(name(), "hx-on:") or starts-with(name(), "data-hx-on:") or'+' starts-with(name(), "hx-on-") or starts-with(name(), "data-hx-on-") ]]');function Ot(e,t){if(Et(e)){t.push(ue(e))}const n=Ct.evaluate(e);let r=null;while(r=n.iterateNext())t.push(ue(r))}function Rt(e){const t=[];if(e instanceof DocumentFragment){for(const n of e.childNodes){Ot(n,t)}}else{Ot(e,t)}return t}function Ht(e){if(e.querySelectorAll){const n=", [hx-boost] a, [data-hx-boost] a, a[hx-boost], a[data-hx-boost]";const r=[];for(const i in Mn){const s=Mn[i];if(s.getSelectors){var t=s.getSelectors();if(t){r.push(t)}}}const o=e.querySelectorAll(H+n+", form, [type='submit'],"+" [hx-ext], [data-hx-ext], [hx-trigger], [data-hx-trigger]"+r.flat().map(e=>", "+e).join(""));return o}else{return[]}}function Tt(e){const t=g(ue(e.target),"button, 
input[type='submit']");const n=Lt(e);if(n){n.lastButtonClicked=t}}function qt(e){const t=Lt(e);if(t){t.lastButtonClicked=null}}function Lt(e){const t=g(ue(e.target),"button, input[type='submit']");if(!t){return}const n=y("#"+ee(t,"form"),t.getRootNode())||g(t,"form");if(!n){return}return ie(n)}function At(e){e.addEventListener("click",Tt);e.addEventListener("focusin",Tt);e.addEventListener("focusout",qt)}function Nt(t,e,n){const r=ie(t);if(!Array.isArray(r.onHandlers)){r.onHandlers=[]}let o;const i=function(e){vn(t,function(){if(at(t)){return}if(!o){o=new Function("event",n)}o.call(t,e)})};t.addEventListener(e,i);r.onHandlers.push({event:e,listener:i})}function It(t){ke(t);for(let e=0;e<t.attributes.length;e++){const n=t.attributes[e].name;const r=t.attributes[e].value;if(l(n,"hx-on")||l(n,"data-hx-on")){const o=n.indexOf("-on")+3;const i=n.slice(o,o+1);if(i==="-"||i===":"){let e=n.slice(o+1);if(l(e,":")){e="htmx"+e}else if(l(e,"-")){e="htmx:"+e.slice(1)}else if(l(e,"htmx-")){e="htmx:"+e.slice(5)}Nt(t,e,r)}}}}function Pt(t){if(g(t,Q.config.disableSelector)){b(t);return}const n=ie(t);const e=Pe(t);if(n.initHash!==e){De(t);n.initHash=e;he(t,"htmx:beforeProcessNode");const r=st(t);const o=wt(t,n,r);if(!o){if(re(t,"hx-boost")==="true"){ft(t,n,r)}else if(s(t,"hx-trigger")){r.forEach(function(e){St(t,e,n,function(){})})}}if(t.tagName==="FORM"||ee(t,"type")==="submit"&&s(t,"form")){At(t)}n.firstInitCompleted=true;he(t,"htmx:afterProcessNode")}}function kt(e){e=y(e);if(g(e,Q.config.disableSelector)){b(e);return}Pt(e);se(Ht(e),function(e){Pt(e)});se(Rt(e),It)}function Dt(e){return e.replace(/([a-z0-9])([A-Z])/g,"$1-$2").toLowerCase()}function Mt(e,t){let n;if(window.CustomEvent&&typeof window.CustomEvent==="function"){n=new CustomEvent(e,{bubbles:true,cancelable:true,composed:true,detail:t})}else{n=ne().createEvent("CustomEvent");n.initCustomEvent(e,true,true,t)}return n}function fe(e,t,n){he(e,t,ce({error:t},n))}function Xt(e){return e==="htmx:afterProcessNode"}function 
Ft(e,t){se(Un(e),function(e){try{t(e)}catch(e){O(e)}})}function O(e){if(console.error){console.error(e)}else if(console.log){console.log("ERROR: ",e)}}function he(e,t,n){e=y(e);if(n==null){n={}}n.elt=e;const r=Mt(t,n);if(Q.logger&&!Xt(t)){Q.logger(e,t,n)}if(n.error){O(n.error);he(e,"htmx:error",{errorInfo:n})}let o=e.dispatchEvent(r);const i=Dt(t);if(o&&i!==t){const s=Mt(i,r.detail);o=o&&e.dispatchEvent(s)}Ft(ue(e),function(e){o=o&&(e.onEvent(t,r)!==false&&!r.defaultPrevented)});return o}let Bt=location.pathname+location.search;function Ut(){const e=ne().querySelector("[hx-history-elt],[data-hx-history-elt]");return e||ne().body}function jt(t,e){if(!B()){return}const n=_t(e);const r=ne().title;const o=window.scrollY;if(Q.config.historyCacheSize<=0){localStorage.removeItem("htmx-history-cache");return}t=U(t);const i=S(localStorage.getItem("htmx-history-cache"))||[];for(let e=0;e<i.length;e++){if(i[e].url===t){i.splice(e,1);break}}const s={url:t,content:n,title:r,scroll:o};he(ne().body,"htmx:historyItemCreated",{item:s,cache:i});i.push(s);while(i.length>Q.config.historyCacheSize){i.shift()}while(i.length>0){try{localStorage.setItem("htmx-history-cache",JSON.stringify(i));break}catch(e){fe(ne().body,"htmx:historyCacheError",{cause:e,cache:i});i.shift()}}}function Vt(t){if(!B()){return null}t=U(t);const n=S(localStorage.getItem("htmx-history-cache"))||[];for(let e=0;e<n.length;e++){if(n[e].url===t){return n[e]}}return null}function _t(e){const t=Q.config.requestClass;const n=e.cloneNode(true);se(x(n,"."+t),function(e){G(e,t)});se(x(n,"[data-disabled-by-htmx]"),function(e){e.removeAttribute("disabled")});return n.innerHTML}function zt(){const e=Ut();const t=Bt||location.pathname+location.search;let n;try{n=ne().querySelector('[hx-history="false" i],[data-hx-history="false" 
i]')}catch(e){n=ne().querySelector('[hx-history="false"],[data-hx-history="false"]')}if(!n){he(ne().body,"htmx:beforeHistorySave",{path:t,historyElt:e});jt(t,e)}if(Q.config.historyEnabled)history.replaceState({htmx:true},ne().title,window.location.href)}function $t(e){if(Q.config.getCacheBusterParam){e=e.replace(/org\.htmx\.cache-buster=[^&]*&?/,"");if(Y(e,"&")||Y(e,"?")){e=e.slice(0,-1)}}if(Q.config.historyEnabled){history.pushState({htmx:true},"",e)}Bt=e}function Jt(e){if(Q.config.historyEnabled)history.replaceState({htmx:true},"",e);Bt=e}function Kt(e){se(e,function(e){e.call(undefined)})}function Gt(o){const e=new XMLHttpRequest;const i={path:o,xhr:e};he(ne().body,"htmx:historyCacheMiss",i);e.open("GET",o,true);e.setRequestHeader("HX-Request","true");e.setRequestHeader("HX-History-Restore-Request","true");e.setRequestHeader("HX-Current-URL",ne().location.href);e.onload=function(){if(this.status>=200&&this.status<400){he(ne().body,"htmx:historyCacheMissLoad",i);const e=P(this.response);const t=e.querySelector("[hx-history-elt],[data-hx-history-elt]")||e;const n=Ut();const r=xn(n);kn(e.title);qe(e);Ve(n,t,r);Te();Kt(r.tasks);Bt=o;he(ne().body,"htmx:historyRestore",{path:o,cacheMiss:true,serverResponse:this.response})}else{fe(ne().body,"htmx:historyCacheMissLoadError",i)}};e.send()}function Wt(e){zt();e=e||location.pathname+location.search;const t=Vt(e);if(t){const n=P(t.content);const r=Ut();const o=xn(r);kn(t.title);qe(n);Ve(r,n,o);Te();Kt(o.tasks);E().setTimeout(function(){window.scrollTo(0,t.scroll)},0);Bt=e;he(ne().body,"htmx:historyRestore",{path:e,item:t})}else{if(Q.config.refreshOnHistoryMiss){window.location.reload(true)}else{Gt(e)}}}function Zt(e){let t=we(e,"hx-indicator");if(t==null){t=[e]}se(t,function(e){const t=ie(e);t.requestCount=(t.requestCount||0)+1;e.classList.add.call(e.classList,Q.config.requestClass)});return t}function Yt(e){let t=we(e,"hx-disabled-elt");if(t==null){t=[]}se(t,function(e){const 
t=ie(e);t.requestCount=(t.requestCount||0)+1;e.setAttribute("disabled","");e.setAttribute("data-disabled-by-htmx","")});return t}function Qt(e,t){se(e.concat(t),function(e){const t=ie(e);t.requestCount=(t.requestCount||1)-1});se(e,function(e){const t=ie(e);if(t.requestCount===0){e.classList.remove.call(e.classList,Q.config.requestClass)}});se(t,function(e){const t=ie(e);if(t.requestCount===0){e.removeAttribute("disabled");e.removeAttribute("data-disabled-by-htmx")}})}function en(t,n){for(let e=0;e<t.length;e++){const r=t[e];if(r.isSameNode(n)){return true}}return false}function tn(e){const t=e;if(t.name===""||t.name==null||t.disabled||g(t,"fieldset[disabled]")){return false}if(t.type==="button"||t.type==="submit"||t.tagName==="image"||t.tagName==="reset"||t.tagName==="file"){return false}if(t.type==="checkbox"||t.type==="radio"){return t.checked}return true}function nn(t,e,n){if(t!=null&&e!=null){if(Array.isArray(e)){e.forEach(function(e){n.append(t,e)})}else{n.append(t,e)}}}function rn(t,n,r){if(t!=null&&n!=null){let e=r.getAll(t);if(Array.isArray(n)){e=e.filter(e=>n.indexOf(e)<0)}else{e=e.filter(e=>e!==n)}r.delete(t);se(e,e=>r.append(t,e))}}function on(t,n,r,o,i){if(o==null||en(t,o)){return}else{t.push(o)}if(tn(o)){const s=ee(o,"name");let e=o.value;if(o instanceof HTMLSelectElement&&o.multiple){e=M(o.querySelectorAll("option:checked")).map(function(e){return e.value})}if(o instanceof HTMLInputElement&&o.files){e=M(o.files)}nn(s,e,n);if(i){sn(o,r)}}if(o instanceof HTMLFormElement){se(o.elements,function(e){if(t.indexOf(e)>=0){rn(e.name,e.value,n)}else{t.push(e)}if(i){sn(e,r)}});new FormData(o).forEach(function(e,t){if(e instanceof File&&e.name===""){return}nn(t,e,n)})}}function sn(e,t){const n=e;if(n.willValidate){he(n,"htmx:validation:validate");if(!n.checkValidity()){t.push({elt:n,message:n.validationMessage,validity:n.validity});he(n,"htmx:validation:failed",{message:n.validationMessage,validity:n.validity})}}}function ln(n,e){for(const t of 
e.keys()){n.delete(t)}e.forEach(function(e,t){n.append(t,e)});return n}function cn(e,t){const n=[];const r=new FormData;const o=new FormData;const i=[];const s=ie(e);if(s.lastButtonClicked&&!le(s.lastButtonClicked)){s.lastButtonClicked=null}let l=e instanceof HTMLFormElement&&e.noValidate!==true||te(e,"hx-validate")==="true";if(s.lastButtonClicked){l=l&&s.lastButtonClicked.formNoValidate!==true}if(t!=="get"){on(n,o,i,g(e,"form"),l)}on(n,r,i,e,l);if(s.lastButtonClicked||e.tagName==="BUTTON"||e.tagName==="INPUT"&&ee(e,"type")==="submit"){const u=s.lastButtonClicked||e;const a=ee(u,"name");nn(a,u.value,o)}const c=we(e,"hx-include");se(c,function(e){on(n,r,i,ue(e),l);if(!h(e,"form")){se(f(e).querySelectorAll(ot),function(e){on(n,r,i,e,l)})}});ln(r,o);return{errors:i,formData:r,values:An(r)}}function un(e,t,n){if(e!==""){e+="&"}if(String(n)==="[object Object]"){n=JSON.stringify(n)}const r=encodeURIComponent(n);e+=encodeURIComponent(t)+"="+r;return e}function an(e){e=qn(e);let n="";e.forEach(function(e,t){n=un(n,t,e)});return n}function fn(e,t,n){const r={"HX-Request":"true","HX-Trigger":ee(e,"id"),"HX-Trigger-Name":ee(e,"name"),"HX-Target":te(t,"id"),"HX-Current-URL":ne().location.href};bn(e,"hx-headers",false,r);if(n!==undefined){r["HX-Prompt"]=n}if(ie(e).boosted){r["HX-Boosted"]="true"}return r}function hn(n,e){const t=re(e,"hx-params");if(t){if(t==="none"){return new FormData}else if(t==="*"){return n}else if(t.indexOf("not ")===0){se(t.slice(4).split(","),function(e){e=e.trim();n.delete(e)});return n}else{const r=new FormData;se(t.split(","),function(t){t=t.trim();if(n.has(t)){n.getAll(t).forEach(function(e){r.append(t,e)})}});return r}}else{return n}}function dn(e){return!!ee(e,"href")&&ee(e,"href").indexOf("#")>=0}function gn(e,t){const n=t||re(e,"hx-swap");const 
r={swapStyle:ie(e).boosted?"innerHTML":Q.config.defaultSwapStyle,swapDelay:Q.config.defaultSwapDelay,settleDelay:Q.config.defaultSettleDelay};if(Q.config.scrollIntoViewOnBoost&&ie(e).boosted&&!dn(e)){r.show="top"}if(n){const s=F(n);if(s.length>0){for(let e=0;e<s.length;e++){const l=s[e];if(l.indexOf("swap:")===0){r.swapDelay=d(l.slice(5))}else if(l.indexOf("settle:")===0){r.settleDelay=d(l.slice(7))}else if(l.indexOf("transition:")===0){r.transition=l.slice(11)==="true"}else if(l.indexOf("ignoreTitle:")===0){r.ignoreTitle=l.slice(12)==="true"}else if(l.indexOf("scroll:")===0){const c=l.slice(7);var o=c.split(":");const u=o.pop();var i=o.length>0?o.join(":"):null;r.scroll=u;r.scrollTarget=i}else if(l.indexOf("show:")===0){const a=l.slice(5);var o=a.split(":");const f=o.pop();var i=o.length>0?o.join(":"):null;r.show=f;r.showTarget=i}else if(l.indexOf("focus-scroll:")===0){const h=l.slice("focus-scroll:".length);r.focusScroll=h=="true"}else if(e==0){r.swapStyle=l}else{O("Unknown modifier in hx-swap: "+l)}}}}return r}function pn(e){return re(e,"hx-encoding")==="multipart/form-data"||h(e,"form")&&ee(e,"enctype")==="multipart/form-data"}function mn(t,n,r){let o=null;Ft(n,function(e){if(o==null){o=e.encodeParameters(t,r,n)}});if(o!=null){return o}else{if(pn(n)){return ln(new FormData,qn(r))}else{return an(r)}}}function xn(e){return{tasks:[],elts:[e]}}function yn(e,t){const n=e[0];const r=e[e.length-1];if(t.scroll){var o=null;if(t.scrollTarget){o=ue(ae(n,t.scrollTarget))}if(t.scroll==="top"&&(n||o)){o=o||n;o.scrollTop=0}if(t.scroll==="bottom"&&(r||o)){o=o||r;o.scrollTop=o.scrollHeight}}if(t.show){var o=null;if(t.showTarget){let e=t.showTarget;if(t.showTarget==="window"){e="body"}o=ue(ae(n,e))}if(t.show==="top"&&(n||o)){o=o||n;o.scrollIntoView({block:"start",behavior:Q.config.scrollBehavior})}if(t.show==="bottom"&&(r||o)){o=o||r;o.scrollIntoView({block:"end",behavior:Q.config.scrollBehavior})}}}function bn(r,e,o,i){if(i==null){i={}}if(r==null){return i}const 
s=te(r,e);if(s){let e=s.trim();let t=o;if(e==="unset"){return null}if(e.indexOf("javascript:")===0){e=e.slice(11);t=true}else if(e.indexOf("js:")===0){e=e.slice(3);t=true}if(e.indexOf("{")!==0){e="{"+e+"}"}let n;if(t){n=vn(r,function(){return Function("return ("+e+")")()},{})}else{n=S(e)}for(const l in n){if(n.hasOwnProperty(l)){if(i[l]==null){i[l]=n[l]}}}}return bn(ue(c(r)),e,o,i)}function vn(e,t,n){if(Q.config.allowEval){return t()}else{fe(e,"htmx:evalDisallowedError");return n}}function wn(e,t){return bn(e,"hx-vars",true,t)}function Sn(e,t){return bn(e,"hx-vals",false,t)}function En(e){return ce(wn(e),Sn(e))}function Cn(t,n,r){if(r!==null){try{t.setRequestHeader(n,r)}catch(e){t.setRequestHeader(n,encodeURIComponent(r));t.setRequestHeader(n+"-URI-AutoEncoded","true")}}}function On(t){if(t.responseURL&&typeof URL!=="undefined"){try{const e=new URL(t.responseURL);return e.pathname+e.search}catch(e){fe(ne().body,"htmx:badResponseUrl",{url:t.responseURL})}}}function R(e,t){return t.test(e.getAllResponseHeaders())}function Rn(t,n,r){t=t.toLowerCase();if(r){if(r instanceof Element||typeof r==="string"){return de(t,n,null,null,{targetOverride:y(r)||ve,returnPromise:true})}else{let e=y(r.target);if(r.target&&!e||r.source&&!e&&!y(r.source)){e=ve}return de(t,n,y(r.source),r.event,{handler:r.handler,headers:r.headers,values:r.values,targetOverride:e,swapOverride:r.swap,select:r.select,returnPromise:true})}}else{return de(t,n,null,null,{returnPromise:true})}}function Hn(e){const t=[];while(e){t.push(e);e=e.parentElement}return t}function Tn(e,t,n){let r;let o;if(typeof URL==="function"){o=new URL(t,document.location.href);const i=document.location.origin;r=i===o.origin}else{o=t;r=l(t,document.location.origin)}if(Q.config.selfRequestsOnly){if(!r){return false}}return he(e,"htmx:validateUrl",ce({url:o,sameHost:r},n))}function qn(e){if(e instanceof FormData)return e;const t=new FormData;for(const n in e){if(e.hasOwnProperty(n)){if(e[n]&&typeof 
e[n].forEach==="function"){e[n].forEach(function(e){t.append(n,e)})}else if(typeof e[n]==="object"&&!(e[n]instanceof Blob)){t.append(n,JSON.stringify(e[n]))}else{t.append(n,e[n])}}}return t}function Ln(r,o,e){return new Proxy(e,{get:function(t,e){if(typeof e==="number")return t[e];if(e==="length")return t.length;if(e==="push"){return function(e){t.push(e);r.append(o,e)}}if(typeof t[e]==="function"){return function(){t[e].apply(t,arguments);r.delete(o);t.forEach(function(e){r.append(o,e)})}}if(t[e]&&t[e].length===1){return t[e][0]}else{return t[e]}},set:function(e,t,n){e[t]=n;r.delete(o);e.forEach(function(e){r.append(o,e)});return true}})}function An(o){return new Proxy(o,{get:function(e,t){if(typeof t==="symbol"){const r=Reflect.get(e,t);if(typeof r==="function"){return function(){return r.apply(o,arguments)}}else{return r}}if(t==="toJSON"){return()=>Object.fromEntries(o)}if(t in e){if(typeof e[t]==="function"){return function(){return o[t].apply(o,arguments)}}else{return e[t]}}const n=o.getAll(t);if(n.length===0){return undefined}else if(n.length===1){return n[0]}else{return Ln(e,t,n)}},set:function(t,n,e){if(typeof n!=="string"){return false}t.delete(n);if(e&&typeof e.forEach==="function"){e.forEach(function(e){t.append(n,e)})}else if(typeof e==="object"&&!(e instanceof Blob)){t.append(n,JSON.stringify(e))}else{t.append(n,e)}return true},deleteProperty:function(e,t){if(typeof t==="string"){e.delete(t)}return true},ownKeys:function(e){return Reflect.ownKeys(Object.fromEntries(e))},getOwnPropertyDescriptor:function(e,t){return Reflect.getOwnPropertyDescriptor(Object.fromEntries(e),t)}})}function de(t,n,r,o,i,D){let s=null;let l=null;i=i!=null?i:{};if(i.returnPromise&&typeof Promise!=="undefined"){var e=new Promise(function(e,t){s=e;l=t})}if(r==null){r=ne().body}const M=i.handler||Dn;const X=i.select||null;if(!le(r)){oe(s);return e}const c=i.targetOverride||ue(Ee(r));if(c==null||c==ve){fe(r,"htmx:targetError",{target:te(r,"hx-target")});oe(l);return e}let 
u=ie(r);const a=u.lastButtonClicked;if(a){const L=ee(a,"formaction");if(L!=null){n=L}const A=ee(a,"formmethod");if(A!=null){if(A.toLowerCase()!=="dialog"){t=A}}}const f=re(r,"hx-confirm");if(D===undefined){const K=function(e){return de(t,n,r,o,i,!!e)};const G={target:c,elt:r,path:n,verb:t,triggeringEvent:o,etc:i,issueRequest:K,question:f};if(he(r,"htmx:confirm",G)===false){oe(s);return e}}let h=r;let d=re(r,"hx-sync");let g=null;let F=false;if(d){const N=d.split(":");const I=N[0].trim();if(I==="this"){h=Se(r,"hx-sync")}else{h=ue(ae(r,I))}d=(N[1]||"drop").trim();u=ie(h);if(d==="drop"&&u.xhr&&u.abortable!==true){oe(s);return e}else if(d==="abort"){if(u.xhr){oe(s);return e}else{F=true}}else if(d==="replace"){he(h,"htmx:abort")}else if(d.indexOf("queue")===0){const W=d.split(" ");g=(W[1]||"last").trim()}}if(u.xhr){if(u.abortable){he(h,"htmx:abort")}else{if(g==null){if(o){const P=ie(o);if(P&&P.triggerSpec&&P.triggerSpec.queue){g=P.triggerSpec.queue}}if(g==null){g="last"}}if(u.queuedRequests==null){u.queuedRequests=[]}if(g==="first"&&u.queuedRequests.length===0){u.queuedRequests.push(function(){de(t,n,r,o,i)})}else if(g==="all"){u.queuedRequests.push(function(){de(t,n,r,o,i)})}else if(g==="last"){u.queuedRequests=[];u.queuedRequests.push(function(){de(t,n,r,o,i)})}oe(s);return e}}const p=new XMLHttpRequest;u.xhr=p;u.abortable=F;const m=function(){u.xhr=null;u.abortable=false;if(u.queuedRequests!=null&&u.queuedRequests.length>0){const e=u.queuedRequests.shift();e()}};const B=re(r,"hx-prompt");if(B){var x=prompt(B);if(x===null||!he(r,"htmx:prompt",{prompt:x,target:c})){oe(s);m();return e}}if(f&&!D){if(!confirm(f)){oe(s);m();return e}}let y=fn(r,c,x);if(t!=="get"&&!pn(r)){y["Content-Type"]="application/x-www-form-urlencoded"}if(i.headers){y=ce(y,i.headers)}const U=cn(r,t);let b=U.errors;const j=U.formData;if(i.values){ln(j,qn(i.values))}const V=qn(En(r));const v=ln(j,V);let 
w=hn(v,r);if(Q.config.getCacheBusterParam&&t==="get"){w.set("org.htmx.cache-buster",ee(c,"id")||"true")}if(n==null||n===""){n=ne().location.href}const S=bn(r,"hx-request");const _=ie(r).boosted;let E=Q.config.methodsThatUseUrlParams.indexOf(t)>=0;const C={boosted:_,useUrlParams:E,formData:w,parameters:An(w),unfilteredFormData:v,unfilteredParameters:An(v),headers:y,target:c,verb:t,errors:b,withCredentials:i.credentials||S.credentials||Q.config.withCredentials,timeout:i.timeout||S.timeout||Q.config.timeout,path:n,triggeringEvent:o};if(!he(r,"htmx:configRequest",C)){oe(s);m();return e}n=C.path;t=C.verb;y=C.headers;w=qn(C.parameters);b=C.errors;E=C.useUrlParams;if(b&&b.length>0){he(r,"htmx:validation:halted",C);oe(s);m();return e}const z=n.split("#");const $=z[0];const O=z[1];let R=n;if(E){R=$;const Z=!w.keys().next().done;if(Z){if(R.indexOf("?")<0){R+="?"}else{R+="&"}R+=an(w);if(O){R+="#"+O}}}if(!Tn(r,R,C)){fe(r,"htmx:invalidPath",C);oe(l);return e}p.open(t.toUpperCase(),R,true);p.overrideMimeType("text/html");p.withCredentials=C.withCredentials;p.timeout=C.timeout;if(S.noHeaders){}else{for(const k in y){if(y.hasOwnProperty(k)){const Y=y[k];Cn(p,k,Y)}}}const H={xhr:p,target:c,requestConfig:C,etc:i,boosted:_,select:X,pathInfo:{requestPath:n,finalRequestPath:R,responsePath:null,anchor:O}};p.onload=function(){try{const t=Hn(r);H.pathInfo.responsePath=On(p);M(r,H);if(H.keepIndicators!==true){Qt(T,q)}he(r,"htmx:afterRequest",H);he(r,"htmx:afterOnLoad",H);if(!le(r)){let e=null;while(t.length>0&&e==null){const n=t.shift();if(le(n)){e=n}}if(e){he(e,"htmx:afterRequest",H);he(e,"htmx:afterOnLoad",H)}}oe(s);m()}catch(e){fe(r,"htmx:onLoadError",ce({error:e},H));throw 
e}};p.onerror=function(){Qt(T,q);fe(r,"htmx:afterRequest",H);fe(r,"htmx:sendError",H);oe(l);m()};p.onabort=function(){Qt(T,q);fe(r,"htmx:afterRequest",H);fe(r,"htmx:sendAbort",H);oe(l);m()};p.ontimeout=function(){Qt(T,q);fe(r,"htmx:afterRequest",H);fe(r,"htmx:timeout",H);oe(l);m()};if(!he(r,"htmx:beforeRequest",H)){oe(s);m();return e}var T=Zt(r);var q=Yt(r);se(["loadstart","loadend","progress","abort"],function(t){se([p,p.upload],function(e){e.addEventListener(t,function(e){he(r,"htmx:xhr:"+t,{lengthComputable:e.lengthComputable,loaded:e.loaded,total:e.total})})})});he(r,"htmx:beforeSend",H);const J=E?null:mn(p,r,w);p.send(J);return e}function Nn(e,t){const n=t.xhr;let r=null;let o=null;if(R(n,/HX-Push:/i)){r=n.getResponseHeader("HX-Push");o="push"}else if(R(n,/HX-Push-Url:/i)){r=n.getResponseHeader("HX-Push-Url");o="push"}else if(R(n,/HX-Replace-Url:/i)){r=n.getResponseHeader("HX-Replace-Url");o="replace"}if(r){if(r==="false"){return{}}else{return{type:o,path:r}}}const i=t.pathInfo.finalRequestPath;const s=t.pathInfo.responsePath;const l=re(e,"hx-push-url");const c=re(e,"hx-replace-url");const u=ie(e).boosted;let a=null;let f=null;if(l){a="push";f=l}else if(c){a="replace";f=c}else if(u){a="push";f=s||i}if(f){if(f==="false"){return{}}if(f==="true"){f=s||i}if(t.pathInfo.anchor&&f.indexOf("#")===-1){f=f+"#"+t.pathInfo.anchor}return{type:a,path:f}}else{return{}}}function In(e,t){var n=new RegExp(e.code);return n.test(t.toString(10))}function Pn(e){for(var t=0;t<Q.config.responseHandling.length;t++){var n=Q.config.responseHandling[t];if(In(n,e.status)){return n}}return{swap:false}}function kn(e){if(e){const t=u("title");if(t){t.innerHTML=e}else{window.document.title=e}}}function Dn(o,i){const s=i.xhr;let l=i.target;const e=i.etc;const c=i.select;if(!he(o,"htmx:beforeOnLoad",i))return;if(R(s,/HX-Trigger:/i)){Je(s,"HX-Trigger",o)}if(R(s,/HX-Location:/i)){zt();let e=s.getResponseHeader("HX-Location");var t;if(e.indexOf("{")===0){t=S(e);e=t.path;delete 
t.path}Rn("get",e,t).then(function(){$t(e)});return}const n=R(s,/HX-Refresh:/i)&&s.getResponseHeader("HX-Refresh")==="true";if(R(s,/HX-Redirect:/i)){i.keepIndicators=true;location.href=s.getResponseHeader("HX-Redirect");n&&location.reload();return}if(n){i.keepIndicators=true;location.reload();return}if(R(s,/HX-Retarget:/i)){if(s.getResponseHeader("HX-Retarget")==="this"){i.target=o}else{i.target=ue(ae(o,s.getResponseHeader("HX-Retarget")))}}const u=Nn(o,i);const r=Pn(s);const a=r.swap;let f=!!r.error;let h=Q.config.ignoreTitle||r.ignoreTitle;let d=r.select;if(r.target){i.target=ue(ae(o,r.target))}var g=e.swapOverride;if(g==null&&r.swapOverride){g=r.swapOverride}if(R(s,/HX-Retarget:/i)){if(s.getResponseHeader("HX-Retarget")==="this"){i.target=o}else{i.target=ue(ae(o,s.getResponseHeader("HX-Retarget")))}}if(R(s,/HX-Reswap:/i)){g=s.getResponseHeader("HX-Reswap")}var p=s.response;var m=ce({shouldSwap:a,serverResponse:p,isError:f,ignoreTitle:h,selectOverride:d,swapOverride:g},i);if(r.event&&!he(l,r.event,m))return;if(!he(l,"htmx:beforeSwap",m))return;l=m.target;p=m.serverResponse;f=m.isError;h=m.ignoreTitle;d=m.selectOverride;g=m.swapOverride;i.target=l;i.failed=f;i.successful=!f;if(m.shouldSwap){if(s.status===286){lt(o)}Ft(o,function(e){p=e.transformResponse(p,s,o)});if(u.type){zt()}var x=gn(o,g);if(!x.hasOwnProperty("ignoreTitle")){x.ignoreTitle=h}l.classList.add(Q.config.swappingClass);let n=null;let r=null;if(c){d=c}if(R(s,/HX-Reselect:/i)){d=s.getResponseHeader("HX-Reselect")}const y=re(o,"hx-select-oob");const b=re(o,"hx-select");let e=function(){try{if(u.type){he(ne().body,"htmx:beforeHistoryUpdate",ce({history:u},i));if(u.type==="push"){$t(u.path);he(ne().body,"htmx:pushedIntoHistory",{path:u.path})}else{Jt(u.path);he(ne().body,"htmx:replacedInHistory",{path:u.path})}}$e(l,p,x,{select:d||b,selectOOB:y,eventInfo:i,anchor:i.pathInfo.anchor,contextElement:o,afterSwapCallback:function(){if(R(s,/HX-Trigger-After-Swap:/i)){let 
e=o;if(!le(o)){e=ne().body}Je(s,"HX-Trigger-After-Swap",e)}},afterSettleCallback:function(){if(R(s,/HX-Trigger-After-Settle:/i)){let e=o;if(!le(o)){e=ne().body}Je(s,"HX-Trigger-After-Settle",e)}oe(n)}})}catch(e){fe(o,"htmx:swapError",i);oe(r);throw e}};let t=Q.config.globalViewTransitions;if(x.hasOwnProperty("transition")){t=x.transition}if(t&&he(o,"htmx:beforeTransition",i)&&typeof Promise!=="undefined"&&document.startViewTransition){const v=new Promise(function(e,t){n=e;r=t});const w=e;e=function(){document.startViewTransition(function(){w();return v})}}if(x.swapDelay>0){E().setTimeout(e,x.swapDelay)}else{e()}}if(f){fe(o,"htmx:responseError",ce({error:"Response Status Error Code "+s.status+" from "+i.pathInfo.requestPath},i))}}const Mn={};function Xn(){return{init:function(e){return null},getSelectors:function(){return null},onEvent:function(e,t){return true},transformResponse:function(e,t,n){return e},isInlineSwap:function(e){return false},handleSwap:function(e,t,n,r){return false},encodeParameters:function(e,t,n){return null}}}function Fn(e,t){if(t.init){t.init(n)}Mn[e]=ce(Xn(),t)}function Bn(e){delete Mn[e]}function Un(e,n,r){if(n==undefined){n=[]}if(e==undefined){return n}if(r==undefined){r=[]}const t=te(e,"hx-ext");if(t){se(t.split(","),function(e){e=e.replace(/ /g,"");if(e.slice(0,7)=="ignore:"){r.push(e.slice(7));return}if(r.indexOf(e)<0){const t=Mn[e];if(t&&n.indexOf(t)<0){n.push(t)}}})}return Un(ue(c(e)),n,r)}var jn=false;ne().addEventListener("DOMContentLoaded",function(){jn=true});function Vn(e){if(jn||ne().readyState==="complete"){e()}else{ne().addEventListener("DOMContentLoaded",e)}}function _n(){if(Q.config.includeIndicatorStyles!==false){const e=Q.config.inlineStyleNonce?` nonce="${Q.config.inlineStyleNonce}"`:"";ne().head.insertAdjacentHTML("beforeend","<style"+e+"> ."+Q.config.indicatorClass+"{opacity:0} ."+Q.config.requestClass+" ."+Q.config.indicatorClass+"{opacity:1; transition: opacity 200ms ease-in;} 
."+Q.config.requestClass+"."+Q.config.indicatorClass+"{opacity:1; transition: opacity 200ms ease-in;} </style>")}}function zn(){const e=ne().querySelector('meta[name="htmx-config"]');if(e){return S(e.content)}else{return null}}function $n(){const e=zn();if(e){Q.config=ce(Q.config,e)}}Vn(function(){$n();_n();let e=ne().body;kt(e);const t=ne().querySelectorAll("[hx-trigger='restored'],[data-hx-trigger='restored']");e.addEventListener("htmx:abort",function(e){const t=e.target;const n=ie(t);if(n&&n.xhr){n.xhr.abort()}});const n=window.onpopstate?window.onpopstate.bind(window):null;window.onpopstate=function(e){if(e.state&&e.state.htmx){Wt();se(t,function(e){he(e,"htmx:restored",{document:ne(),triggerEvent:he})})}else{if(n){n(e)}}};E().setTimeout(function(){he(e,"htmx:load",{});e=null},0)});return Q}();
src/web/static/manifest.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "name": "Research Intelligence",
+   "short_name": "Research",
+   "description": "AI/ML and Security paper triage dashboard",
+   "start_url": "/",
+   "scope": "/",
+   "display": "standalone",
+   "background_color": "#060a13",
+   "theme_color": "#0b1121",
+   "orientation": "any",
+   "categories": ["productivity", "utilities"],
+   "icons": [
+     {
+       "src": "/static/favicon.svg",
+       "type": "image/svg+xml",
+       "sizes": "any",
+       "purpose": "any"
+     },
+     {
+       "src": "/static/favicon-192.png",
+       "type": "image/png",
+       "sizes": "192x192",
+       "purpose": "any"
+     },
+     {
+       "src": "/static/favicon-512.png",
+       "type": "image/png",
+       "sizes": "512x512",
+       "purpose": "any"
+     }
+   ]
+ }
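A quick sanity check of the manifest above might look like the following sketch. It is not part of the commit: `check_manifest` is a hypothetical helper, the list of keys it checks is an illustrative choice rather than a complete rule set, and the assumption that browsers generally want both a 192x192 and a 512x512 icon for installability should be verified against the target browser's documentation.

```python
import json

# Manifest content copied from the diff above.
MANIFEST = json.loads("""
{
  "name": "Research Intelligence",
  "short_name": "Research",
  "description": "AI/ML and Security paper triage dashboard",
  "start_url": "/",
  "scope": "/",
  "display": "standalone",
  "background_color": "#060a13",
  "theme_color": "#0b1121",
  "orientation": "any",
  "categories": ["productivity", "utilities"],
  "icons": [
    {"src": "/static/favicon.svg", "type": "image/svg+xml",
     "sizes": "any", "purpose": "any"},
    {"src": "/static/favicon-192.png", "type": "image/png",
     "sizes": "192x192", "purpose": "any"},
    {"src": "/static/favicon-512.png", "type": "image/png",
     "sizes": "512x512", "purpose": "any"}
  ]
}
""")


def check_manifest(manifest):
    """Return a list of problems found in a web app manifest dict.

    Hypothetical helper: the keys and icon sizes checked here are an
    assumption about what an installable PWA usually needs.
    """
    problems = []
    for key in ("name", "start_url", "display", "icons"):
        if key not in manifest:
            problems.append(f"missing key: {key}")
    # Collect the declared icon sizes and look for the two common ones.
    sizes = {icon.get("sizes") for icon in manifest.get("icons", [])}
    for needed in ("192x192", "512x512"):
        if needed not in sizes:
            problems.append(f"no icon of size {needed}")
    return problems


print(check_manifest(MANIFEST) or "manifest looks OK")
```

A check like this could run in CI so that a renamed or removed icon file entry is caught before the service worker and install prompt silently stop working.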
src/web/static/style.css ADDED
@@ -0,0 +1,1701 @@
+ /* ============================================
+    Research Intelligence — Observatory Theme
+
+    Deep navy dark theme with luminous score
+    indicators and editorial typography.
+    ============================================ */
+
+ /* ─── Custom Properties ─── */
+ :root {
+     --bg-deep: #060a13;
+     --bg: #0b1121;
+     --bg-card: #0f172a;
+     --bg-surface: #1e293b;
+     --bg-hover: #172554;
+
+     --border: rgba(148, 163, 184, 0.08);
+     --border-strong: rgba(148, 163, 184, 0.15);
+     --border-accent: rgba(59, 130, 246, 0.4);
+
+     --text: #f1f5f9;
+     --text-secondary: #cbd5e1;
+     --text-muted: #64748b;
+     --text-dim: #334155;
+
+     --accent: #3b82f6;
+     --accent-hover: #60a5fa;
+     --accent-muted: rgba(59, 130, 246, 0.12);
+
+     --emerald: #10b981;
+     --emerald-glow: rgba(16, 185, 129, 0.2);
+     --amber: #f59e0b;
+     --amber-glow: rgba(245, 158, 11, 0.2);
+     --red: #ef4444;
+     --red-glow: rgba(239, 68, 68, 0.2);
+     --purple: #a78bfa;
+     --purple-glow: rgba(167, 139, 250, 0.15);
+
+     --font-display: system-ui, -apple-system, 'Segoe UI', sans-serif;
+     --font-body: system-ui, -apple-system, 'Segoe UI', Roboto, sans-serif;
+     --font-mono: ui-monospace, 'SF Mono', 'Cascadia Code', Consolas, monospace;
+
+     --radius: 8px;
+     --radius-lg: 12px;
+     --radius-xl: 16px;
+     --radius-full: 9999px;
+
+     --shadow-sm: 0 1px 3px rgba(0, 0, 0, 0.4);
+     --shadow-md: 0 4px 16px rgba(0, 0, 0, 0.4);
+     --shadow-lg: 0 8px 32px rgba(0, 0, 0, 0.5);
+     --shadow-glow: 0 0 20px rgba(59, 130, 246, 0.1);
+
+     --nav-height: 56px;
+ }
+
+ /* ─── Reset ─── */
+ *, *::before, *::after {
+     box-sizing: border-box;
+     margin: 0;
+     padding: 0;
+ }
+
+ * {
+     scrollbar-width: thin;
+     scrollbar-color: var(--bg-surface) transparent;
+ }
+
+ ::-webkit-scrollbar { width: 6px; height: 6px; }
+ ::-webkit-scrollbar-track { background: transparent; }
+ ::-webkit-scrollbar-thumb { background: var(--bg-surface); border-radius: 3px; }
+ ::-webkit-scrollbar-thumb:hover { background: var(--text-dim); }
+
+ ::selection {
+     background: rgba(59, 130, 246, 0.3);
+     color: var(--text);
+ }
+
+ /* ─── Base ─── */
+ body {
+     font-family: var(--font-body);
+     background: var(--bg-deep);
+     background-image:
+         radial-gradient(ellipse 80% 60% at 15% -10%, rgba(59, 130, 246, 0.08) 0%, transparent 60%),
+         radial-gradient(ellipse 60% 50% at 85% 110%, rgba(16, 185, 129, 0.04) 0%, transparent 60%);
+     color: var(--text);
+     line-height: 1.6;
+     min-height: 100vh;
+     font-size: 15px;
+     -webkit-font-smoothing: antialiased;
+     -moz-osx-font-smoothing: grayscale;
+ }
+
+ a {
+     color: var(--accent);
+     text-decoration: none;
+     transition: color 0.15s;
+ }
+
+ a:hover {
+     color: var(--accent-hover);
+ }
+
+ /* ─── Page Loader (HTMX) ─── */
+ .page-loader {
+     position: fixed;
+     top: 0;
+     left: 0;
+     width: 100%;
+     height: 2px;
+     z-index: 10000;
+     overflow: hidden;
+     opacity: 0;
+     transition: opacity 0.15s;
+ }
+
+ .page-loader::after {
+     content: '';
+     position: absolute;
+     inset: 0;
+     background: linear-gradient(90deg, transparent, var(--accent), var(--accent-hover), transparent);
+     transform: translateX(-100%);
+ }
+
+ .htmx-request.page-loader,
+ .htmx-request .page-loader {
+     opacity: 1;
+ }
+
+ .htmx-request.page-loader::after,
+ .htmx-request .page-loader::after {
+     animation: loadSlide 1.2s ease-in-out infinite;
+ }
+
+ /* ─── Navigation ─── */
+ nav {
+     background: rgba(11, 17, 33, 0.82);
+     backdrop-filter: blur(24px) saturate(180%);
+     -webkit-backdrop-filter: blur(24px) saturate(180%);
+     border-bottom: 1px solid var(--border);
+     padding: 0 2rem;
+     display: flex;
+     align-items: center;
+     gap: 2.5rem;
+     position: sticky;
+     top: 0;
+     z-index: 1000;
+     height: var(--nav-height);
+ }
+
+ .logo {
+     font-family: var(--font-display);
+     font-size: 1.15rem;
+     font-weight: 700;
+     color: var(--text);
+     white-space: nowrap;
+     display: flex;
+     align-items: center;
+     gap: 8px;
+     letter-spacing: -0.02em;
+ }
+
+ .logo-dot {
+     width: 7px;
+     height: 7px;
+     background: var(--accent);
+     border-radius: 50%;
+     flex-shrink: 0;
+     box-shadow: 0 0 6px var(--accent), 0 0 14px rgba(59, 130, 246, 0.3);
+     animation: pulse 3s ease-in-out infinite;
+ }
+
+ .nav-links {
+     display: flex;
+     gap: 0.25rem;
+     align-items: center;
+ }
+
+ .nav-links a {
+     color: var(--text-muted);
+     font-size: 0.85rem;
+     font-weight: 500;
+     padding: 0.35rem 0.75rem;
+     border-radius: var(--radius);
+     transition: color 0.15s, background 0.15s;
+     position: relative;
+ }
+
+ .nav-links a:hover {
+     color: var(--text-secondary);
+     background: var(--accent-muted);
+     text-decoration: none;
+ }
+
+ .nav-links a.active {
+     color: var(--text);
+     background: var(--accent-muted);
+ }
+
+ .nav-links a.active::after {
+     content: '';
+     position: absolute;
+     bottom: -1px;
+     left: 0.75rem;
+     right: 0.75rem;
+     height: 2px;
+     background: var(--accent);
+     border-radius: 1px 1px 0 0;
+ }
+
+ /* ─── Layout ─── */
+ .container {
+     max-width: 1280px;
+     margin: 0 auto;
+     padding: 2rem;
+ }
+
+ .page-header {
+     margin-bottom: 2rem;
+ }
+
+ .page-header h1 {
+     font-family: var(--font-display);
+     font-size: 1.75rem;
+     font-weight: 700;
+     letter-spacing: -0.03em;
+     line-height: 1.25;
+ }
+
+ .page-header .subtitle {
+     color: var(--text-muted);
+     font-size: 0.875rem;
+     margin-top: 0.35rem;
+ }
+
+ /* ─── Stats Grid (Dashboard) ─── */
+ .stats-grid {
+     display: grid;
+     grid-template-columns: repeat(4, 1fr);
+     gap: 1rem;
+     margin-bottom: 2.5rem;
+ }
+
+ .stat-card {
+     border: 1px solid var(--border);
+     border-radius: var(--radius-lg);
+     padding: 1.25rem 1.5rem;
+     position: relative;
+     overflow: hidden;
+     animation: fadeSlideUp 0.45s ease-out both;
+ }
+
+ .stat-card::before {
+     content: '';
+     position: absolute;
+     inset: 0;
+     opacity: 0.5;
+     border-radius: inherit;
+     pointer-events: none;
+ }
+
+ .stat-card--blue { background: linear-gradient(145deg, rgba(59,130,246,0.08) 0%, var(--bg-card) 70%); }
+ .stat-card--red { background: linear-gradient(145deg, rgba(239,68,68,0.08) 0%, var(--bg-card) 70%); }
+ .stat-card--purple { background: linear-gradient(145deg, rgba(167,139,250,0.08) 0%, var(--bg-card) 70%); }
+ .stat-card--green { background: linear-gradient(145deg, rgba(16,185,129,0.08) 0%, var(--bg-card) 70%); }
+
+ .stat-card:nth-child(1) { animation-delay: 0s; }
+ .stat-card:nth-child(2) { animation-delay: 0.06s; }
+ .stat-card:nth-child(3) { animation-delay: 0.12s; }
+ .stat-card:nth-child(4) { animation-delay: 0.18s; }
+
+ .stat-card .label {
+     color: var(--text-muted);
+     font-size: 0.75rem;
+     font-weight: 600;
+     text-transform: uppercase;
+     letter-spacing: 0.06em;
+ }
+
+ .stat-card .value {
+     font-family: var(--font-mono);
+     font-size: 1.75rem;
+     font-weight: 700;
+     margin-top: 0.5rem;
+     letter-spacing: -0.02em;
+ }
+
+ .stat-card .value--small {
+     font-size: 0.95rem;
+     font-weight: 500;
+     color: var(--text-secondary);
+ }
+
+ /* ─── Section Headers ─── */
+ .section-header {
+     display: flex;
+     align-items: center;
+     gap: 0.75rem;
+     margin-bottom: 1rem;
+     padding-bottom: 0.75rem;
+     border-bottom: 1px solid var(--border);
+ }
+
+ .section-header h2 {
+     font-family: var(--font-display);
+     font-size: 1.15rem;
+     font-weight: 600;
+     letter-spacing: -0.02em;
+ }
+
+ .section-title {
+     font-family: var(--font-display);
+     font-size: 1.05rem;
+     font-weight: 600;
+     letter-spacing: -0.01em;
+     margin-bottom: 0.75rem;
+     display: flex;
+     align-items: center;
+     gap: 0.5rem;
+ }
+
+ /* ─── Badges ─── */
+ .badge {
+     display: inline-flex;
+     align-items: center;
+     font-size: 0.65rem;
+     font-weight: 700;
+     padding: 0.15rem 0.55rem;
+     border-radius: var(--radius-full);
+     letter-spacing: 0.04em;
+     text-transform: uppercase;
+     white-space: nowrap;
+ }
+
+ .badge--accent { background: var(--accent-muted); color: var(--accent-hover); }
+ .badge--red { background: var(--red-glow); color: var(--red); }
+ .badge--emerald { background: var(--emerald-glow); color: var(--emerald); }
+ .badge--amber { background: var(--amber-glow); color: var(--amber); }
+ .badge--purple { background: var(--purple-glow); color: var(--purple); }
+
+ .badge-code {
+     font-size: 0.65rem;
+     font-weight: 700;
+     padding: 0.12rem 0.45rem;
+     border-radius: var(--radius-full);
+     background: var(--emerald-glow);
+     color: var(--emerald);
+     letter-spacing: 0.03em;
+ }
+
+ .badge-hf {
+     font-size: 0.65rem;
+     font-weight: 700;
+     padding: 0.12rem 0.45rem;
+     border-radius: var(--radius-full);
+     background: var(--amber-glow);
+     color: var(--amber);
+     letter-spacing: 0.03em;
+ }
+
+ .badge-source {
+     font-size: 0.65rem;
+     font-weight: 700;
+     padding: 0.12rem 0.45rem;
+     border-radius: var(--radius-full);
+     background: var(--accent-muted);
+     color: var(--accent);
+     letter-spacing: 0.03em;
+ }
+
+ /* ─── Two-Column Grid ─── */
+ .two-col {
+     display: grid;
+     grid-template-columns: 1fr 1fr;
+     gap: 2rem;
+     margin-bottom: 2rem;
+ }
+
+ .three-col {
+     display: grid;
+     grid-template-columns: repeat(3, 1fr);
+     gap: 1.5rem;
+     margin-bottom: 2rem;
+ }
+
+ .events-auto-grid {
+     display: grid;
+     grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
+     gap: 1.5rem;
+     margin-bottom: 2rem;
+ }
+
+ /* ─── Paper Cards ─── */
+ .paper-card {
+     background: var(--bg-card);
+     border: 1px solid var(--border);
+     border-radius: var(--radius-lg);
+     padding: 1rem 1.25rem;
+     margin-bottom: 0.625rem;
+     transition: border-color 0.2s, box-shadow 0.2s, transform 0.2s;
+     animation: fadeSlideUp 0.4s ease-out both;
+ }
+
+ .paper-card:nth-child(1) { animation-delay: 0.05s; }
+ .paper-card:nth-child(2) { animation-delay: 0.1s; }
+ .paper-card:nth-child(3) { animation-delay: 0.15s; }
+ .paper-card:nth-child(4) { animation-delay: 0.2s; }
+ .paper-card:nth-child(5) { animation-delay: 0.25s; }
+
+ .paper-card:hover {
+     border-color: var(--border-accent);
+     box-shadow: var(--shadow-glow);
+     transform: translateY(-1px);
+ }
+
+ .paper-card .card-top {
+     display: flex;
+     justify-content: space-between;
+     align-items: flex-start;
+     gap: 1rem;
+ }
+
+ .paper-card .rank {
+     color: var(--text-dim);
+     font-family: var(--font-mono);
+     font-size: 0.75rem;
+     font-weight: 600;
+ }
+
+ .paper-card .title {
+     font-weight: 600;
+     font-size: 0.9rem;
+     margin: 0.2rem 0 0.35rem;
+     line-height: 1.45;
+ }
+
+ .paper-card .title a {
+     color: var(--text);
+     transition: color 0.15s;
+ }
+
+ .paper-card .title a:hover {
+     color: var(--accent-hover);
+ }
+
+ .paper-card .meta {
+     display: flex;
+     align-items: center;
+     gap: 0.5rem;
+     font-size: 0.75rem;
+     color: var(--text-muted);
+     flex-wrap: wrap;
+ }
+
+ .paper-card .score-badge {
+     font-family: var(--font-mono);
+     font-size: 1.1rem;
+     font-weight: 700;
+     min-width: 2.5rem;
+     text-align: right;
+     line-height: 1;
+ }
+
+ .paper-card .score-badge.high { color: var(--emerald); text-shadow: 0 0 12px var(--emerald-glow); }
+ .paper-card .score-badge.mid { color: var(--amber); text-shadow: 0 0 12px var(--amber-glow); }
+ .paper-card .score-badge.low { color: var(--red); text-shadow: 0 0 12px var(--red-glow); }
+
+ .paper-card .summary-text {
+     font-size: 0.82rem;
+     color: var(--text-muted);
+     margin-top: 0.5rem;
+     line-height: 1.55;
+     display: -webkit-box;
+     -webkit-line-clamp: 3;
+     -webkit-box-orient: vertical;
+     overflow: hidden;
+ }
+
+ .paper-card .score-mini-track {
+     height: 3px;
+     background: var(--bg-deep);
+     border-radius: 2px;
+     margin-top: 0.75rem;
+     overflow: hidden;
+ }
+
+ .paper-card .score-mini-fill {
+     height: 100%;
+     border-radius: 2px;
+     transition: width 0.6s cubic-bezier(0.16, 1, 0.3, 1);
+ }
+
+ .paper-card .score-mini-fill.high { background: linear-gradient(90deg, #059669, #10b981); }
+ .paper-card .score-mini-fill.mid { background: linear-gradient(90deg, #d97706, #f59e0b); }
+ .paper-card .score-mini-fill.low { background: linear-gradient(90deg, #dc2626, #ef4444); }
+
+ /* ─── Paper Table ─── */
+ .paper-table {
+     width: 100%;
+     border-collapse: separate;
+     border-spacing: 0;
+     font-size: 0.84rem;
+ }
+
+ .paper-table thead th {
+     text-align: left;
+     padding: 0.625rem 0.75rem;
+     color: var(--text-muted);
+     font-weight: 600;
+     font-size: 0.7rem;
+     text-transform: uppercase;
+     letter-spacing: 0.06em;
+     border-bottom: 1px solid var(--border-strong);
+     white-space: nowrap;
+     position: sticky;
+     top: var(--nav-height);
+     background: var(--bg-deep);
+     z-index: 10;
+ }
+
+ .paper-table tbody tr {
+     transition: background 0.1s;
+ }
+
+ .paper-table tbody tr:hover td {
+     background: rgba(59, 130, 246, 0.04);
+ }
+
+ .paper-table td {
+     padding: 0.6rem 0.75rem;
+     border-bottom: 1px solid var(--border);
+     vertical-align: middle;
+ }
+
+ .paper-table .col-rank {
+     width: 2.5rem;
+     text-align: center;
+     color: var(--text-dim);
+     font-family: var(--font-mono);
+     font-size: 0.75rem;
+ }
+
+ .paper-table .col-score {
+     width: 4.5rem;
+     text-align: center;
+     font-family: var(--font-mono);
+     font-weight: 600;
+     cursor: help;
+ }
+
+ .paper-table .col-score.composite {
+     font-weight: 700;
+ }
+
+ .paper-table .col-code {
+     width: 2.5rem;
+     text-align: center;
+ }
+
+ .paper-table .col-summary {
+     color: var(--text-muted);
+     font-size: 0.8rem;
+     max-width: 300px;
+ }
+
+ .paper-table .paper-title-link {
+     color: var(--text);
+     font-weight: 500;
+     transition: color 0.15s;
+ }
+
+ .paper-table .paper-title-link:hover {
+     color: var(--accent-hover);
+ }
+
+ /* Score colors in table */
+ .score-high { color: var(--emerald); }
+ .score-mid { color: var(--amber); }
+ .score-low { color: var(--red); }
+
+ /* ─── Score Visualization ─── */
+ .score-track {
+     height: 6px;
+     background: var(--bg-deep);
+     border-radius: 3px;
+     overflow: hidden;
+     position: relative;
+ }
+
+ .score-track--sm { height: 4px; }
+
+ .score-track--lg { height: 8px; }
+
+ .score-fill {
+     height: 100%;
+     border-radius: 3px;
+     min-width: 2px;
+     animation: fillBar 0.8s cubic-bezier(0.16, 1, 0.3, 1) both;
+ }
+
+ .score-fill.high {
+     background: linear-gradient(90deg, #059669, #34d399);
+     box-shadow: 0 0 10px var(--emerald-glow);
+ }
+
+ .score-fill.mid {
+     background: linear-gradient(90deg, #d97706, #fbbf24);
+     box-shadow: 0 0 10px var(--amber-glow);
+ }
+
+ .score-fill.low {
+     background: linear-gradient(90deg, #dc2626, #f87171);
+     box-shadow: 0 0 10px var(--red-glow);
+ }
+
+ /* ─── Score Detail (Paper Detail Page) ─── */
+ .score-grid {
+     display: grid;
+     grid-template-columns: repeat(auto-fit, minmax(220px, 1fr));
+     gap: 1rem;
+     margin: 1.5rem 0;
+ }
+
+ .score-item {
+     background: var(--bg-card);
+     border: 1px solid var(--border);
+     padding: 1rem 1.25rem;
+     border-radius: var(--radius-lg);
+ }
+
+ .score-item--composite {
+     border-color: var(--border-accent);
+     background: linear-gradient(145deg, rgba(59,130,246,0.06) 0%, var(--bg-card) 60%);
+ }
+
+ .score-item .label {
+     font-size: 0.75rem;
+     font-weight: 600;
+     color: var(--text-muted);
+     text-transform: uppercase;
+     letter-spacing: 0.04em;
+ }
+
+ .score-item .score-value {
+     font-family: var(--font-mono);
+     font-size: 1.5rem;
+     font-weight: 700;
+     margin: 0.4rem 0;
+     display: flex;
+     align-items: baseline;
+     gap: 0.25rem;
+ }
+
+ .score-item .score-value .max {
+     font-size: 0.85rem;
+     color: var(--text-dim);
+     font-weight: 400;
+ }
+
+ .score-item .score-track {
+     margin-top: 0.5rem;
+ }
+
+ /* ─── Paper Detail ─── */
+ .paper-detail {
+     background: var(--bg-card);
+     border: 1px solid var(--border);
+     border-radius: var(--radius-xl);
+     padding: 2rem;
+     animation: fadeSlideUp 0.4s ease-out both;
+ }
+
+ .paper-detail .back-link {
+     font-size: 0.82rem;
+     color: var(--text-muted);
+     display: inline-flex;
+     align-items: center;
+     gap: 0.35rem;
+     transition: color 0.15s;
+ }
+
+ .paper-detail .back-link:hover {
+     color: var(--accent);
+ }
+
+ .paper-detail h1 {
+     font-family: var(--font-display);
+     font-size: 1.5rem;
+     font-weight: 700;
+     letter-spacing: -0.03em;
+     margin-top: 0.75rem;
+     margin-bottom: 0.5rem;
+     line-height: 1.3;
+ }
+
+ .paper-detail .authors {
+     color: var(--text-muted);
+     font-size: 0.875rem;
+     margin-bottom: 1.25rem;
+ }
+
+ .paper-summary {
+     margin: 1.25rem 0;
+     padding: 1rem 1.25rem;
+     background: var(--bg);
+     border-radius: var(--radius);
+     border-left: 3px solid var(--accent);
+     font-size: 0.9rem;
+     line-height: 1.7;
+     color: var(--text-secondary);
+ }
+
+ .paper-reasoning {
+     color: var(--text-muted);
+     font-style: italic;
+     font-size: 0.875rem;
+     margin: 0.75rem 0;
+     line-height: 1.6;
+ }
+
+ .paper-links {
+     display: flex;
+     gap: 0.5rem;
+     margin: 1.25rem 0;
+     flex-wrap: wrap;
+ }
+
+ .paper-links a {
+     padding: 0.4rem 0.85rem;
+     background: var(--bg);
+     border: 1px solid var(--border-strong);
+     border-radius: var(--radius);
+     font-size: 0.82rem;
+     font-weight: 500;
+     color: var(--text-secondary);
+     transition: border-color 0.15s, color 0.15s, background 0.15s;
+ }
+
+ .paper-links a:hover {
+     border-color: var(--accent);
+     color: var(--accent);
+     background: var(--accent-muted);
+     text-decoration: none;
+ }
+
+ .paper-abstract {
+     line-height: 1.75;
+     margin: 1.25rem 0;
+     padding: 1.25rem;
+     background: var(--bg);
+     border-radius: var(--radius-lg);
+     font-size: 0.9rem;
+     color: var(--text-secondary);
+ }
+
+ .paper-abstract strong {
+     color: var(--text);
+     font-size: 0.8rem;
+     text-transform: uppercase;
+     letter-spacing: 0.04em;
+ }
+
+ .paper-meta {
+     margin-top: 1rem;
+     font-size: 0.82rem;
+     color: var(--text-muted);
+ }
+
+ .paper-meta strong {
+     color: var(--text-secondary);
+ }
+
+ .context-block {
+     margin-top: 2rem;
+     padding: 1.25rem;
+     background: var(--bg);
+     border-radius: var(--radius-lg);
+     border: 1px solid var(--border);
+ }
+
+ .context-block .context-label {
+     font-size: 0.75rem;
+     font-weight: 600;
+     color: var(--text-muted);
+     text-transform: uppercase;
+     letter-spacing: 0.04em;
+     margin-bottom: 0.5rem;
+ }
+
+ .context-block pre {
+     font-family: var(--font-mono);
+     font-size: 0.78rem;
+     color: var(--text-muted);
+     white-space: pre-wrap;
+     line-height: 1.65;
+ }
+
+ /* ─── Filter Bar ─── */
+ .filter-bar {
+     background: var(--bg-card);
+     border: 1px solid var(--border);
+     border-radius: var(--radius-lg);
+     padding: 0.75rem 1rem;
+     margin-bottom: 1.25rem;
+ }
+
+ .filter-bar form {
+     display: flex;
+     gap: 0.75rem;
+     align-items: center;
+     flex-wrap: wrap;
+     width: 100%;
+ }
+
+ .filter-bar input[type="search"],
+ .filter-bar input[type="number"],
+ .filter-bar select {
+     background: var(--bg);
+     border: 1px solid var(--border-strong);
+     border-radius: var(--radius);
+     color: var(--text);
+     padding: 0.45rem 0.75rem;
+     font-size: 0.84rem;
+     font-family: var(--font-body);
+     transition: border-color 0.15s, box-shadow 0.15s;
+ }
+
+ .filter-bar input:focus,
+ .filter-bar select:focus {
+     outline: none;
+     border-color: var(--accent);
+     box-shadow: 0 0 0 2px var(--accent-muted);
+ }
+
+ .filter-bar input[type="search"] {
+     flex: 1;
+     min-width: 200px;
+ }
+
+ .filter-bar input[type="number"] {
+     width: 5rem;
+ }
+
+ .filter-bar label {
+     font-size: 0.8rem;
+     color: var(--text-muted);
+     display: flex;
+     align-items: center;
+     gap: 0.35rem;
+     cursor: pointer;
+     white-space: nowrap;
+ }
+
+ .filter-bar input[type="checkbox"] {
+     appearance: none;
+     width: 16px;
+     height: 16px;
+     border: 1.5px solid var(--border-strong);
+     border-radius: 4px;
+     background: var(--bg);
+     cursor: pointer;
+     position: relative;
+     transition: background 0.15s, border-color 0.15s;
+ }
+
+ .filter-bar input[type="checkbox"]:checked {
+     background: var(--accent);
+     border-color: var(--accent);
+ }
+
+ .filter-bar input[type="checkbox"]:checked::after {
+     content: '';
+     position: absolute;
+     left: 4px;
+     top: 1px;
+     width: 5px;
+     height: 9px;
+     border: solid var(--bg-deep);
+     border-width: 0 2px 2px 0;
+     transform: rotate(45deg);
+ }
+
+ .filter-bar input[type="search"]::placeholder {
+     color: var(--text-dim);
+ }
+
+ /* ─── Events ─── */
+ .event-section {
+   margin-bottom: 2rem;
+ }
+
+ .event-section .section-title {
+   font-size: 1rem;
+ }
+
+ .event-card {
+   background: var(--bg-card);
+   border: 1px solid var(--border);
+   border-radius: var(--radius);
+   padding: 0.85rem 1rem;
+   margin-bottom: 0.5rem;
+   border-left: 3px solid transparent;
+   transition: border-color 0.15s, transform 0.15s;
+ }
+
+ .event-card:hover {
+   transform: translateX(2px);
+ }
+
+ .event-card--conference { border-left-color: var(--purple); }
+ .event-card--release { border-left-color: var(--emerald); }
+ .event-card--news { border-left-color: var(--accent); }
+
+ .event-card .event-title {
+   font-weight: 600;
+   font-size: 0.875rem;
+ }
+
+ .event-card .event-title a {
+   color: var(--text);
+   transition: color 0.15s;
+ }
+
+ .event-card .event-title a:hover {
+   color: var(--accent-hover);
+ }
+
+ .event-card .event-meta {
+   font-size: 0.78rem;
+   color: var(--text-muted);
+   margin-top: 0.2rem;
+ }
+
+ .event-card .event-desc {
+   font-size: 0.82rem;
+   color: var(--text-muted);
+   margin-top: 0.35rem;
+   line-height: 1.5;
+ }
+
+ /* ─── Buttons ─── */
+ .btn {
+   display: inline-flex;
+   align-items: center;
+   justify-content: center;
+   gap: 0.4rem;
+   padding: 0.5rem 1rem;
+   border-radius: var(--radius);
+   font-size: 0.84rem;
+   font-weight: 600;
+   font-family: var(--font-body);
+   cursor: pointer;
+   border: 1px solid var(--border-strong);
+   background: var(--bg-card);
+   color: var(--text-secondary);
+   transition: all 0.15s;
+   white-space: nowrap;
+ }
+
+ .btn:hover {
+   background: var(--bg-surface);
+   border-color: var(--text-dim);
+   color: var(--text);
+   text-decoration: none;
+ }
+
+ .btn-primary {
+   background: var(--accent);
+   color: var(--bg-deep);
+   border-color: var(--accent);
+   font-weight: 700;
+ }
+
+ .btn-primary:hover {
+   background: var(--accent-hover);
+   border-color: var(--accent-hover);
+   color: var(--bg-deep);
+   box-shadow: 0 0 16px rgba(59, 130, 246, 0.25);
+ }
+
+ .btn-sm {
+   padding: 0.35rem 0.7rem;
+   font-size: 0.78rem;
+ }
+
+ .btn-ghost {
+   background: transparent;
+   border-color: transparent;
+   color: var(--text-muted);
+ }
+
+ .btn-ghost:hover {
+   background: var(--accent-muted);
+   color: var(--accent);
+   border-color: transparent;
+ }
+
+ /* ─── Action Row ─── */
+ .action-row {
+   display: flex;
+   gap: 0.75rem;
+   margin-top: 1.5rem;
+   flex-wrap: wrap;
+ }
+
+ /* ─── Pagination ─── */
+ .pagination {
+   display: flex;
+   gap: 0.5rem;
+   margin-top: 1.5rem;
+   justify-content: center;
+   align-items: center;
+ }
+
+ .pagination .page-info {
+   color: var(--text-muted);
+   font-size: 0.82rem;
+   font-family: var(--font-mono);
+   padding: 0 0.75rem;
+ }
+
+ /* ─── Empty State ─── */
+ .empty-state {
+   text-align: center;
+   padding: 4rem 2rem;
+   color: var(--text-muted);
+ }
+
+ .empty-state h2 {
+   font-family: var(--font-display);
+   color: var(--text-secondary);
+   margin-bottom: 0.5rem;
+   font-weight: 600;
+ }
+
+ .empty-state p {
+   max-width: 400px;
+   margin: 0 auto;
+   font-size: 0.9rem;
+ }
+
+ /* ─── Status Pills ─── */
+ .status-running {
+   color: var(--amber);
+   position: relative;
+   padding-left: 14px;
+ }
+
+ .status-running::before {
+   content: '';
+   position: absolute;
+   left: 0;
+   top: 50%;
+   transform: translateY(-50%);
+   width: 6px;
+   height: 6px;
+   background: var(--amber);
+   border-radius: 50%;
+   animation: pulse 1.5s ease-in-out infinite;
+ }
+
+ .status-completed {
+   color: var(--emerald);
+   padding-left: 14px;
+   position: relative;
+ }
+
+ .status-completed::before {
+   content: '';
+   position: absolute;
+   left: 0;
+   top: 50%;
+   transform: translateY(-50%);
+   width: 6px;
+   height: 6px;
+   background: var(--emerald);
+   border-radius: 50%;
+ }
+
+ .status-failed {
+   color: var(--red);
+   padding-left: 14px;
+   position: relative;
+ }
+
+ .status-failed::before {
+   content: '';
+   position: absolute;
+   left: 0;
+   top: 50%;
+   transform: translateY(-50%);
+   width: 6px;
+   height: 6px;
+   background: var(--red);
+   border-radius: 50%;
+ }
+
+ /* ─── HTMX ─── */
+ .htmx-indicator {
+   display: none;
+ }
+
+ .htmx-request .htmx-indicator,
+ .htmx-request.htmx-indicator {
+   display: block;
+ }
+
+ .spinner {
+   width: 14px;
+   height: 14px;
+   border: 2px solid var(--border-strong);
+   border-top-color: var(--accent);
+   border-radius: 50%;
+   animation: spin 0.6s linear infinite;
+ }
+
+ /* ─── Code check in table ─── */
+ .code-check {
+   color: var(--emerald);
+   font-size: 0.9rem;
+ }
+
+ .code-dash {
+   color: var(--text-dim);
+ }
+
+ /* ─── Animations ─── */
+ @keyframes fadeSlideUp {
+   from {
+     opacity: 0;
+     transform: translateY(12px);
+   }
+   to {
+     opacity: 1;
+     transform: translateY(0);
+   }
+ }
+
+ @keyframes fillBar {
+   from { width: 0; }
+ }
+
+ @keyframes pulse {
+   0%, 100% { opacity: 1; }
+   50% { opacity: 0.4; }
+ }
+
+ @keyframes spin {
+   to { transform: rotate(360deg); }
+ }
+
+ @keyframes loadSlide {
+   0% { transform: translateX(-100%); }
+   100% { transform: translateX(200%); }
+ }
+
+ @keyframes fadeIn {
+   from { opacity: 0; }
+   to { opacity: 1; }
+ }
+
+ /* ─── Connected Papers ─── */
+ .connected-papers {
+   margin-top: 1rem;
+ }
+
+ .connection-group {
+   margin-bottom: 1.5rem;
+ }
+
+ .connection-group__label {
+   font-size: 0.8rem;
+   font-weight: 600;
+   color: var(--text-muted);
+   text-transform: uppercase;
+   letter-spacing: 0.04em;
+   margin-bottom: 0.5rem;
+   display: flex;
+   align-items: center;
+   gap: 0.5rem;
+ }
+
+ .connection-list {
+   background: var(--bg);
+   border-radius: var(--radius-lg);
+   border: 1px solid var(--border);
+   overflow: hidden;
+ }
+
+ .connection-item {
+   display: flex;
+   align-items: center;
+   gap: 0.75rem;
+   padding: 0.5rem 1rem;
+   border-bottom: 1px solid var(--border);
+   font-size: 0.82rem;
+   transition: background 0.1s;
+ }
+
+ .connection-item:last-child {
+   border-bottom: none;
+ }
+
+ .connection-item:hover {
+   background: rgba(59, 130, 246, 0.04);
+ }
+
+ .connection-item--in-db {
+   background: rgba(16, 185, 129, 0.03);
+ }
+
+ .connection-item--in-db:hover {
+   background: rgba(16, 185, 129, 0.06);
+ }
+
+ .connection-title {
+   flex: 1;
+   min-width: 0;
+   overflow: hidden;
+   text-overflow: ellipsis;
+   white-space: nowrap;
+ }
+
+ .connection-title a {
+   color: var(--text-secondary);
+ }
+
+ .connection-title a:hover {
+   color: var(--accent-hover);
+ }
+
+ .connection-year {
+   color: var(--text-dim);
+   font-family: var(--font-mono);
+   font-size: 0.75rem;
+   flex-shrink: 0;
+ }
+
+ /* ─── Filter select ─── */
+ .filter-bar select {
+   min-width: 120px;
+ }
+
+ /* ─── Focus States ─── */
+ a:focus-visible,
+ button:focus-visible,
+ input:focus-visible,
+ select:focus-visible {
+   outline: 2px solid var(--accent);
+   outline-offset: 2px;
+ }
+
+ /* ─── Responsive ─── */
+ @media (max-width: 1024px) {
+   .stats-grid {
+     grid-template-columns: repeat(2, 1fr);
+   }
+
+   .three-col {
+     grid-template-columns: 1fr 1fr;
+   }
+ }
+
+ @media (max-width: 768px) {
+   :root {
+     --nav-height: 50px;
+   }
+
+   nav {
+     padding: 0 1rem;
+     gap: 1rem;
+     height: var(--nav-height);
+   }
+
+   .logo {
+     font-size: 1rem;
+   }
+
+   .nav-links a {
+     font-size: 0.8rem;
+     padding: 0.3rem 0.5rem;
+   }
+
+   .container {
+     padding: 1.25rem 1rem;
+   }
+
+   .page-header h1 {
+     font-size: 1.35rem;
+   }
+
+   .stats-grid {
+     grid-template-columns: repeat(2, 1fr);
+     gap: 0.75rem;
+   }
+
+   .two-col {
+     grid-template-columns: 1fr;
+     gap: 1.5rem;
+   }
+
+   .three-col {
+     grid-template-columns: 1fr;
+   }
+
+   .paper-table .col-summary { display: none; }
+
+   .filter-bar form {
+     gap: 0.5rem;
+   }
+
+   .filter-bar input[type="search"] {
+     min-width: 150px;
+   }
+
+   .paper-detail {
+     padding: 1.25rem;
+   }
+
+   .score-grid {
+     grid-template-columns: repeat(2, 1fr);
+   }
+ }
+
+ @media (max-width: 480px) {
+   :root {
+     --nav-height: auto;
+   }
+
+   nav {
+     flex-wrap: wrap;
+     height: auto;
+     padding: 0.75rem 1rem;
+   }
+
+   .nav-links {
+     width: 100%;
+     overflow-x: auto;
+     -webkit-overflow-scrolling: touch;
+     padding-bottom: 0.25rem;
+   }
+
+   .stats-grid {
+     grid-template-columns: 1fr 1fr;
+   }
+
+   .stat-card {
+     padding: 0.85rem 1rem;
+   }
+
+   .stat-card .value {
+     font-size: 1.35rem;
+   }
+
+   .paper-table th:nth-child(n+4):nth-child(-n+6),
+   .paper-table td:nth-child(n+4):nth-child(-n+6) {
+     display: none;
+   }
+
+   .score-grid {
+     grid-template-columns: 1fr;
+   }
+
+   .paper-links {
+     flex-direction: column;
+   }
+
+   .paper-links a {
+     text-align: center;
+   }
+
+   .action-row {
+     flex-direction: column;
+   }
+
+   .action-row .btn {
+     width: 100%;
+   }
+ }
+
+ /* ─── Toast notifications ─── */
+ .toast-container {
+   position: fixed;
+   bottom: 1.5rem;
+   right: 1.5rem;
+   z-index: 9999;
+   display: flex;
+   flex-direction: column;
+   gap: 0.5rem;
+ }
+
+ .toast {
+   background: var(--bg-card);
+   border: 1px solid var(--border-strong);
+   border-left: 3px solid var(--accent);
+   border-radius: 6px;
+   padding: 0.75rem 1rem;
+   font-size: 0.85rem;
+   color: var(--text);
+   box-shadow: 0 8px 24px rgba(0, 0, 0, 0.5);
+   animation: toast-in 0.3s ease-out, toast-out 0.3s ease-in 3.7s forwards;
+   max-width: 340px;
+ }
+
+ .toast--success { border-left-color: var(--emerald); }
+ .toast--warning { border-left-color: var(--amber); }
+ .toast--error { border-left-color: var(--red); }
+
+ @keyframes toast-in {
+   from { opacity: 0; transform: translateY(1rem); }
+   to { opacity: 1; transform: translateY(0); }
+ }
+
+ @keyframes toast-out {
+   from { opacity: 1; }
+   to { opacity: 0; transform: translateY(-0.5rem); }
+ }
+
+ /* ─── Signal Buttons ─── */
+ .signal-buttons {
+   display: inline-flex;
+   gap: 2px;
+   align-items: center;
+ }
+
+ .signal-btn {
+   background: transparent;
+   border: 1px solid transparent;
+   border-radius: 4px;
+   color: var(--text-dim);
+   cursor: pointer;
+   font-size: 0.7rem;
+   padding: 2px 4px;
+   line-height: 1;
+   transition: color 0.15s, background 0.15s, border-color 0.15s;
+ }
+
+ .signal-btn:hover {
+   background: var(--bg-surface);
+   border-color: var(--border-strong);
+ }
+
+ .signal-btn--up:hover,
+ .signal-btn--up.active {
+   color: var(--emerald);
+   background: var(--emerald-glow);
+   border-color: rgba(16, 185, 129, 0.3);
+ }
+
+ .signal-btn--down:hover,
+ .signal-btn--down.active {
+   color: var(--red);
+   background: var(--red-glow);
+   border-color: rgba(239, 68, 68, 0.3);
+ }
+
+ .signal-btn--dismiss:hover,
+ .signal-btn--dismiss.active {
+   color: var(--text-muted);
+   background: var(--bg-surface);
+   border-color: var(--border-strong);
+ }
+
+ /* Show signal buttons on row hover (desktop) */
+ .paper-table .col-signals {
+   width: 5rem;
+   text-align: center;
+ }
+
+ .paper-table .col-signals-header {
+   width: 5rem;
+   text-align: center;
+ }
+
+ .paper-table tbody tr .signal-buttons {
+   opacity: 0.3;
+   transition: opacity 0.15s;
+ }
+
+ .paper-table tbody tr:hover .signal-buttons {
+   opacity: 1;
+ }
+
+ /* Always show if a signal is active */
+ .paper-table tbody tr .signal-buttons:has(.active) {
+   opacity: 1;
+ }
+
+ /* ─── Boost Indicators ─── */
+ .boost-arrow {
+   font-size: 0.6rem;
+   margin-left: 2px;
+   vertical-align: middle;
+ }
+
+ .boost-up {
+   color: var(--emerald);
+ }
+
+ .boost-down {
+   color: var(--red);
+ }
+
+ .score-raw {
+   font-size: 0.6rem;
+   color: var(--text-dim);
+   font-weight: 400;
+   margin-top: 1px;
+ }
+
+ .boost-pip {
+   font-size: 0.55rem;
+   line-height: 1;
+ }
+
+ .boost-pip--up {
+   color: var(--emerald);
+ }
+
+ .boost-pip--down {
+   color: var(--red);
+ }
+
+ /* Boost detail in score item */
+ .boost-detail {
+   margin-top: 0.5rem;
+   font-size: 0.75rem;
+   display: flex;
+   align-items: center;
+   gap: 0.35rem;
+   flex-wrap: wrap;
+ }
+
+ .boost-label {
+   color: var(--text-muted);
+ }
+
+ .boost-value {
+   font-family: var(--font-mono);
+   font-weight: 600;
+   font-size: 0.8rem;
+ }
+
+ /* ─── Discovery Badge ─── */
+ .badge--discover {
+   background: var(--purple-glow);
+   color: var(--purple);
+   font-size: 0.55rem;
+   padding: 0.1rem 0.4rem;
+   margin-left: 0.35rem;
+   vertical-align: middle;
+ }
+
+ /* ─── Preference Explanation (Paper Detail) ─── */
+ .pref-explanation {
+   background: var(--bg);
+   border: 1px solid var(--border);
+   border-left: 3px solid var(--purple);
+   border-radius: var(--radius);
+   padding: 0.75rem 1rem;
+   margin: 0.75rem 0;
+ }
+
+ .pref-explanation__label {
+   font-size: 0.7rem;
+   font-weight: 600;
+   text-transform: uppercase;
+   letter-spacing: 0.04em;
+   color: var(--purple);
+   margin-bottom: 0.4rem;
+ }
+
+ .pref-explanation__reasons {
+   display: flex;
+   gap: 0.5rem;
+   flex-wrap: wrap;
+ }
+
+ .pref-reason {
+   font-family: var(--font-mono);
+   font-size: 0.75rem;
+   color: var(--text-secondary);
+   background: var(--bg-card);
+   padding: 0.2rem 0.5rem;
+   border-radius: 4px;
+   border: 1px solid var(--border);
+ }
+
+ /* ─── Cold Start Hint ─── */
+ .cold-start-hint {
+   text-align: center;
+   padding: 0.75rem;
+   color: var(--text-dim);
+   font-size: 0.82rem;
+   border-top: 1px solid var(--border);
+   margin-top: 0.5rem;
+ }
+
+ /* ─── Preferences Page ─── */
+ .pref-groups {
+   display: grid;
+   gap: 1.5rem;
+ }
+
+ .pref-group {
+   background: var(--bg-card);
+   border: 1px solid var(--border);
+   border-radius: var(--radius-lg);
+   padding: 1.25rem;
+ }
+
+ .pref-group .section-header {
+   margin-bottom: 0.75rem;
+   padding-bottom: 0.5rem;
+ }
+
+ .pref-list {
+   display: flex;
+   flex-direction: column;
+   gap: 0.35rem;
+ }
+
+ .pref-item {
+   display: flex;
+   align-items: center;
+   gap: 0.75rem;
+   padding: 0.4rem 0.5rem;
+   border-radius: 4px;
+   transition: background 0.1s;
+ }
+
+ .pref-item:hover {
+   background: rgba(59, 130, 246, 0.04);
+ }
+
+ .pref-item__name {
+   font-size: 0.84rem;
+   color: var(--text);
+   min-width: 160px;
+   overflow: hidden;
+   text-overflow: ellipsis;
+   white-space: nowrap;
+ }
+
+ .pref-item__count {
+   font-family: var(--font-mono);
+   font-size: 0.7rem;
+   color: var(--text-dim);
+   min-width: 2.5rem;
+   text-align: right;
+ }
+
+ .pref-bar-container {
+   flex: 1;
+   height: 6px;
+   background: var(--bg-deep);
+   border-radius: 3px;
+   overflow: hidden;
+   min-width: 60px;
+ }
+
+ .pref-bar {
+   height: 100%;
+   border-radius: 3px;
+   min-width: 2px;
+   transition: width 0.4s ease;
+ }
+
+ .pref-bar--positive {
+   background: linear-gradient(90deg, #059669, #34d399);
+ }
+
+ .pref-bar--negative {
+   background: linear-gradient(90deg, #dc2626, #f87171);
+   float: right;
+ }
+
+ .pref-item__value {
+   font-family: var(--font-mono);
+   font-size: 0.78rem;
+   font-weight: 600;
+   min-width: 3.5rem;
+   text-align: right;
+ }
+
+ .pref-positive {
+   color: var(--emerald);
+ }
+
+ .pref-negative {
+   color: var(--red);
+ }
+
+ /* Responsive: preferences page */
+ @media (max-width: 768px) {
+   .pref-item__name {
+     min-width: 100px;
+   }
+
+   .pref-bar-container {
+     min-width: 40px;
+   }
+ }
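
Review note: the 480px breakpoint in `style.css` hides a run of table columns with the chained selector `:nth-child(n+4):nth-child(-n+6)`. As a rough illustration of which positions such a pair selects, here is a small sketch that evaluates `An+B` against 1-based child indices. `matchesNth` and `hiddenColumns` are hypothetical helpers written for this note and are not part of the commit; which actual columns sit at those positions depends on the table markup, which is not shown here.

```javascript
// Hypothetical helper: does the 1-based child index match :nth-child(An+B)?
// CSS matches when a*n + b === index for some integer n >= 0.
function matchesNth(a, b, index) {
  if (a === 0) return index === b;
  const n = (index - b) / a; // solve a*n + b = index for n
  return Number.isInteger(n) && n >= 0;
}

// Positions hidden by :nth-child(n+4):nth-child(-n+6) in a row of `count` cells.
// Both pseudo-classes must match, so the result is the overlap of the two ranges.
function hiddenColumns(count) {
  const hidden = [];
  for (let i = 1; i <= count; i++) {
    if (matchesNth(1, 4, i) && matchesNth(-1, 6, i)) hidden.push(i);
  }
  return hidden;
}

console.log(hiddenColumns(8));
```

`n+4` keeps every position from the fourth onward and `-n+6` keeps every position up to the sixth, so the chain selects the closed range between them.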
src/web/static/sw.js ADDED
@@ -0,0 +1,79 @@
+ const CACHE_NAME = 'ri-v4';
+ const STATIC_ASSETS = [
+   '/static/style.css',
+   '/static/htmx.min.js',
+   '/static/favicon.svg',
+   '/static/manifest.json',
+ ];
+
+ // Install: pre-cache static assets
+ self.addEventListener('install', (event) => {
+   event.waitUntil(
+     caches.open(CACHE_NAME).then((cache) => cache.addAll(STATIC_ASSETS))
+   );
+   self.skipWaiting();
+ });
+
+ // Activate: clean up old caches
+ self.addEventListener('activate', (event) => {
+   event.waitUntil(
+     caches.keys().then((keys) =>
+       Promise.all(keys.filter((k) => k !== CACHE_NAME).map((k) => caches.delete(k)))
+     )
+   );
+   self.clients.claim();
+ });
+
+ // Fetch: network-first for pages, cache-first for static assets
+ self.addEventListener('fetch', (event) => {
+   const url = new URL(event.request.url);
+   if (event.request.method !== 'GET') return; // cache.put() rejects non-GET requests; let the browser handle them
+   // Static assets: cache-first
+   if (url.pathname.startsWith('/static/')) {
+     event.respondWith(
+       caches.match(event.request).then((cached) => {
+         if (cached) return cached;
+         return fetch(event.request).then((response) => {
+           const clone = response.clone();
+           caches.open(CACHE_NAME).then((cache) => cache.put(event.request, clone));
+           return response;
+         });
+       })
+     );
+     return;
+   }
+
+   // Google Fonts: cache-first
+   if (url.hostname.includes('fonts.googleapis.com') || url.hostname.includes('fonts.gstatic.com')) {
+     event.respondWith(
+       caches.match(event.request).then((cached) => {
+         if (cached) return cached;
+         return fetch(event.request).then((response) => {
+           const clone = response.clone();
+           caches.open(CACHE_NAME).then((cache) => cache.put(event.request, clone));
+           return response;
+         });
+       })
+     );
+     return;
+   }
+
+   // HTML pages: network-first with cache fallback
+   if (event.request.mode === 'navigate' || event.request.headers.get('accept')?.includes('text/html')) {
+     event.respondWith(
+       fetch(event.request)
+         .then((response) => {
+           const clone = response.clone();
+           caches.open(CACHE_NAME).then((cache) => cache.put(event.request, clone));
+           return response;
+         })
+         .catch(() => caches.match(event.request).then((cached) => cached || caches.match('/')))
+     );
+     return;
+   }
+
+   // Everything else: network with cache fallback
+   event.respondWith(
+     fetch(event.request).catch(() => caches.match(event.request))
+   );
+ });
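
Review note: the fetch handler in `sw.js` routes each request to one of three strategies, checked in a fixed order. A minimal sketch of that routing as a pure function may make the order easier to reason about and to test outside a service worker. `pickStrategy` and the strategy names it returns are hypothetical, introduced only for this note; the URL checks are copied from the handler.

```javascript
// Hypothetical helper (not in sw.js): mirrors the branch order of the fetch handler.
function pickStrategy(requestUrl, mode, acceptHeader) {
  const url = new URL(requestUrl);

  // Anything under /static/ is served cache-first.
  if (url.pathname.startsWith('/static/')) return 'cache-first';

  // Google Fonts stylesheets and font files are also cache-first.
  if (url.hostname.includes('fonts.googleapis.com') || url.hostname.includes('fonts.gstatic.com')) {
    return 'cache-first';
  }

  // Navigations and requests that accept HTML try the network first, then the cache.
  if (mode === 'navigate' || (acceptHeader || '').includes('text/html')) {
    return 'network-first';
  }

  // Everything else: network, with a cache lookup only if the network fails.
  return 'network-with-cache-fallback';
}

console.log(pickStrategy('https://example.test/papers/aiml', 'navigate', 'text/html'));
```

Because the `/static/` check comes first, a static file requested with an HTML `Accept` header is still treated as cache-first.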
src/web/templates/base.html ADDED
@@ -0,0 +1,60 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+   <meta charset="utf-8">
+   <meta name="viewport" content="width=device-width, initial-scale=1">
+   <title>{% block title %}Research Intelligence{% endblock %}</title>
+   <meta name="description" content="Research paper triage — AI/ML and Security">
+   <meta name="theme-color" content="#0b1121">
+   <meta name="apple-mobile-web-app-capable" content="yes">
+   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
+   <meta name="apple-mobile-web-app-title" content="Research">
+   <link rel="manifest" href="/static/manifest.json">
+   <link rel="icon" href="/static/favicon.svg" type="image/svg+xml">
+   <link rel="apple-touch-icon" href="/static/favicon-192.png">
+   <link rel="stylesheet" href="/static/style.css">
+   <script src="/static/htmx.min.js"></script>
+ </head>
+ <body>
+   <div class="page-loader htmx-indicator" id="page-loader"></div>
+   <nav>
+     <a href="/" class="logo" style="text-decoration:none">
+       <span class="logo-dot"></span>
+       Research Intelligence
+     </a>
+     <div class="nav-links">
+       <a href="/" class="{% if active == 'dashboard' %}active{% endif %}">Dashboard</a>
+       <a href="/papers/aiml" class="{% if active == 'aiml' %}active{% endif %}">AI / ML</a>
+       <a href="/papers/security" class="{% if active == 'security' %}active{% endif %}">Security</a>
+       <a href="/github" class="{% if active == 'github' %}active{% endif %}">GitHub</a>
+       <a href="/events" class="{% if active == 'events' %}active{% endif %}">Events</a>
+       <a href="/weeks" class="{% if active == 'weeks' %}active{% endif %}">Archive</a>
+       <a href="/preferences" class="{% if active == 'preferences' %}active{% endif %}" title="Preferences">&#9881;</a>
+     </div>
+   </nav>
+   <div class="container">
+     {% block content %}{% endblock %}
+   </div>
+   <div class="toast-container" id="toasts"></div>
+   <script>
+     if ('serviceWorker' in navigator) {
+       navigator.serviceWorker.register('/sw.js');
+     }
+     // Toast helper
+     function showToast(msg, type) {
+       var c = document.getElementById('toasts');
+       var t = document.createElement('div');
+       t.className = 'toast' + (type ? ' toast--' + type : '');
+       t.textContent = msg;
+       c.appendChild(t);
+       setTimeout(function() { t.remove(); }, 4000);
+     }
+     // Scroll to top on HTMX content swap (pagination)
+     document.body.addEventListener('htmx:afterSwap', function(e) {
+       if (e.detail.target.id === 'paper-results' || e.detail.target.id === 'gh-results') {
+         e.detail.target.scrollIntoView({behavior: 'smooth', block: 'start'});
+       }
+     });
+   </script>
+ </body>
+ </html>
src/web/templates/dashboard.html ADDED
@@ -0,0 +1,135 @@
+ {% extends "base.html" %}
+ {% block title %}Dashboard — Research Intelligence{% endblock %}
+ {% block content %}
+ <div class="page-header">
+   <h1>Week of {{ week_label }}</h1>
+   <div class="subtitle">Research triage overview</div>
+ </div>
+
+ {% if show_seed_banner is defined and show_seed_banner %}
+ <div style="background:linear-gradient(135deg, rgba(167,139,250,0.08), rgba(59,130,246,0.06)); border:1px solid var(--border); border-left:3px solid var(--purple); border-radius:var(--radius-lg); padding:1rem 1.25rem; margin-bottom:1.5rem; display:flex; justify-content:space-between; align-items:center; flex-wrap:wrap; gap:0.75rem">
+   <div>
+     <div style="font-weight:600; font-size:0.9rem">New here?</div>
+     <div style="font-size:0.82rem; color:var(--text-muted)">Pick some papers you like to personalize your feed.</div>
+   </div>
+   <a href="/seed-preferences" class="btn btn-sm" style="border-color:var(--purple); color:var(--purple)">Pick Papers</a>
+ </div>
+ {% endif %}
+
+ <div class="stats-grid">
+   <div class="stat-card stat-card--blue">
+     <div class="label">AI/ML Papers</div>
+     <div class="value">{{ aiml_count }}</div>
+   </div>
+   <div class="stat-card stat-card--red">
+     <div class="label">Security Papers</div>
+     <div class="value">{{ security_count }}</div>
+   </div>
+   <div class="stat-card stat-card--green">
+     <div class="label">GitHub Projects</div>
+     <div class="value">{{ github_count }}</div>
+   </div>
+   <div class="stat-card stat-card--purple">
+     <div class="label">Events Tracked</div>
+     <div class="value">{{ event_count }}</div>
+   </div>
+   <div class="stat-card stat-card--purple" style="opacity:0.8">
+     <div class="label">Last Run</div>
+     <div class="value value--small">{{ (last_run or "") | replace("T", " ") or "Never" }}</div>
+   </div>
+ </div>
+
+ <div class="two-col">
+   <div>
+     <div class="section-header">
+       <h2>AI/ML Top 5</h2>
+       <span class="badge badge--accent">AI/ML</span>
+     </div>
+     {% if aiml_top %}
+       {% for p in aiml_top %}
+         {% set rank = loop.index %}
+         {% include "partials/paper_card.html" %}
+       {% endfor %}
+     {% else %}
+       <div class="empty-state" style="padding:2rem">
+         <p>No AI/ML papers scored yet.</p>
+       </div>
+     {% endif %}
+   </div>
+   <div>
+     <div class="section-header">
+       <h2>Security Top 5</h2>
+       <span class="badge badge--red">Security</span>
+     </div>
+     {% if security_top %}
+       {% for p in security_top %}
+         {% set rank = loop.index %}
+         {% include "partials/paper_card.html" %}
+       {% endfor %}
+     {% else %}
+       <div class="empty-state" style="padding:2rem">
+         <p>No security papers scored yet.</p>
+       </div>
+     {% endif %}
+   </div>
+ </div>
+
+ {% if events_grouped %}
+ <div class="section-header" style="margin-top:0.5rem">
+   <h2>Events This Week</h2>
+ </div>
+ <div class="events-auto-grid">
+   {% for cat, cat_events in events_grouped.items() %}
+   <div class="event-section">
+     <div class="section-title" style="font-size:0.95rem">{{ cat | title }}</div>
+     {% for e in cat_events[:5] %}
+     <div class="event-card event-card--{{ cat }}">
+       <div class="event-title">
+         {% if e.url %}<a href="{{ e.url }}">{{ e.title }}</a>{% else %}{{ e.title }}{% endif %}
+       </div>
+       <div class="event-meta">{{ e.source }}{% if e.event_date %} · {% if cat == 'conference' %}<span style="color:var(--amber)">{{ e.event_date | format_date('medium') }}</span>{% else %}{{ e.event_date | format_date('medium') }}{% endif %}{% endif %}</div>
+     </div>
+     {% endfor %}
+   </div>
+   {% endfor %}
+ </div>
+ {% endif %}
+
+ <div class="action-row">
+   <button type="button" class="btn btn-primary" id="btn-aiml"
+     {% if 'aiml' in running_pipelines %}disabled style="opacity:0.6"{% endif %}
+     onclick="triggerPipeline('aiml', this)">
+     {% if 'aiml' in running_pipelines %}Running...{% else %}Run AI/ML Pipeline{% endif %}
+   </button>
+   <button type="button" class="btn btn-primary" id="btn-security"
+     {% if 'security' in running_pipelines %}disabled style="opacity:0.6"{% endif %}
+     onclick="triggerPipeline('security', this)">
+     {% if 'security' in running_pipelines %}Running...{% else %}Run Security Pipeline{% endif %}
+   </button>
+   <button type="button" class="btn" id="btn-github"
+     {% if 'github' in running_pipelines %}disabled style="opacity:0.6"{% endif %}
+     onclick="triggerPipeline('github', this)">
+     {% if 'github' in running_pipelines %}Running...{% else %}Run GitHub{% endif %}
+   </button>
+   <button type="button" class="btn" id="btn-events"
+     {% if 'events' in running_pipelines %}disabled style="opacity:0.6"{% endif %}
+     onclick="triggerPipeline('events', this)">
+     {% if 'events' in running_pipelines %}Running...{% else %}Run Events{% endif %}
+   </button>
+ </div>
+ <script>
+ function triggerPipeline(domain, btn) {
+   var label = btn.textContent.trim(); // remembered so the button can be restored on failure
+   btn.disabled = true;
+   btn.textContent = 'Starting...';
+   fetch('/run/' + domain, {method: 'POST'}).then(function(resp) {
+     // fetch() rejects only on network errors, so route HTTP error statuses to catch()
+     if (!resp.ok) throw new Error('HTTP ' + resp.status);
+     showToast(domain.charAt(0).toUpperCase() + domain.slice(1) + ' pipeline started', 'success');
+     btn.textContent = 'Running...';
+     btn.style.opacity = '0.6';
+   }).catch(function() {
+     showToast('Failed to start pipeline', 'error');
+     btn.disabled = false;
+     btn.textContent = label;
+   });
+ }
+ </script>
+ {% endblock %}
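
Review note: the four buttons in `dashboard.html` do not share one label pattern ("Run AI/ML Pipeline" and "Run Security Pipeline" versus "Run GitHub" and "Run Events"), so a label cannot be rebuilt from the domain string alone. One possible sketch is a lookup table keyed by domain. `PIPELINE_LABELS` and `pipelineLabel` are hypothetical names that do not appear in the commit; the label strings are copied from the template.

```javascript
// Hypothetical lookup (not in the commit): idle button labels as they appear in dashboard.html.
const PIPELINE_LABELS = {
  aiml: 'Run AI/ML Pipeline',
  security: 'Run Security Pipeline',
  github: 'Run GitHub',
  events: 'Run Events',
};

// Returns the text a pipeline button should show for a given domain and state.
function pipelineLabel(domain, running) {
  if (running) return 'Running...';
  // Guard against inherited keys such as "constructor" before trusting the table.
  if (Object.prototype.hasOwnProperty.call(PIPELINE_LABELS, domain)) {
    return PIPELINE_LABELS[domain];
  }
  // Unknown domain: fall back to a generic label (an assumption, not template behaviour).
  return 'Run ' + domain;
}

console.log(pipelineLabel('github', false));
```

A table like this keeps the server-rendered text and any client-side reset in agreement, at the cost of one more place to update when a pipeline is added.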
src/web/templates/events.html ADDED
@@ -0,0 +1,91 @@
+ {% extends "base.html" %}
+ {% block title %}Events — Research Intelligence{% endblock %}
+ {% block content %}
+ <div class="page-header">
+   <h1>Events</h1>
+   <div class="subtitle">{{ total }} events tracked</div>
+ </div>
+
+ {% if deadlines %}
+ <div class="event-section">
+   <div class="section-header">
+     <h2>Upcoming Deadlines</h2>
+     <span class="badge badge--purple">{{ deadlines | length }}</span>
+   </div>
+   {% for e in deadlines %}
+   <div class="event-card event-card--conference">
+     <div style="display:flex; justify-content:space-between; align-items:flex-start; gap:1rem">
+       <div style="min-width:0">
+         <div class="event-title">
+           {% if e.url %}<a href="{{ e.url }}">{{ e.title }}</a>{% else %}{{ e.title }}{% endif %}
+         </div>
+         <div class="event-meta">
+           {{ e.source }}
+           {% if e.event_date %}· <strong style="color:var(--amber)">Deadline: {{ e.event_date | format_date('medium') }}</strong>{% endif %}
+         </div>
+         {% if e.description %}<div class="event-desc">{{ e.description[:250] }}{% if e.description | length > 250 %}&hellip;{% endif %}</div>{% endif %}
+       </div>
+       {% if e.event_date %}
+       <div style="flex-shrink:0; text-align:right; font-family:var(--font-mono); font-size:0.8rem; color:var(--text-muted); white-space:nowrap">
+         {{ e.event_date | format_date }}
+       </div>
+       {% endif %}
+     </div>
+   </div>
+   {% endfor %}
+ </div>
+ {% endif %}
+
+ {% if releases %}
+ <div class="event-section">
+   <div class="section-header">
+     <h2>Notable Releases</h2>
+     <span class="badge badge--emerald">{{ releases | length }}</span>
+   </div>
+   {% for e in releases %}
+   <div class="event-card event-card--release">
+     <div class="event-title">
+       {% if e.url %}<a href="{{ e.url }}">{{ e.title }}</a>{% else %}{{ e.title }}{% endif %}
+     </div>
+     <div class="event-meta">
+       {{ e.source }}
+       {% if e.event_date %}· {{ e.event_date | format_date }}{% endif %}
+       {% if e.relevance_score %}· Relevance: {{ e.relevance_score }}{% endif %}
+     </div>
+     {% if e.description %}<div class="event-desc">{{ e.description[:200] }}{% if e.description | length > 200 %}&hellip;{% endif %}</div>{% endif %}
+   </div>
+   {% endfor %}
+ </div>
+ {% endif %}
+
+ {% if news %}
+ <div class="event-section">
+   <div class="section-header">
+     <h2>News</h2>
+     <span class="badge badge--accent">{{ news | length }}</span>
+   </div>
+   {% for e in news %}
+   <div class="event-card event-card--news">
+     <div class="event-title">
+       {% if e.url %}<a href="{{ e.url }}">{{ e.title }}</a>{% else %}{{ e.title }}{% endif %}
+     </div>
+     <div class="event-meta">
+       {{ e.source }}
+       {% if e.event_date %}· {{ e.event_date | format_date('medium') }}{% endif %}
+     </div>
+     {% if e.description %}<div class="event-desc">{{ e.description[:200] }}{% if e.description | length > 200 %}&hellip;{% endif %}</div>{% endif %}
+   </div>
+   {% endfor %}
+ </div>
+ {% endif %}
+
+ {% if not deadlines and not releases and not news %}
+ <div class="empty-state">
+   <h2>No events yet</h2>
+   <p>Run the events pipeline to populate this page.</p>
+   <form method="post" action="/run/events" style="margin-top:1rem">
+     <button type="submit" class="btn btn-primary">Run Events Pipeline</button>
+   </form>
+ </div>
+ {% endif %}
+ {% endblock %}
src/web/templates/github.html ADDED
@@ -0,0 +1,43 @@
1
+ {% extends "base.html" %}
2
+ {% block title %}GitHub Projects — Research Intelligence{% endblock %}
3
+ {% block content %}
4
+ <div class="page-header">
5
+ <div style="display:flex; justify-content:space-between; align-items:flex-start; flex-wrap:wrap; gap:0.5rem">
6
+ <div>
7
+ <h1>GitHub Projects</h1>
8
+ <div class="subtitle">{{ total }} trending projects{% if run.date_start %} · {{ run.date_start }} to {{ run.date_end }}{% endif %}</div>
9
+ </div>
10
+ <button type="button" class="btn btn-sm btn-ghost" onclick="var b=this;b.disabled=true;b.textContent='Running...';fetch('/run/github',{method:'POST'}).then(function(r){if(!r.ok){throw new Error('HTTP '+r.status)}showToast('GitHub pipeline started','success')}).catch(function(){b.disabled=false;b.textContent='Refresh Projects';showToast('Pipeline failed','error')})">Refresh Projects</button>
11
+ </div>
12
+ </div>
13
+
14
+ <div class="filter-bar">
15
+ <form hx-get="/github" hx-target="#gh-results" hx-push-url="true" hx-indicator="#page-loader">
16
+ <input type="search" name="search" value="{{ search or '' }}" placeholder="Search repos...">
17
+ {% if available_languages %}
18
+ <select name="language">
19
+ <option value="">All languages</option>
20
+ {% for lang in available_languages %}
21
+ <option value="{{ lang }}" {% if language == lang %}selected{% endif %}>{{ lang }}</option>
22
+ {% endfor %}
23
+ </select>
24
+ {% endif %}
25
+ <select name="domain">
26
+ <option value="">All domains</option>
27
+ <option value="aiml" {% if domain_filter == 'aiml' %}selected{% endif %}>AI/ML</option>
28
+ <option value="security" {% if domain_filter == 'security' %}selected{% endif %}>Security</option>
29
+ </select>
30
+ <select name="sort">
31
+ <option value="score" {% if not sort or sort == 'score' %}selected{% endif %}>Sort: Score</option>
32
+ <option value="stars" {% if sort == 'stars' %}selected{% endif %}>Sort: Stars</option>
33
+ <option value="forks" {% if sort == 'forks' %}selected{% endif %}>Sort: Forks</option>
34
+ <option value="name" {% if sort == 'name' %}selected{% endif %}>Sort: Name</option>
35
+ </select>
36
+ <button type="submit" class="btn btn-primary btn-sm">Filter</button>
37
+ </form>
38
+ </div>
39
+
40
+ <div id="gh-results">
41
+ {% include "partials/github_results.html" %}
42
+ </div>
43
+ {% endblock %}
src/web/templates/paper_detail.html ADDED
@@ -0,0 +1,205 @@
1
+ {% extends "base.html" %}
2
+ {% block title %}{{ paper.title }} — Research Intelligence{% endblock %}
3
+ {% block content %}
4
+ <div class="paper-detail">
5
+ <div style="display:flex; justify-content:space-between; align-items:center; flex-wrap:wrap; gap:0.5rem">
6
+ <a href="/papers/{{ domain }}" class="back-link">&larr; Back to {{ domain_label }} papers</a>
7
+ <div style="display:flex; gap:0.5rem; align-items:center">
8
+ {% set paper_id = paper.id %}
9
+ {% set user_signal = paper.user_signal if paper.user_signal is defined else None %}
10
+ {% include "partials/signal_buttons.html" %}
11
+ </div>
12
+ </div>
13
+
14
+ <h1>{{ paper.title }}</h1>
15
+
16
+ <div class="authors">
17
+ {% if paper.authors is string %}{{ paper.authors }}{% else %}{{ paper.authors | join(", ") }}{% endif %}
18
+ </div>
19
+
20
+ {% if paper.topics is iterable and paper.topics is not string and paper.topics | length > 0 %}
21
+ <div style="display:flex; gap:0.35rem; margin-bottom:1rem; flex-wrap:wrap">
22
+ {% for t in paper.topics %}
23
+ <span class="badge badge--accent">{{ t }}</span>
24
+ {% endfor %}
25
+ {% if paper.is_discovery is defined and paper.is_discovery %}
26
+ <span class="badge badge--discover">DISCOVER</span>
27
+ {% endif %}
28
+ </div>
29
+ {% endif %}
30
+
31
+ {% if paper.composite is not none %}
32
+ <div class="score-grid">
33
+ {% set axes = [
34
+ (axis_labels[0], paper.score_axis_1),
35
+ (axis_labels[1], paper.score_axis_2),
36
+ (axis_labels[2], paper.score_axis_3)
37
+ ] %}
38
+ {% for label, val in axes %}
39
+ {% set pct = ((val or 0) | float / 10 * 100) | round(0) | int %}
40
+ {% set level = 'high' if pct >= 65 else ('mid' if pct >= 40 else 'low') %}
41
+ <div class="score-item">
42
+ <div class="label">{{ label }}</div>
43
+ <div class="score-value score-{{ level }}">{% if val is defined and val is not none %}{{ val }}{% else %}&mdash;{% endif %}<span class="max">/10</span></div>
44
+ <div class="score-track score-track--lg">
45
+ <div class="score-fill {{ level }}" style="width:{{ pct }}%"></div>
46
+ </div>
47
+ </div>
48
+ {% endfor %}
49
+ {% set comp_pct = ((paper.composite or 0) | float / 10 * 100) | round(0) | int %}
50
+ {% set comp_level = 'high' if comp_pct >= 65 else ('mid' if comp_pct >= 40 else 'low') %}
51
+ <div class="score-item score-item--composite">
52
+ <div class="label">Composite</div>
53
+ <div class="score-value score-{{ comp_level }}">{{ paper.composite }}<span class="max">/10</span></div>
54
+ <div class="score-track score-track--lg">
55
+ <div class="score-fill {{ comp_level }}" style="width:{{ comp_pct }}%"></div>
56
+ </div>
57
+ {% if paper.preference_boost is defined and paper.preference_boost != 0 %}
58
+ <div class="boost-detail">
59
+ <span class="boost-label">Preference boost:</span>
60
+ <span class="boost-value {% if paper.preference_boost > 0 %}boost-up{% else %}boost-down{% endif %}">{{ '%+.2f'|format(paper.preference_boost) }}</span>
61
+ <span class="boost-label">&rarr; Adjusted:</span>
62
+ <span class="boost-value">{{ paper.adjusted_score }}</span>
63
+ </div>
64
+ {% endif %}
65
+ </div>
66
+ </div>
67
+
68
+ {% if paper.boost_reasons is defined and paper.boost_reasons | length > 0 %}
69
+ <div class="pref-explanation">
70
+ <div class="pref-explanation__label">Preference Signals</div>
71
+ <div class="pref-explanation__reasons">
72
+ {% for reason in paper.boost_reasons %}
73
+ <span class="pref-reason">{{ reason }}</span>
74
+ {% endfor %}
75
+ </div>
76
+ </div>
77
+ {% endif %}
78
+ {% endif %}
79
+
80
+ {% if paper.summary %}
81
+ <div class="paper-summary">{{ paper.summary }}</div>
82
+ {% endif %}
83
+
84
+ {% if paper.s2_tldr %}
85
+ <div style="margin:0.75rem 0; padding:0.75rem 1rem; background:var(--bg); border-radius:var(--radius); border-left:3px solid var(--purple); font-size:0.88rem; color:var(--text-secondary)">
86
+ <span style="font-size:0.7rem; font-weight:600; text-transform:uppercase; letter-spacing:0.04em; color:var(--purple)">S2 TL;DR</span><br>
87
+ {{ paper.s2_tldr }}
88
+ </div>
89
+ {% endif %}
90
+
91
+ {% if paper.reasoning %}
92
+ <p class="paper-reasoning">{{ paper.reasoning }}</p>
93
+ {% endif %}
94
+
95
+ <div class="paper-links">
96
+ {% if paper.arxiv_url %}<a href="{{ paper.arxiv_url }}">arXiv</a>{% endif %}
97
+ {% if paper.pdf_url %}<a href="{{ paper.pdf_url }}">PDF</a>{% endif %}
98
+ {% if paper.code_url %}<a href="{{ paper.code_url }}">Code</a>{% endif %}
99
+ {% if paper.github_repo and paper.github_repo != paper.code_url %}<a href="{{ paper.github_repo }}">GitHub</a>{% endif %}
100
+ {% if paper.hf_models %}
101
+ {% for m in paper.hf_models[:3] %}
102
+ <a href="https://huggingface.co/{{ m.id if m is mapping else m }}">Model: {{ (m.id if m is mapping else m)[:30] }}</a>
103
+ {% endfor %}
104
+ {% endif %}
105
+ {% if paper.hf_datasets %}
106
+ {% for d in paper.hf_datasets[:2] %}
107
+ <a href="https://huggingface.co/datasets/{{ d.id if d is mapping else d }}">Dataset: {{ (d.id if d is mapping else d)[:30] }}</a>
108
+ {% endfor %}
109
+ {% endif %}
110
+ {% if paper.hf_spaces %}
111
+ {% for s in paper.hf_spaces[:2] %}
112
+ <a href="https://huggingface.co/spaces/{{ s.id if s is mapping else s }}">Space: {{ (s.id if s is mapping else s)[:30] }}</a>
113
+ {% endfor %}
114
+ {% endif %}
115
+ </div>
116
+
117
+ <div class="paper-abstract">
118
+ <strong>Abstract</strong><br><br>
119
+ {{ paper.abstract }}
120
+ </div>
121
+
122
+ {% if paper.categories %}
123
+ <div class="paper-meta">
124
+ <strong>Categories:</strong>
125
+ {% if paper.categories is string %}{{ paper.categories }}{% else %}{{ paper.categories | join(", ") }}{% endif %}
126
+ </div>
127
+ {% endif %}
128
+
129
+ {% if paper.comment %}
130
+ <div class="paper-meta" style="margin-top:0.4rem">
131
+ <strong>Comment:</strong> {{ paper.comment }}
132
+ </div>
133
+ {% endif %}
134
+
135
+ {# ── Connected Papers ── #}
136
+ {% if connections and (connections.references or connections.recommendations) %}
137
+ <div class="connected-papers">
138
+ <div class="section-header" style="margin-top:2rem">
139
+ <h2>Connected Papers</h2>
140
+ </div>
141
+
142
+ {% if connections.references %}
143
+ <div class="connection-group">
144
+ <div class="connection-group__label">References <span class="badge badge--accent">{{ connections.references | length }}</span></div>
145
+ <div class="connection-list">
146
+ {% for c in connections.references[:20] %}
147
+ <div class="connection-item{% if c.in_db_paper_id %} connection-item--in-db{% endif %}">
148
+ <span class="connection-title">
149
+ {% if c.in_db_paper_id %}
150
+ <a href="/papers/{{ domain }}/{{ c.in_db_paper_id }}">{{ c.connected_title }}</a>
151
+ {% elif c.connected_arxiv_id %}
152
+ <a href="https://arxiv.org/abs/{{ c.connected_arxiv_id }}">{{ c.connected_title }}</a>
153
+ {% elif c.connected_s2_id %}
154
+ <a href="https://api.semanticscholar.org/{{ c.connected_s2_id }}">{{ c.connected_title }}</a>
155
+ {% else %}
156
+ {{ c.connected_title }}
157
+ {% endif %}
158
+ </span>
159
+ {% if c.connected_year %}<span class="connection-year">{{ c.connected_year }}</span>{% endif %}
160
+ {% if c.in_db_paper_id %}<span class="badge badge--emerald" style="font-size:0.6rem">IN DB</span>{% endif %}
161
+ </div>
162
+ {% endfor %}
163
+ </div>
164
+ </div>
165
+ {% endif %}
166
+
167
+ {% if connections.recommendations %}
168
+ <div class="connection-group">
169
+ <div class="connection-group__label">Similar Papers <span class="badge badge--purple">{{ connections.recommendations | length }}</span></div>
170
+ <div class="connection-list">
171
+ {% for c in connections.recommendations[:15] %}
172
+ <div class="connection-item{% if c.in_db_paper_id %} connection-item--in-db{% endif %}">
173
+ <span class="connection-title">
174
+ {% if c.in_db_paper_id %}
175
+ <a href="/papers/{{ domain }}/{{ c.in_db_paper_id }}">{{ c.connected_title }}</a>
176
+ {% elif c.connected_arxiv_id %}
177
+ <a href="https://arxiv.org/abs/{{ c.connected_arxiv_id }}">{{ c.connected_title }}</a>
178
+ {% elif c.connected_s2_id %}
179
+ <a href="https://api.semanticscholar.org/{{ c.connected_s2_id }}">{{ c.connected_title }}</a>
180
+ {% else %}
181
+ {{ c.connected_title }}
182
+ {% endif %}
183
+ </span>
184
+ {% if c.connected_year %}<span class="connection-year">{{ c.connected_year }}</span>{% endif %}
185
+ {% if c.in_db_paper_id %}<span class="badge badge--emerald" style="font-size:0.6rem">IN DB</span>{% endif %}
186
+ </div>
187
+ {% endfor %}
188
+ </div>
189
+ </div>
190
+ {% endif %}
191
+ </div>
192
+ {% endif %}
193
+
194
+ <div class="context-block">
195
+ <div class="context-label">Context for Claude Code</div>
196
+ <pre>Paper: {{ paper.title }}
197
+ arXiv: {{ paper.arxiv_id }}
198
+ Score: {{ paper.composite }}/10
199
+ Summary: {{ paper.summary }}
200
+ {% if paper.code_url %}Code: {{ paper.code_url }}{% endif %}
201
+
202
+ Tell me more about this paper's approach and results.</pre>
203
+ </div>
204
+ </div>
205
+ {% endblock %}
src/web/templates/papers.html ADDED
@@ -0,0 +1,49 @@
1
+ {% extends "base.html" %}
2
+ {% block title %}{{ domain_label }} Papers — Research Intelligence{% endblock %}
3
+ {% block content %}
4
+ <div class="page-header">
5
+ <div style="display:flex; justify-content:space-between; align-items:flex-start; flex-wrap:wrap; gap:0.5rem">
6
+ <div>
7
+ <h1>{{ domain_label }} Papers</h1>
8
+ <div class="subtitle">{{ total }} papers scored{% if run.date_start %} · {{ run.date_start }} to {{ run.date_end }}{% endif %}</div>
9
+ </div>
10
+ <button type="button" class="btn btn-sm btn-ghost" onclick="var b=this;b.disabled=true;b.textContent='Enriching...';fetch('/run/enrich/{{ domain }}',{method:'POST'}).then(function(r){if(!r.ok){throw new Error('HTTP '+r.status)}showToast('S2 enrichment started','success')}).catch(function(){b.disabled=false;b.textContent='Enrich with S2';showToast('Enrichment failed','error')})">Enrich with S2</button>
11
+ </div>
12
+ </div>
13
+
14
+ <div class="filter-bar">
15
+ <form hx-get="/papers/{{ domain }}" hx-target="#paper-results" hx-push-url="true" hx-indicator="#page-loader">
16
+ <input type="search" name="search" value="{{ search or '' }}" placeholder="Search papers...">
17
+ <label>
18
+ Min score
19
+ <input type="number" name="min_score" value="{{ min_score or '' }}" min="0" max="10" step="0.5">
20
+ </label>
21
+ <label>
22
+ <input type="checkbox" name="has_code" value="1" {% if has_code %}checked{% endif %}>
23
+ Has code
24
+ </label>
25
+ {% if available_topics %}
26
+ <select name="topic">
27
+ <option value="">All topics</option>
28
+ {% for t in available_topics %}
29
+ <option value="{{ t }}" {% if topic == t %}selected{% endif %}>{{ t }}</option>
30
+ {% endfor %}
31
+ </select>
32
+ {% endif %}
33
+ <select name="sort">
34
+ <option value="adjusted" {% if sort == 'adjusted' or (not sort and has_preferences) %}selected{% endif %}>Sort: Personalized</option>
35
+ <option value="score" {% if sort == 'score' or (not sort and not has_preferences) %}selected{% endif %}>Sort: Score</option>
36
+ <option value="date" {% if sort == 'date' %}selected{% endif %}>Sort: Date</option>
37
+ <option value="axis1" {% if sort == 'axis1' %}selected{% endif %}>Sort: {{ axis_labels[0] }}</option>
38
+ <option value="axis2" {% if sort == 'axis2' %}selected{% endif %}>Sort: {{ axis_labels[1] }}</option>
39
+ <option value="axis3" {% if sort == 'axis3' %}selected{% endif %}>Sort: {{ axis_labels[2] }}</option>
40
+ <option value="title" {% if sort == 'title' %}selected{% endif %}>Sort: Title</option>
41
+ </select>
42
+ <button type="submit" class="btn btn-primary btn-sm">Filter</button>
43
+ </form>
44
+ </div>
45
+
46
+ <div id="paper-results">
47
+ {% include "partials/papers_results.html" %}
48
+ </div>
49
+ {% endblock %}
src/web/templates/partials/github_results.html ADDED
@@ -0,0 +1,83 @@
1
+ {% if projects %}
2
+ <table class="paper-table">
3
+ <thead>
4
+ <tr>
5
+ <th class="col-rank">#</th>
6
+ <th>Repository</th>
7
+ <th class="col-score">Stars</th>
8
+ <th class="col-score">Forks</th>
9
+ <th class="col-score">PRs</th>
10
+ <th class="col-code">Lang</th>
11
+ <th class="col-code">Domain</th>
12
+ </tr>
13
+ </thead>
14
+ <tbody>
15
+ {% for p in projects %}
16
+ {% set rank = offset + loop.index %}
17
+ <tr>
18
+ <td class="col-rank">{{ rank }}</td>
19
+ <td>
20
+ <div class="paper-title">
21
+ <a href="{{ p.url }}" target="_blank" rel="noopener">{{ p.repo_name }}</a>
22
+ </div>
23
+ {% if p.description %}
24
+ <div class="paper-summary">{{ p.description[:200] }}{% if p.description | length > 200 %}&hellip;{% endif %}</div>
25
+ {% endif %}
26
+ {% if p.collection_names %}
27
+ <div style="margin-top:0.25rem">
28
+ {% for tag in p.collection_names.split(',') %}
29
+ {% if tag.strip() %}
30
+ <span class="badge badge--accent" style="font-size:0.7rem">{{ tag.strip() }}</span>
31
+ {% endif %}
32
+ {% endfor %}
33
+ </div>
34
+ {% endif %}
35
+ </td>
36
+ <td class="col-score">
37
+ <span style="color:var(--amber); font-weight:600">{{ p.stars }}</span>
38
+ </td>
39
+ <td class="col-score">{{ p.forks }}</td>
40
+ <td class="col-score">{{ p.pull_requests }}</td>
41
+ <td class="col-code">
42
+ {% if p.language %}
43
+ <span class="badge" style="font-size:0.7rem">{{ p.language }}</span>
44
+ {% endif %}
45
+ </td>
46
+ <td class="col-code">
47
+ {% if p.domain == 'aiml' %}
48
+ <span class="badge badge--accent" style="font-size:0.7rem">AI/ML</span>
49
+ {% elif p.domain == 'security' %}
50
+ <span class="badge badge--red" style="font-size:0.7rem">Security</span>
51
+ {% endif %}
52
+ </td>
53
+ </tr>
54
+ {% endfor %}
55
+ </tbody>
56
+ </table>
57
+
58
+ {% set filter_qs %}{% if search %}&search={{ search | urlencode }}{% endif %}{% if language %}&language={{ language | urlencode }}{% endif %}{% if domain_filter %}&domain={{ domain_filter | urlencode }}{% endif %}{% if sort %}&sort={{ sort | urlencode }}{% endif %}{% endset %}
59
+
60
+ {% if total > limit %}
61
+ <div class="pagination">
62
+ {% if offset > 0 %}
63
+ <a href="/github?offset={{ [offset - limit, 0] | max }}&limit={{ limit }}{{ filter_qs }}"
64
+ hx-get="/github?offset={{ [offset - limit, 0] | max }}&limit={{ limit }}{{ filter_qs }}"
65
+ hx-target="#gh-results" hx-push-url="true" hx-indicator="#page-loader"
66
+ class="btn btn-sm">&larr; Prev</a>
67
+ {% endif %}
68
+ <span class="page-info">{{ offset + 1 }}&ndash;{{ [offset + limit, total] | min }} of {{ total }}</span>
69
+ {% if offset + limit < total %}
70
+ <a href="/github?offset={{ offset + limit }}&limit={{ limit }}{{ filter_qs }}"
71
+ hx-get="/github?offset={{ offset + limit }}&limit={{ limit }}{{ filter_qs }}"
72
+ hx-target="#gh-results" hx-push-url="true" hx-indicator="#page-loader"
73
+ class="btn btn-sm">Next &rarr;</a>
74
+ {% endif %}
75
+ </div>
76
+ {% endif %}
77
+
78
+ {% else %}
79
+ <div class="empty-state">
80
+ <h2>No projects found</h2>
81
+ <p>{% if search or language or domain_filter %}Try adjusting your filters.{% else %}Run the GitHub pipeline to discover trending projects.{% endif %}</p>
82
+ </div>
83
+ {% endif %}
src/web/templates/partials/paper_card.html ADDED
@@ -0,0 +1,29 @@
1
+ {% set pct = ((p.composite or 0) | float / 10 * 100) | round(0) | int %}
2
+ {% set level = 'high' if pct >= 65 else ('mid' if pct >= 40 else 'low') %}
3
+ <div class="paper-card">
4
+ <div class="card-top">
5
+ <div style="min-width:0">
6
+ <span class="rank">{% if rank is defined %}#{{ rank }}{% endif %}</span>
7
+ <div class="title"><a href="/papers/{{ p.domain }}/{{ p.id }}">{{ p.title }}</a></div>
8
+ <div class="meta">
9
+ {% if p.code_url or p.github_repo %}<span class="badge-code">CODE</span>{% endif %}
10
+ {% if p.hf_models %}<span class="badge-hf">HF</span>{% endif %}
11
+ {% if p.source == "both" %}<span class="badge-source">HF+arXiv</span>{% endif %}
12
+ {% if p.is_discovery is defined and p.is_discovery %}<span class="badge badge--discover">DISCOVER</span>{% endif %}
13
+ <span>{{ p.published[:10] if p.published else "" }}</span>
14
+ </div>
15
+ </div>
16
+ <div style="display:flex; align-items:center; gap:0.35rem">
17
+ {% if p.preference_boost is defined and p.preference_boost > 0.1 %}
18
+ <span class="boost-pip boost-pip--up" title="Preference boost: {{ '%+.1f'|format(p.preference_boost) }}">&#9650;</span>
19
+ {% elif p.preference_boost is defined and p.preference_boost < -0.1 %}
20
+ <span class="boost-pip boost-pip--down" title="Preference penalty: {{ '%+.1f'|format(p.preference_boost) }}">&#9660;</span>
21
+ {% endif %}
22
+ <div class="score-badge {{ level }}">{{ p.composite }}</div>
23
+ </div>
24
+ </div>
25
+ {% if p.summary %}<div class="summary-text">{{ p.summary[:200] }}{% if p.summary | length > 200 %}&hellip;{% endif %}</div>{% endif %}
26
+ <div class="score-mini-track">
27
+ <div class="score-mini-fill {{ level }}" style="width:{{ pct }}%"></div>
28
+ </div>
29
+ </div>
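Note on the score styling shared by `paper_card.html`, `paper_row.html`, and `paper_detail.html`: each template converts a 0-10 score to a percentage, then picks a `high`, `mid`, or `low` CSS level at the 65 and 40 cut-offs. The Python sketch below mirrors that mapping for illustration only. The function name `score_level` is hypothetical and not part of this commit, and it assumes Python's built-in `round` behaves like Jinja's `round(0)` filter.

```python
def score_level(score):
    """Mirror the template logic: 0-10 score -> (percent, CSS level).

    Hypothetical helper, not part of the commit. A missing score (None)
    is treated as 0, matching `(p.composite or 0)` in the templates.
    """
    pct = int(round(float(score or 0) / 10 * 100))
    if pct >= 65:
        level = "high"
    elif pct >= 40:
        level = "mid"
    else:
        level = "low"
    return pct, level


if __name__ == "__main__":
    for s in (8, 5, 3.9, None):
        print(s, score_level(s))
```

Computing this once on the server and passing `pct` and `level` into the template context would avoid repeating the same two `{% set %}` lines in three templates, though that is a design suggestion rather than something this commit does.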
src/web/templates/partials/paper_row.html ADDED
@@ -0,0 +1,41 @@
1
+ {% set pct = ((p.composite or 0) | float / 10 * 100) | round(0) | int %}
2
+ {% set level = 'high' if pct >= 65 else ('mid' if pct >= 40 else 'low') %}
3
+ <tr>
4
+ <td class="col-rank">{{ rank if rank is defined else "" }}</td>
5
+ <td>
6
+ <a href="/papers/{{ p.domain }}/{{ p.id }}" class="paper-title-link">
7
+ {{ p.title[:80] }}{% if p.title | length > 80 %}&hellip;{% endif %}
8
+ </a>
9
+ {% if p.topics is iterable and p.topics is not string and p.topics | length > 0 %}
10
+ <div style="margin-top:2px">
11
+ {% for t in p.topics[:2] %}
12
+ <span class="badge badge--accent" style="font-size:0.55rem; padding:0.08rem 0.35rem">{{ t }}</span>
13
+ {% endfor %}
14
+ </div>
15
+ {% endif %}
16
+ {% if p.is_discovery is defined and p.is_discovery %}
17
+ <span class="badge badge--discover">DISCOVER</span>
18
+ {% endif %}
19
+ </td>
20
+ <td class="col-score composite score-{{ level }}">
21
+ {% if p.preference_boost is defined and p.preference_boost != 0 %}
22
+ <span title="Adjusted: {{ p.adjusted_score }} (raw {{ p.composite }}{{ ' %+.1f'|format(p.preference_boost) }})">
23
+ {{ p.adjusted_score }}
24
+ {% if p.preference_boost > 0 %}<span class="boost-arrow boost-up">&#9650;</span>{% elif p.preference_boost < 0 %}<span class="boost-arrow boost-down">&#9660;</span>{% endif %}
25
+ </span>
26
+ <div class="score-raw">({{ p.composite }})</div>
27
+ {% else %}
28
+ {{ p.composite }}
29
+ {% endif %}
30
+ </td>
31
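+ <td class="col-score">{% if p.score_axis_1 is defined and p.score_axis_1 is not none %}{{ p.score_axis_1 }}{% else %}&mdash;{% endif %}</td>
32
+ <td class="col-score">{% if p.score_axis_2 is defined and p.score_axis_2 is not none %}{{ p.score_axis_2 }}{% else %}&mdash;{% endif %}</td>
33
+ <td class="col-score">{% if p.score_axis_3 is defined and p.score_axis_3 is not none %}{{ p.score_axis_3 }}{% else %}&mdash;{% endif %}</td>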
34
+ <td class="col-code">{% if p.code_url or p.github_repo %}<span class="code-check">&#10003;</span>{% else %}<span class="code-dash">&mdash;</span>{% endif %}</td>
35
+ <td class="col-signals">
36
+ {% set paper_id = p.id %}
37
+ {% set user_signal = p.user_signal if p.user_signal is defined else None %}
38
+ {% include "partials/signal_buttons.html" %}
39
+ </td>
40
+ <td class="col-summary">{{ (p.summary or "")[:100] }}{% if (p.summary or "") | length > 100 %}&hellip;{% endif %}</td>
41
+ </tr>
src/web/templates/partials/papers_results.html ADDED
@@ -0,0 +1,55 @@
1
+ {% if papers %}
2
+ <table class="paper-table">
3
+ <thead>
4
+ <tr>
5
+ <th class="col-rank">#</th>
6
+ <th>Title</th>
7
+ <th class="col-score">Score</th>
8
+ <th class="col-score" title="{{ axis_labels[0] }}">{{ abbreviate_label(axis_labels[0]) }}</th>
9
+ <th class="col-score" title="{{ axis_labels[1] }}">{{ abbreviate_label(axis_labels[1]) }}</th>
10
+ <th class="col-score" title="{{ axis_labels[2] }}">{{ abbreviate_label(axis_labels[2]) }}</th>
11
+ <th class="col-code">Code</th>
12
+ <th class="col-signals-header">Rate</th>
13
+ <th class="col-summary">Summary</th>
14
+ </tr>
15
+ </thead>
16
+ <tbody>
17
+ {% for p in papers %}
18
+ {% set rank = offset + loop.index %}
19
+ {% include "partials/paper_row.html" %}
20
+ {% endfor %}
21
+ </tbody>
22
+ </table>
23
+
24
+ {% if has_preferences is not defined or not has_preferences %}
25
+ <div class="cold-start-hint">
26
+ Rate papers to personalize your feed &mdash; use the arrows to tell the system what you like.
27
+ </div>
28
+ {% endif %}
29
+
30
+ {% set filter_qs %}{% if search %}&search={{ search | urlencode }}{% endif %}{% if min_score %}&min_score={{ min_score }}{% endif %}{% if has_code %}&has_code=1{% endif %}{% if topic %}&topic={{ topic | urlencode }}{% endif %}{% if sort %}&sort={{ sort | urlencode }}{% endif %}{% endset %}
31
+
32
+ {% if total > limit %}
33
+ <div class="pagination">
34
+ {% if offset > 0 %}
35
+ <a href="/papers/{{ domain }}?offset={{ [offset - limit, 0] | max }}&limit={{ limit }}{{ filter_qs }}"
36
+ hx-get="/papers/{{ domain }}?offset={{ [offset - limit, 0] | max }}&limit={{ limit }}{{ filter_qs }}"
37
+ hx-target="#paper-results" hx-push-url="true" hx-indicator="#page-loader"
38
+ class="btn btn-sm">&larr; Prev</a>
39
+ {% endif %}
40
+ <span class="page-info">{{ offset + 1 }}&ndash;{{ [offset + limit, total] | min }} of {{ total }}</span>
41
+ {% if offset + limit < total %}
42
+ <a href="/papers/{{ domain }}?offset={{ offset + limit }}&limit={{ limit }}{{ filter_qs }}"
43
+ hx-get="/papers/{{ domain }}?offset={{ offset + limit }}&limit={{ limit }}{{ filter_qs }}"
44
+ hx-target="#paper-results" hx-push-url="true" hx-indicator="#page-loader"
45
+ class="btn btn-sm">Next &rarr;</a>
46
+ {% endif %}
47
+ </div>
48
+ {% endif %}
49
+
50
+ {% else %}
51
+ <div class="empty-state">
52
+ <h2>No papers found</h2>
53
+ <p>{% if search or min_score or has_code or topic %}Try adjusting your filters.{% else %}Run the {{ domain_label }} pipeline to get started.{% endif %}</p>
54
+ </div>
55
+ {% endif %}
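Note on the pagination block in `papers_results.html` and `github_results.html`: both templates derive the "first-last of total" label and the Prev/Next offsets from `offset`, `limit`, and `total`. The Python sketch below restates that arithmetic under stated assumptions. The helper name `page_window` is hypothetical and not part of this commit, and clamping the previous offset at zero is an assumption about the intended behaviour.

```python
def page_window(offset, limit, total):
    """Return the values the pagination markup needs.

    Hypothetical helper, not part of the commit.
    - first/last: 1-based range shown in the page-info label
    - prev_offset/next_offset: None when the link should be hidden
    """
    prev_offset = max(offset - limit, 0) if offset > 0 else None
    next_offset = offset + limit if offset + limit < total else None
    return {
        "first": offset + 1,
        "last": min(offset + limit, total),
        "prev_offset": prev_offset,
        "next_offset": next_offset,
    }


if __name__ == "__main__":
    print(page_window(0, 50, 120))
    print(page_window(100, 50, 120))
```

As in the templates, the whole pagination block is only worth rendering when `total > limit`.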
src/web/templates/partials/signal_buttons.html ADDED
@@ -0,0 +1,17 @@
1
+ <div class="signal-buttons" id="signal-{{ paper_id }}">
2
+ <button class="signal-btn signal-btn--up{% if user_signal == 'upvote' %} active{% endif %}"
3
+ hx-post="/api/signal/{{ paper_id }}/upvote"
4
+ hx-target="#signal-{{ paper_id }}"
5
+ hx-swap="outerHTML"
6
+ title="More like this">&#9650;</button>
7
+ <button class="signal-btn signal-btn--down{% if user_signal == 'downvote' %} active{% endif %}"
8
+ hx-post="/api/signal/{{ paper_id }}/downvote"
9
+ hx-target="#signal-{{ paper_id }}"
10
+ hx-swap="outerHTML"
11
+ title="Less like this">&#9660;</button>
12
+ <button class="signal-btn signal-btn--dismiss{% if user_signal == 'dismiss' %} active{% endif %}"
13
+ hx-post="/api/signal/{{ paper_id }}/dismiss"
14
+ hx-target="#signal-{{ paper_id }}"
15
+ hx-swap="outerHTML"
16
+ title="Not interested">&times;</button>
17
+ </div>
src/web/templates/preferences.html ADDED
@@ -0,0 +1,85 @@
1
+ {% extends "base.html" %}
2
+ {% block title %}Preferences — Research Intelligence{% endblock %}
3
+ {% block content %}
4
+ <div class="page-header">
5
+ <div style="display:flex; justify-content:space-between; align-items:flex-start; flex-wrap:wrap; gap:0.5rem">
6
+ <div>
7
+ <h1>Preferences</h1>
8
+ <div class="subtitle">
9
+ {{ total_prefs }} learned preference{{ 's' if total_prefs != 1 else '' }}
10
+ {% if updated_at %} · Last updated {{ updated_at[:16] }}{% endif %}
11
+ </div>
12
+ </div>
13
+ <div style="display:flex; gap:0.5rem">
14
+ <button class="btn btn-sm" onclick="var b=this;b.disabled=true;b.textContent='Recomputing...';fetch('/api/preferences/recompute',{method:'POST'}).then(function(r){if(!r.ok){throw new Error('HTTP '+r.status)}showToast('Preferences recomputed','success');setTimeout(function(){location.reload()},500)}).catch(function(){b.disabled=false;b.textContent='Recompute';showToast('Failed','error')})">Recompute</button>
15
+ <button class="btn btn-sm" style="color:var(--red)" onclick="if(confirm('Reset all preferences and signal history?')){var b=this;b.disabled=true;fetch('/api/preferences/reset',{method:'POST'}).then(function(r){if(!r.ok){throw new Error('HTTP '+r.status)}showToast('Preferences reset','success');setTimeout(function(){location.reload()},500)}).catch(function(){b.disabled=false;showToast('Failed','error')})}">Reset All</button>
16
+ </div>
17
+ </div>
18
+ </div>
19
+
20
+ {# Signal summary #}
21
+ <div class="stats-grid" style="grid-template-columns:repeat(5, 1fr); margin-bottom:2rem">
22
+ <div class="stat-card stat-card--green">
23
+ <div class="label">Saves</div>
24
+ <div class="value">{{ signal_counts.get('save', 0) }}</div>
25
+ </div>
26
+ <div class="stat-card stat-card--blue">
27
+ <div class="label">Upvotes</div>
28
+ <div class="value">{{ signal_counts.get('upvote', 0) }}</div>
29
+ </div>
30
+ <div class="stat-card stat-card--purple">
31
+ <div class="label">Views</div>
32
+ <div class="value">{{ signal_counts.get('view', 0) }}</div>
33
+ </div>
34
+ <div class="stat-card stat-card--red">
35
+ <div class="label">Downvotes</div>
36
+ <div class="value">{{ signal_counts.get('downvote', 0) }}</div>
37
+ </div>
38
+ <div class="stat-card" style="background:var(--bg-card)">
39
+ <div class="label">Dismissed</div>
40
+ <div class="value">{{ signal_counts.get('dismiss', 0) }}</div>
41
+ </div>
42
+ </div>
43
+
44
+ {% if total_prefs == 0 %}
45
+ <div class="empty-state">
46
+ <h2>No preferences yet</h2>
47
+ <p>Rate papers using the arrow buttons to build your preference profile. The system learns from saves, upvotes, downvotes, and dismissals.</p>
48
+ </div>
49
+ {% else %}
50
+
51
+ {# Preference groups #}
52
+ {% set pref_labels = {'topic': 'Topics', 'keyword': 'Keywords', 'category': 'Categories', 'author': 'Authors', 'axis_pref': 'Axis Preferences'} %}
53
+
54
+ <div class="pref-groups">
55
+ {% for prefix, items in grouped.items() %}
56
+ {% set label = pref_labels.get(prefix, prefix | capitalize) %}
57
+ <div class="pref-group">
58
+ <div class="section-header">
59
+ <h2>{{ label }}</h2>
60
+ <span class="badge badge--accent">{{ items | length }}</span>
61
+ </div>
62
+
63
+ <div class="pref-list">
64
+ {% for item in items[:20] %}
65
+ <div class="pref-item">
66
+ <span class="pref-item__name">{{ item.name }}</span>
67
+ <span class="pref-item__count" title="{{ item.count }} signal{{ 's' if item.count != 1 else '' }}">{{ item.count }}x</span>
68
+ <div class="pref-bar-container">
69
+ {% set abs_val = (item.value | abs * 100) | round(0) | int %}
70
+ {% if item.value > 0 %}
71
+ <div class="pref-bar pref-bar--positive" style="width:{{ abs_val }}%"></div>
72
+ {% else %}
73
+ <div class="pref-bar pref-bar--negative" style="width:{{ abs_val }}%"></div>
74
+ {% endif %}
75
+ </div>
76
+ <span class="pref-item__value {% if item.value > 0 %}pref-positive{% else %}pref-negative{% endif %}">{{ '%+.2f'|format(item.value) }}</span>
77
+ </div>
78
+ {% endfor %}
79
+ </div>
80
+ </div>
81
+ {% endfor %}
82
+ </div>
83
+
84
+ {% endif %}
85
+ {% endblock %}
src/web/templates/seed_preferences.html ADDED
@@ -0,0 +1,178 @@
1
+ {% extends "base.html" %}
2
+ {% block title %}Pick Papers You Like — Research Intelligence{% endblock %}
3
+ {% block content %}
4
+ <div class="page-header">
5
+ <h1>Pick Papers You Like</h1>
6
+ <div class="subtitle">Rate a few papers to personalize your feed. Click thumbs up or down, then hit Done.</div>
7
+ </div>
8
+
9
+ {% if papers %}
10
+ <div class="seed-grid" id="seed-grid">
11
+ {% for p in papers %}
12
+ <div class="seed-card" data-arxiv="{{ p.arxiv_id }}">
13
+ <div class="seed-card__body">
14
+ <div class="seed-card__domain">
15
+ {% if p.domain == 'aiml' %}
16
+ <span class="badge badge--accent">AI/ML</span>
17
+ {% elif p.domain == 'security' %}
18
+ <span class="badge badge--red">Security</span>
19
+ {% endif %}
20
+ </div>
21
+ <div class="seed-card__title">{{ p.title }}</div>
22
+ {% if p.summary %}
23
+ <div class="seed-card__summary">{{ p.summary[:150] }}{% if p.summary | length > 150 %}&hellip;{% endif %}</div>
24
+ {% endif %}
25
+ </div>
26
+ <div class="seed-card__actions">
27
+ <button type="button" class="seed-btn seed-btn--up" onclick="seedRate(this, 'upvote')" title="More like this">
28
+ <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><path d="M14 9V5a3 3 0 0 0-3-3l-4 9v11h11.28a2 2 0 0 0 2-1.7l1.38-9a2 2 0 0 0-2-2.3H14z"/><path d="M7 22H4a2 2 0 0 1-2-2v-7a2 2 0 0 1 2-2h3"/></svg>
29
+ </button>
30
+ <button type="button" class="seed-btn seed-btn--down" onclick="seedRate(this, 'downvote')" title="Less like this">
31
+ <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><path d="M10 15v4a3 3 0 0 0 3 3l4-9V2H5.72a2 2 0 0 0-2 1.7l-1.38 9a2 2 0 0 0 2 2.3H10z"/><path d="M17 2h3a2 2 0 0 1 2 2v7a2 2 0 0 1-2 2h-3"/></svg>
32
+ </button>
33
+ </div>
34
+ </div>
35
+ {% endfor %}
36
+ </div>
37
+
38
+ <div style="text-align:center; margin-top:2rem">
39
+ <span id="seed-count" style="color:var(--text-muted); font-size:0.85rem; margin-right:1rem">0 rated</span>
40
+ <button type="button" class="btn btn-primary" id="seed-done" onclick="seedDone()">Done</button>
41
+ </div>
42
+
43
+ <style>
44
+ .seed-grid {
45
+ display: grid;
46
+ grid-template-columns: repeat(auto-fill, minmax(280px, 1fr));
47
+ gap: 1rem;
48
+ }
49
+ .seed-card {
50
+ background: var(--bg-card);
51
+ border: 1px solid var(--border);
52
+ border-radius: var(--radius-lg);
53
+ padding: 1rem 1.25rem;
54
+ display: flex;
55
+ flex-direction: column;
56
+ justify-content: space-between;
57
+ transition: border-color 0.2s, box-shadow 0.2s;
58
+ animation: fadeSlideUp 0.35s ease-out both;
59
+ }
60
+ .seed-card.rated-up {
61
+ border-color: rgba(16, 185, 129, 0.4);
62
+ box-shadow: 0 0 12px rgba(16, 185, 129, 0.1);
63
+ }
64
+ .seed-card.rated-down {
65
+ border-color: rgba(239, 68, 68, 0.3);
66
+ opacity: 0.6;
67
+ }
68
+ .seed-card__title {
69
+ font-weight: 600;
70
+ font-size: 0.88rem;
71
+ line-height: 1.45;
72
+ margin: 0.35rem 0;
73
+ }
74
+ .seed-card__summary {
75
+ font-size: 0.8rem;
76
+ color: var(--text-muted);
77
+ line-height: 1.5;
78
+ margin-top: 0.25rem;
79
+ }
80
+ .seed-card__actions {
81
+ display: flex;
82
+ gap: 0.5rem;
83
+ margin-top: 0.75rem;
84
+ padding-top: 0.75rem;
85
+ border-top: 1px solid var(--border);
86
+ }
87
+ .seed-btn {
88
+ flex: 1;
89
+ display: flex;
90
+ align-items: center;
91
+ justify-content: center;
92
+ gap: 0.35rem;
93
+ padding: 0.45rem;
94
+ border-radius: var(--radius);
95
+ border: 1px solid var(--border);
96
+ background: transparent;
97
+ color: var(--text-muted);
98
+ cursor: pointer;
99
+ font-size: 0.8rem;
100
+ transition: all 0.15s;
101
+ }
102
+ .seed-btn:hover {
103
+ background: var(--bg-surface);
104
+ color: var(--text-secondary);
105
+ }
106
+ .seed-btn--up.active {
107
+ background: var(--emerald-glow);
108
+ color: var(--emerald);
109
+ border-color: rgba(16, 185, 129, 0.3);
110
+ }
111
+ .seed-btn--down.active {
112
+ background: var(--red-glow);
113
+ color: var(--red);
114
+ border-color: rgba(239, 68, 68, 0.3);
115
+ }
116
+ </style>
117
+
118
+ <script>
119
+ var seedRatings = {};
120
+
121
+ function seedRate(btn, action) {
122
+ var card = btn.closest('.seed-card');
123
+ var arxivId = card.dataset.arxiv;
124
+ var upBtn = card.querySelector('.seed-btn--up');
125
+ var downBtn = card.querySelector('.seed-btn--down');
126
+
127
+ // Toggle off if already active
128
+ if (seedRatings[arxivId] === action) {
129
+ delete seedRatings[arxivId];
130
+ upBtn.classList.remove('active');
131
+ downBtn.classList.remove('active');
132
+ card.classList.remove('rated-up', 'rated-down');
133
+ } else {
134
+ seedRatings[arxivId] = action;
135
+ upBtn.classList.toggle('active', action === 'upvote');
136
+ downBtn.classList.toggle('active', action === 'downvote');
137
+ card.classList.toggle('rated-up', action === 'upvote');
138
+ card.classList.toggle('rated-down', action === 'downvote');
139
+ }
140
+
141
+ document.getElementById('seed-count').textContent = Object.keys(seedRatings).length + ' rated';
142
+ }
143
+
144
+ function seedDone() {
145
+ var count = Object.keys(seedRatings).length;
146
+ if (count === 0) {
147
+ window.location.href = '/';
148
+ return;
149
+ }
150
+
151
+ var btn = document.getElementById('seed-done');
152
+ btn.disabled = true;
153
+ btn.textContent = 'Saving...';
154
+
155
+ fetch('/api/seed-preferences', {
156
+ method: 'POST',
157
+ headers: {'Content-Type': 'application/json'},
158
+ body: JSON.stringify({ratings: seedRatings})
159
+ })
160
+ .then(function(r) { return r.json(); })
161
+ .then(function(data) {
162
+ window.location.href = '/?toast=Preferences+initialized+from+' + data.count + '+ratings';
163
+ })
164
+ .catch(function() {
165
+ btn.disabled = false;
166
+ btn.textContent = 'Done';
167
+ alert('Error saving preferences');
168
+ });
169
+ }
170
+ </script>
171
+ {% else %}
172
+ <div class="empty-state">
173
+ <h2>No seed papers available</h2>
174
+ <p>Run a pipeline first to populate papers, then come back to seed your preferences.</p>
175
+ <a href="/" class="btn btn-primary" style="margin-top:1rem">Go to Dashboard</a>
176
+ </div>
177
+ {% endif %}
178
+ {% endblock %}
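`seedDone()` above posts a JSON body of the form `{"ratings": {"<arxiv_id>": "upvote" | "downvote"}}` to `/api/seed-preferences` and reads `count` from the response. The handler is not shown in this section, so the sketch below only illustrates how such a payload might be validated on the server before any preference update. The function names `clean_seed_ratings` and `seed_response` are hypothetical, and the real endpoint may behave differently.

```python
from typing import Any, Dict

# The only two actions the seed page can send (see the onclick handlers).
ALLOWED_ACTIONS = {"upvote", "downvote"}


def clean_seed_ratings(payload: Any) -> Dict[str, str]:
    """Keep only well-formed ratings; drop everything else silently."""
    ratings = payload.get("ratings") if isinstance(payload, dict) else None
    if not isinstance(ratings, dict):
        return {}
    cleaned = {}
    for arxiv_id, action in ratings.items():
        if not isinstance(arxiv_id, str) or not arxiv_id.strip():
            continue  # empty or non-string paper id
        if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
            continue  # unknown action
        cleaned[arxiv_id.strip()] = action
    return cleaned


def seed_response(payload: Any) -> Dict[str, int]:
    """Build the {'count': n} body that the client-side toast expects."""
    return {"count": len(clean_seed_ratings(payload))}
```

Counting only the accepted ratings keeps the toast message consistent with what was actually stored, under the assumption that rejected entries are ignored rather than reported.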
src/web/templates/setup.html ADDED
@@ -0,0 +1,596 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="utf-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1">
6
+ <title>Setup — Research Intelligence</title>
7
+ <meta name="theme-color" content="#0b1121">
8
+ <link rel="icon" href="/static/favicon.svg" type="image/svg+xml">
9
+ <link rel="stylesheet" href="/static/style.css">
10
+ <style>
11
+ .setup-wrap {
12
+ min-height: 100vh;
13
+ display: flex;
14
+ align-items: center;
15
+ justify-content: center;
16
+ padding: 2rem 1rem;
17
+ }
18
+ .setup-card {
19
+ background: var(--bg-card);
20
+ border: 1px solid var(--border);
21
+ border-radius: var(--radius-xl);
22
+ max-width: 640px;
23
+ width: 100%;
24
+ padding: 2.5rem;
25
+ box-shadow: var(--shadow-lg);
26
+ animation: fadeSlideUp 0.5s ease-out both;
27
+ }
28
+ .setup-logo {
29
+ display: flex;
30
+ align-items: center;
31
+ gap: 10px;
32
+ margin-bottom: 2rem;
33
+ }
34
+ .setup-logo .logo-dot {
35
+ width: 10px;
36
+ height: 10px;
37
+ }
38
+ .setup-logo span {
39
+ font-family: var(--font-display);
40
+ font-size: 1.3rem;
41
+ font-weight: 700;
42
+ letter-spacing: -0.02em;
43
+ }
44
+ .setup-step {
45
+ display: none;
46
+ }
47
+ .setup-step.active {
48
+ display: block;
49
+ animation: fadeSlideUp 0.35s ease-out both;
50
+ }
51
+ .setup-step h2 {
52
+ font-family: var(--font-display);
53
+ font-size: 1.35rem;
54
+ font-weight: 700;
55
+ letter-spacing: -0.03em;
56
+ margin-bottom: 0.5rem;
57
+ }
58
+ .setup-step .step-desc {
59
+ color: var(--text-muted);
60
+ font-size: 0.9rem;
61
+ margin-bottom: 1.5rem;
62
+ line-height: 1.6;
63
+ }
64
+ .setup-field {
65
+ margin-bottom: 1.25rem;
66
+ }
67
+ .setup-field label {
68
+ display: block;
69
+ font-size: 0.8rem;
70
+ font-weight: 600;
71
+ color: var(--text-secondary);
72
+ margin-bottom: 0.4rem;
73
+ text-transform: uppercase;
74
+ letter-spacing: 0.04em;
75
+ }
76
+ .setup-field input[type="text"],
77
+ .setup-field input[type="password"],
78
+ .setup-field select {
79
+ width: 100%;
80
+ background: var(--bg);
81
+ border: 1px solid var(--border-strong);
82
+ border-radius: var(--radius);
83
+ color: var(--text);
84
+ padding: 0.6rem 0.85rem;
85
+ font-size: 0.9rem;
86
+ font-family: var(--font-body);
87
+ transition: border-color 0.15s, box-shadow 0.15s;
88
+ }
89
+ .setup-field input:focus,
90
+ .setup-field select:focus {
91
+ outline: none;
92
+ border-color: var(--accent);
93
+ box-shadow: 0 0 0 2px var(--accent-muted);
94
+ }
95
+ .setup-field .hint {
96
+ font-size: 0.78rem;
97
+ color: var(--text-dim);
98
+ margin-top: 0.3rem;
99
+ }
100
+ .setup-toggle {
101
+ display: flex;
102
+ align-items: center;
103
+ justify-content: space-between;
104
+ padding: 0.85rem 1rem;
105
+ background: var(--bg);
106
+ border: 1px solid var(--border);
107
+ border-radius: var(--radius);
108
+ margin-bottom: 0.6rem;
109
+ cursor: pointer;
110
+ transition: border-color 0.15s;
111
+ }
112
+ .setup-toggle:hover {
113
+ border-color: var(--border-strong);
114
+ }
115
+ .setup-toggle .toggle-label {
116
+ font-weight: 600;
117
+ font-size: 0.9rem;
118
+ }
119
+ .setup-toggle .toggle-desc {
120
+ font-size: 0.78rem;
121
+ color: var(--text-muted);
122
+ margin-top: 0.15rem;
123
+ }
124
+ .toggle-switch {
125
+ position: relative;
126
+ width: 42px;
127
+ height: 24px;
128
+ flex-shrink: 0;
129
+ }
130
+ .toggle-switch input {
131
+ opacity: 0;
132
+ width: 0;
133
+ height: 0;
134
+ }
135
+ .toggle-switch .slider {
136
+ position: absolute;
137
+ inset: 0;
138
+ background: var(--bg-surface);
139
+ border-radius: 12px;
140
+ transition: background 0.2s;
141
+ cursor: pointer;
142
+ }
143
+ .toggle-switch .slider::after {
144
+ content: '';
145
+ position: absolute;
146
+ width: 18px;
147
+ height: 18px;
148
+ left: 3px;
149
+ top: 3px;
150
+ background: var(--text-muted);
151
+ border-radius: 50%;
152
+ transition: transform 0.2s, background 0.2s;
153
+ }
154
+ .toggle-switch input:checked + .slider {
155
+ background: var(--accent);
156
+ }
157
+ .toggle-switch input:checked + .slider::after {
158
+ transform: translateX(18px);
159
+ background: var(--bg-deep);
160
+ }
161
+ .setup-axes {
162
+ margin-top: 0.75rem;
163
+ padding: 0.85rem;
164
+ background: var(--bg);
165
+ border-radius: var(--radius);
166
+ border: 1px solid var(--border);
167
+ }
168
+ .setup-axes.hidden {
169
+ display: none;
170
+ }
171
+ .axis-row {
172
+ display: flex;
173
+ gap: 0.75rem;
174
+ align-items: center;
175
+ margin-bottom: 0.5rem;
176
+ }
177
+ .axis-row:last-child {
178
+ margin-bottom: 0;
179
+ }
180
+ .axis-row .axis-name {
181
+ flex: 1;
182
+ font-size: 0.84rem;
183
+ font-weight: 500;
184
+ }
185
+ .axis-row input[type="number"] {
186
+ width: 5rem;
187
+ background: var(--bg-card);
188
+ border: 1px solid var(--border);
189
+ border-radius: var(--radius);
190
+ color: var(--text);
191
+ padding: 0.35rem 0.5rem;
192
+ font-size: 0.84rem;
193
+ font-family: var(--font-mono);
194
+ text-align: center;
195
+ }
196
+ .axis-row input[type="number"]:focus {
197
+ outline: none;
198
+ border-color: var(--accent);
199
+ }
200
+ .setup-nav {
201
+ display: flex;
202
+ justify-content: space-between;
203
+ align-items: center;
204
+ margin-top: 2rem;
205
+ padding-top: 1.25rem;
206
+ border-top: 1px solid var(--border);
207
+ }
208
+ .step-dots {
209
+ display: flex;
210
+ gap: 6px;
211
+ }
212
+ .step-dot {
213
+ width: 8px;
214
+ height: 8px;
215
+ border-radius: 50%;
216
+ background: var(--bg-surface);
217
+ transition: background 0.2s, box-shadow 0.2s;
218
+ }
219
+ .step-dot.active {
220
+ background: var(--accent);
221
+ box-shadow: 0 0 6px var(--accent);
222
+ }
223
+ .step-dot.done {
224
+ background: var(--emerald);
225
+ }
226
+ .api-key-status {
227
+ display: inline-flex;
228
+ align-items: center;
229
+ gap: 0.35rem;
230
+ font-size: 0.82rem;
231
+ font-weight: 500;
232
+ padding: 0.3rem 0.7rem;
233
+ border-radius: var(--radius-full);
234
+ margin-top: 0.5rem;
235
+ }
236
+ .api-key-status.valid {
237
+ background: var(--emerald-glow);
238
+ color: var(--emerald);
239
+ }
240
+ .api-key-status.invalid {
241
+ background: var(--red-glow);
242
+ color: var(--red);
243
+ }
244
+ .api-key-status.checking {
245
+ background: var(--amber-glow);
246
+ color: var(--amber);
247
+ }
248
+ .setup-summary {
249
+ background: var(--bg);
250
+ border: 1px solid var(--border);
251
+ border-radius: var(--radius);
252
+ padding: 1rem;
253
+ }
254
+ .summary-item {
255
+ display: flex;
256
+ justify-content: space-between;
257
+ padding: 0.4rem 0;
258
+ font-size: 0.85rem;
259
+ }
260
+ .summary-item:not(:last-child) {
261
+ border-bottom: 1px solid var(--border);
262
+ }
263
+ .summary-item .label {
264
+ color: var(--text-muted);
265
+ }
266
+ .summary-item .value {
267
+ font-weight: 600;
268
+ color: var(--text);
269
+ }
270
+ </style>
271
+ </head>
272
+ <body>
273
+ <div class="setup-wrap">
274
+ <div class="setup-card">
275
+ <div class="setup-logo">
276
+ <span class="logo-dot"></span>
277
+ <span>Research Intelligence</span>
278
+ </div>
279
+
280
+ <!-- Step 1: Welcome -->
281
+ <div class="setup-step active" data-step="1">
282
+ <h2>Welcome</h2>
283
+ <p class="step-desc">
284
+ Research Intelligence monitors academic papers and GitHub projects,
285
+ scores them with AI, and learns your preferences over time.
286
+ This setup will configure your instance.
287
+ </p>
288
+ <div style="padding:1rem; background:var(--bg); border-radius:var(--radius); border-left:3px solid var(--accent); font-size:0.85rem; color:var(--text-secondary); line-height:1.65">
289
+ You'll configure:<br>
290
+ &bull; API key for AI scoring<br>
291
+ &bull; Which research domains to monitor<br>
292
+ &bull; GitHub project tracking<br>
293
+ &bull; Pipeline schedule
294
+ </div>
295
+ </div>
296
+
297
+ <!-- Step 2: API Key -->
298
+ <div class="setup-step" data-step="2">
299
+ <h2>API Key</h2>
300
+ <p class="step-desc">
301
+ An Anthropic API key is required for paper scoring.
302
+ It will be stored in a local <code>.env</code> file, not in the config.
303
+ </p>
304
+ <div class="setup-field">
305
+ <label for="api-key">Anthropic API Key</label>
306
+ <input type="password" id="api-key" name="api_key" placeholder="sk-ant-..." autocomplete="off">
307
+ <div class="hint">Get one at <a href="https://console.anthropic.com/" target="_blank" rel="noopener noreferrer">console.anthropic.com</a></div>
308
+ </div>
309
+ <div id="api-key-result"></div>
310
+ <button type="button" class="btn btn-sm" onclick="validateApiKey()" id="validate-btn">Validate Key</button>
311
+ </div>
312
+
313
+ <!-- Step 3: Domains -->
314
+ <div class="setup-step" data-step="3">
315
+ <h2>Research Domains</h2>
316
+ <p class="step-desc">Choose which research areas to monitor.</p>
317
+
318
+ <div class="setup-toggle" onclick="this.querySelector('input').click()">
319
+ <div>
320
+ <div class="toggle-label">AI / ML</div>
321
+ <div class="toggle-desc">Papers from arXiv + HuggingFace trending</div>
322
+ </div>
323
+ <label class="toggle-switch" onclick="event.stopPropagation()">
324
+ <input type="checkbox" id="domain-aiml" checked onchange="toggleAxes('aiml', this.checked)">
325
+ <span class="slider"></span>
326
+ </label>
327
+ </div>
328
+ <div class="setup-axes" id="axes-aiml">
329
+ <div style="font-size:0.72rem; font-weight:600; text-transform:uppercase; letter-spacing:0.04em; color:var(--text-muted); margin-bottom:0.5rem">Scoring Weights</div>
330
+ <div class="axis-row">
331
+ <span class="axis-name">Code & Weights</span>
332
+ <input type="number" id="aiml-w1" value="30" min="0" max="100" step="5">
333
+ <span style="font-size:0.78rem; color:var(--text-dim)">%</span>
334
+ </div>
335
+ <div class="axis-row">
336
+ <span class="axis-name">Novelty</span>
337
+ <input type="number" id="aiml-w2" value="35" min="0" max="100" step="5">
338
+ <span style="font-size:0.78rem; color:var(--text-dim)">%</span>
339
+ </div>
340
+ <div class="axis-row">
341
+ <span class="axis-name">Practical Applicability</span>
342
+ <input type="number" id="aiml-w3" value="35" min="0" max="100" step="5">
343
+ <span style="font-size:0.78rem; color:var(--text-dim)">%</span>
344
+ </div>
345
+ </div>
346
+
347
+ <div class="setup-toggle" onclick="this.querySelector('input').click()" style="margin-top:0.75rem">
348
+ <div>
349
+ <div class="toggle-label">Security</div>
350
+ <div class="toggle-desc">Security research from arXiv cs.CR</div>
351
+ </div>
352
+ <label class="toggle-switch" onclick="event.stopPropagation()">
353
+ <input type="checkbox" id="domain-security" checked onchange="toggleAxes('security', this.checked)">
354
+ <span class="slider"></span>
355
+ </label>
356
+ </div>
357
+ <div class="setup-axes" id="axes-security">
358
+ <div style="font-size:0.72rem; font-weight:600; text-transform:uppercase; letter-spacing:0.04em; color:var(--text-muted); margin-bottom:0.5rem">Scoring Weights</div>
359
+ <div class="axis-row">
360
+ <span class="axis-name">Has Code / PoC</span>
361
+ <input type="number" id="sec-w1" value="25" min="0" max="100" step="5">
362
+ <span style="font-size:0.78rem; color:var(--text-dim)">%</span>
363
+ </div>
364
+ <div class="axis-row">
365
+ <span class="axis-name">Novel Attack Surface</span>
366
+ <input type="number" id="sec-w2" value="40" min="0" max="100" step="5">
367
+ <span style="font-size:0.78rem; color:var(--text-dim)">%</span>
368
+ </div>
369
+ <div class="axis-row">
370
+ <span class="axis-name">Real-World Impact</span>
371
+ <input type="number" id="sec-w3" value="35" min="0" max="100" step="5">
372
+ <span style="font-size:0.78rem; color:var(--text-dim)">%</span>
373
+ </div>
374
+ </div>
375
+ </div>
376
+
377
+ <!-- Step 4: GitHub -->
378
+ <div class="setup-step" data-step="4">
379
+ <h2>GitHub Monitoring</h2>
380
+ <p class="step-desc">Track trending open-source projects via OSSInsight collections.</p>
381
+ <div class="setup-toggle" onclick="this.querySelector('input').click()">
382
+ <div>
383
+ <div class="toggle-label">Enable GitHub tracking</div>
384
+ <div class="toggle-desc">Monitor trending repos in AI/ML and Security</div>
385
+ </div>
386
+ <label class="toggle-switch" onclick="event.stopPropagation()">
387
+ <input type="checkbox" id="github-enabled" checked>
388
+ <span class="slider"></span>
389
+ </label>
390
+ </div>
391
+ </div>
392
+
393
+ <!-- Step 5: Schedule -->
394
+ <div class="setup-step" data-step="5">
395
+ <h2>Schedule</h2>
396
+ <p class="step-desc">How often should pipelines run automatically?</p>
397
+ <div class="setup-field">
398
+ <label for="schedule">Frequency</label>
399
+ <select id="schedule">
400
+ <option value="weekly" selected>Weekly (Sunday night)</option>
401
+ <option value="daily">Daily (midnight UTC)</option>
402
+ <option value="manual">Manual only</option>
403
+ </select>
404
+ <div class="hint">You can always trigger runs manually from the dashboard.</div>
405
+ </div>
406
+ </div>
407
+
408
+ <!-- Step 6: Review -->
409
+ <div class="setup-step" data-step="6">
410
+ <h2>Review & Save</h2>
411
+ <p class="step-desc">Here's your configuration. Click Save to get started.</p>
412
+ <div class="setup-summary" id="setup-summary"></div>
413
+ </div>
414
+
415
+ <div class="setup-nav">
416
+ <div class="step-dots" id="step-dots"></div>
417
+ <div style="display:flex; gap:0.5rem">
418
+ <button type="button" class="btn btn-sm" id="btn-prev" onclick="prevStep()" style="display:none">Back</button>
419
+ <button type="button" class="btn btn-primary btn-sm" id="btn-next" onclick="nextStep()">Get Started</button>
420
+ </div>
421
+ </div>
422
+ </div>
423
+ </div>
424
+
425
+ <script>
426
+ var currentStep = 1;
427
+ var totalSteps = 6;
428
+ var apiKeyValid = false;
429
+
430
+ function initDots() {
431
+ var dots = document.getElementById('step-dots');
432
+ dots.innerHTML = '';
433
+ for (var i = 1; i <= totalSteps; i++) {
434
+ var dot = document.createElement('div');
435
+ dot.className = 'step-dot' + (i === 1 ? ' active' : '');
436
+ dot.dataset.step = i;
437
+ dots.appendChild(dot);
438
+ }
439
+ }
440
+ initDots();
441
+
442
+ function showStep(n) {
443
+ var steps = document.querySelectorAll('.setup-step');
444
+ steps.forEach(function(s) { s.classList.remove('active'); });
445
+ var target = document.querySelector('.setup-step[data-step="' + n + '"]');
446
+ if (target) target.classList.add('active');
447
+
448
+ var dots = document.querySelectorAll('.step-dot');
449
+ dots.forEach(function(d) {
450
+ var s = parseInt(d.dataset.step, 10);
451
+ d.className = 'step-dot' + (s === n ? ' active' : (s < n ? ' done' : ''));
452
+ });
453
+
454
+ document.getElementById('btn-prev').style.display = n > 1 ? '' : 'none';
455
+ var nextBtn = document.getElementById('btn-next');
456
+ if (n === totalSteps) {
457
+ nextBtn.textContent = 'Save & Start';
458
+ buildSummary();
459
+ } else if (n === 1) {
460
+ nextBtn.textContent = 'Get Started';
461
+ } else {
462
+ nextBtn.textContent = 'Next';
463
+ }
464
+ }
465
+
466
+ function nextStep() {
467
+ if (currentStep === totalSteps) {
468
+ saveConfig();
469
+ return;
470
+ }
471
+ currentStep++;
472
+ showStep(currentStep);
473
+ }
474
+
475
+ function prevStep() {
476
+ if (currentStep > 1) {
477
+ currentStep--;
478
+ showStep(currentStep);
479
+ }
480
+ }
481
+
482
+ function toggleAxes(domain, enabled) {
483
+ var el = document.getElementById('axes-' + domain);
484
+ if (enabled) {
485
+ el.classList.remove('hidden');
486
+ } else {
487
+ el.classList.add('hidden');
488
+ }
489
+ }
490
+
491
+ function validateApiKey() {
492
+ var key = document.getElementById('api-key').value.trim();
493
+ if (!key) return;
494
+ var result = document.getElementById('api-key-result');
495
+ var btn = document.getElementById('validate-btn');
496
+ result.innerHTML = '<span class="api-key-status checking">Checking...</span>';
497
+ btn.disabled = true;
498
+
499
+ fetch('/api/setup/validate-key', {
500
+ method: 'POST',
501
+ headers: {'Content-Type': 'application/json'},
502
+ body: JSON.stringify({api_key: key})
503
+ })
504
+ .then(function(r) { return r.json(); })
505
+ .then(function(data) {
506
+ if (data.valid) {
507
+ result.innerHTML = '<span class="api-key-status valid">Valid</span>';
508
+ apiKeyValid = true;
509
+ } else {
510
+ result.innerHTML = '<span class="api-key-status invalid"></span>'; result.firstChild.textContent = 'Invalid: ' + (data.error || 'check your key'); // textContent avoids injecting server text as HTML
511
+ apiKeyValid = false;
512
+ }
513
+ btn.disabled = false;
514
+ })
515
+ .catch(function() {
516
+ result.innerHTML = '<span class="api-key-status invalid">Connection error</span>';
517
+ btn.disabled = false;
518
+ });
519
+ }
520
+
521
+ function getScheduleCron() {
522
+ var v = document.getElementById('schedule').value;
523
+ if (v === 'daily') return '0 0 * * *';
524
+ if (v === 'manual') return '';
525
+ return '0 22 * * 0';
526
+ }
527
+
528
+ function buildSummary() {
529
+ var aiml = document.getElementById('domain-aiml').checked;
530
+ var sec = document.getElementById('domain-security').checked;
531
+ var gh = document.getElementById('github-enabled').checked;
532
+ var sched = document.getElementById('schedule').value;
533
+ var hasKey = document.getElementById('api-key').value.trim().length > 0;
534
+
535
+ var html = '';
536
+ html += '<div class="summary-item"><span class="label">API Key</span><span class="value">' + (hasKey ? (apiKeyValid ? 'Validated' : 'Set (unvalidated)') : 'Not set') + '</span></div>';
537
+ html += '<div class="summary-item"><span class="label">AI/ML</span><span class="value">' + (aiml ? 'Enabled' : 'Disabled') + '</span></div>';
538
+ html += '<div class="summary-item"><span class="label">Security</span><span class="value">' + (sec ? 'Enabled' : 'Disabled') + '</span></div>';
539
+ html += '<div class="summary-item"><span class="label">GitHub</span><span class="value">' + (gh ? 'Enabled' : 'Disabled') + '</span></div>';
540
+ html += '<div class="summary-item"><span class="label">Schedule</span><span class="value">' + sched.charAt(0).toUpperCase() + sched.slice(1) + '</span></div>';
541
+ document.getElementById('setup-summary').innerHTML = html;
542
+ }
543
+
544
+ function saveConfig() {
545
+ var btn = document.getElementById('btn-next');
546
+ btn.disabled = true;
547
+ btn.textContent = 'Saving...';
548
+
549
+ var payload = {
550
+ api_key: document.getElementById('api-key').value.trim(),
551
+ domains: {
552
+ aiml: {
553
+ enabled: document.getElementById('domain-aiml').checked,
554
+ scoring_weights: [
555
+ (parseInt(document.getElementById('aiml-w1').value, 10) || 0) / 100,
556
+ (parseInt(document.getElementById('aiml-w2').value, 10) || 0) / 100,
557
+ (parseInt(document.getElementById('aiml-w3').value, 10) || 0) / 100
558
+ ]
559
+ },
560
+ security: {
561
+ enabled: document.getElementById('domain-security').checked,
562
+ scoring_weights: [
563
+ (parseInt(document.getElementById('sec-w1').value, 10) || 0) / 100,
564
+ (parseInt(document.getElementById('sec-w2').value, 10) || 0) / 100,
565
+ (parseInt(document.getElementById('sec-w3').value, 10) || 0) / 100
566
+ ]
567
+ }
568
+ },
569
+ github: {enabled: document.getElementById('github-enabled').checked},
570
+ schedule: getScheduleCron()
571
+ };
572
+
573
+ fetch('/api/setup/save', {
574
+ method: 'POST',
575
+ headers: {'Content-Type': 'application/json'},
576
+ body: JSON.stringify(payload)
577
+ })
578
+ .then(function(r) { return r.json(); })
579
+ .then(function(data) {
580
+ if (data.status === 'ok') {
581
+ window.location.href = '/';
582
+ } else {
583
+ btn.disabled = false;
584
+ btn.textContent = 'Save & Start';
585
+ alert('Error: ' + (data.error || 'Unknown error'));
586
+ }
587
+ })
588
+ .catch(function() {
589
+ btn.disabled = false;
590
+ btn.textContent = 'Save & Start';
591
+ alert('Connection error');
592
+ });
593
+ }
594
+ </script>
595
+ </body>
596
+ </html>
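The wizard above sends three scoring weights per domain as fractions and a cron string for the schedule, but nothing on the page checks that the weights add up to 100%. The server-side handling of `/api/setup/save` is not shown here, so the following is a sketch of how the payload could be normalized on receipt. The helper names `normalize_weights` and `is_known_schedule` are hypothetical. The cron strings are the ones returned by `getScheduleCron()`.

```python
from typing import Any, List, Sequence

# Cron strings produced by getScheduleCron() in the setup page.
KNOWN_SCHEDULES = {
    "weekly": "0 22 * * 0",  # Sunday 22:00
    "daily": "0 0 * * *",    # every day at midnight
    "manual": "",            # no automatic runs
}


def normalize_weights(weights: Sequence[Any]) -> List[float]:
    """Rescale weights so they sum to 1.

    Missing or invalid entries (None, NaN, negatives, non-numbers) count as 0.
    If every entry is 0, fall back to equal weights.
    """
    cleaned = []
    for w in weights:
        is_number = isinstance(w, (int, float)) and not isinstance(w, bool)
        if not is_number or w != w or w < 0:  # w != w is true only for NaN
            cleaned.append(0.0)
        else:
            cleaned.append(float(w))
    if not cleaned:
        return []
    total = sum(cleaned)
    if total == 0:
        return [1.0 / len(cleaned)] * len(cleaned)
    return [w / total for w in cleaned]


def is_known_schedule(cron: Any) -> bool:
    """Accept only the cron strings the wizard can produce."""
    return cron in KNOWN_SCHEDULES.values()
```

Normalizing instead of rejecting is one design choice among several: it keeps setup from failing on inputs such as 50/50/100, at the cost of silently changing what the user typed. Returning a validation error to the wizard would be an equally reasonable alternative.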
src/web/templates/weeks.html ADDED
@@ -0,0 +1,83 @@
1
+ {% extends "base.html" %}
2
+ {% block title %}Archive — Research Intelligence{% endblock %}
3
+ {% block content %}
4
+ <div class="page-header">
5
+ <h1>Weekly Archives</h1>
6
+ <div class="subtitle">Past weekly reports and pipeline runs</div>
7
+ </div>
8
+
9
+ {% if archives %}
10
+ <div class="section-header">
11
+ <h2>Reports</h2>
12
+ </div>
13
+ <table class="paper-table" style="margin-bottom:2.5rem">
14
+ <thead>
15
+ <tr>
16
+ <th>Week</th>
17
+ <th>Domain</th>
18
+ <th>File</th>
19
+ </tr>
20
+ </thead>
21
+ <tbody>
22
+ {% for a in archives %}
23
+ <tr>
24
+ <td style="font-family:var(--font-mono); font-size:0.82rem">{{ a.date }}</td>
25
+ <td>
26
+ {% if a.domain == 'aiml' %}
27
+ <span class="badge badge--accent">AI/ML</span>
28
+ {% elif a.domain == 'security' %}
29
+ <span class="badge badge--red">SECURITY</span>
30
+ {% else %}
31
+ <span class="badge" style="background:var(--bg-surface); color:var(--text-muted)">{{ a.domain | upper }}</span>
32
+ {% endif %}
33
+ </td>
34
+ <td><a href="/weeks/{{ a.filename }}">{{ a.filename }}</a></td>
35
+ </tr>
36
+ {% endfor %}
37
+ </tbody>
38
+ </table>
39
+ {% else %}
40
+ <div class="empty-state" style="padding:2rem">
41
+ <h2>No archives yet</h2>
42
+ <p>Weekly reports will appear here after pipeline runs.</p>
43
+ </div>
44
+ {% endif %}
45
+
46
+ {% if runs %}
47
+ <div class="section-header">
48
+ <h2>Recent Runs</h2>
49
+ </div>
50
+ <table class="paper-table">
51
+ <thead>
52
+ <tr>
53
+ <th>ID</th>
54
+ <th>Domain</th>
55
+ <th>Date Range</th>
56
+ <th>Papers</th>
57
+ <th>Status</th>
58
+ <th>Started</th>
59
+ </tr>
60
+ </thead>
61
+ <tbody>
62
+ {% for r in runs %}
63
+ <tr>
64
+ <td style="font-family:var(--font-mono); font-size:0.82rem">{{ r.id }}</td>
65
+ <td>
66
+ {% if r.domain == 'aiml' %}
67
+ <span class="badge badge--accent">AI/ML</span>
68
+ {% elif r.domain == 'security' %}
69
+ <span class="badge badge--red">SECURITY</span>
70
+ {% else %}
71
+ <span class="badge" style="background:var(--bg-surface); color:var(--text-muted)">{{ r.domain | upper }}</span>
72
+ {% endif %}
73
+ </td>
74
+ <td style="font-size:0.82rem">{{ r.date_start }} to {{ r.date_end }}</td>
75
+ <td style="font-family:var(--font-mono)">{{ r.paper_count }}</td>
76
+ <td class="status-{{ r.status }}">{{ r.status }}</td>
77
+ <td style="font-size:0.82rem; color:var(--text-muted)">{{ (r.started_at or '')[:16] }}</td>
78
+ </tr>
79
+ {% endfor %}
80
+ </tbody>
81
+ </table>
82
+ {% endif %}
83
+ {% endblock %}