gaurv007/alpha-factory

@@ -26,10 +26,12 @@ The pipeline runs:
 3. **Static lint** — validates BRAIN syntax (operator arity, look-ahead, parentheses)
 4. **Deduplication** — SHA256 hash to avoid duplicates
 5. **Store** — persists to DuckDB for review
-6. **Crowd Scout** — novelty assessment (LLM or heuristic)
-7. **Performance Surgeon** — diagnoses weak alphas, suggests mutations
-8. **Gatekeeper** — final go/no-go (LLM)
 9. **BRAIN submission** — live submission (requires `BRAIN_SESSION_TOKEN`)
 ## Quick Start
@@ -81,8 +83,8 @@ BRAIN uses **session-based authentication** (browser cookies), not API keys. The
 ## Architecture
 ```
-Theme Sampler → Expression Generation → Static Lint → Dedup → Store
-     ↓              (Templates or LLM)     ↓           ↓
 Crowd Scout → Performance Surgeon → Gatekeeper → BRAIN Submit
                    ↓ (iteration queue)
 Winner Memory ← Mutator ← Performance Surgeon
@@ -100,8 +102,8 @@ Winner Memory ← Mutator ← Performance Surgeon
 | `performance_surgeon.py` | Diagnose failures, suggest mutations | ✅ Working |
 | `gatekeeper.py` | Final go/no-go memo | ✅ Working |
 | `wq_client.py` | BRAIN API submission | ⚠️ Needs `BRAIN_SESSION_TOKEN` |
-| `brain_sim.py` | Local numpy backtest | ⚠️ Not wired to pipeline |
-| `regime_tagger.py` | Vol/trend/rate/style regimes | ⚠️ Not wired to pipeline |
 ## Key Features
@@ -113,16 +115,16 @@ Winner Memory ← Mutator ← Performance Surgeon
 - **Winner memory**: Tracks which field/archetype combinations work, feeds back to generation.
 - **Expression mutator**: Auto-generates decay, horizon, neutralization, and sign-flip variants.
 - **DuckDB store**: Persistent history of all alphas, metrics, and verdicts.
-- **Retry logic**: LLM client retries transient failures (429, 502, 503, 504, timeout) with exponential backoff.
 ## Known Limitations
 1. **BRAIN auth is session-based**: Token expires. No automatic refresh. You must re-copy from browser.
-2. **Local simulation is not wired**: `brain_sim.py` exists but is not integrated into the pipeline. It needs price data (yfinance) and produces approximate results.
-3. **Regime tagger not wired**: `regime_tagger.py` exists but is not used by the Performance Surgeon.
-4. **LLM generation can hallucinate fields**: Static lint catches most errors, but field names from LLMs may not exist on BRAIN.
-5. **Weights inside `rank()` are decorative**: `rank(0.6*a + 0.4*b)` is monotonic — coefficients don't linearly combine. The signal comes from which fields are combined.
-6. **Not a guarantee of profitable alphas**: This generates candidates. BRAIN's simulation is the ground truth.
 ## Configuration
@@ -154,7 +156,7 @@ python -m alpha_factory.run --proven --batch-size 10 --enable-brain
 ```
 alpha_factory/
 ├── config.py                  # All settings (Pydantic v2)
-├── run.py                     # Entry point
 ├── schemas/                   # Typed Pydantic contracts
 ├── deterministic/
 │   ├── lint.py                # Static pre-flight (Layer 2)
@@ -162,23 +164,25 @@ alpha_factory/
 │   ├── fitness.py             # Composite scoring
 │   ├── proven_templates.py    # Deterministic generation
 │   ├── expression_mutator.py  # Evolutionary variants
-│   └── regime_tagger.py       # Vol/trend/rate/style regimes (not wired)
 ├── infra/
 │   ├── model_manager.py       # Ollama + HF auto-detection
 │   ├── llm_client.py          # Unified LLM interface (token budget + retry)
-│   ├── factor_store.py        # DuckDB persistence
-│   ├── wq_client.py           # BRAIN API wrapper (session auth)
 │   └── winner_memory.py       # Feedback loop
 ├── local/
-│   └── brain_sim.py           # Local BRAIN simulator (not wired)
 ├── personas/
-│   ├── hypothesis_hunter.py   # Persona 1
-│   ├── expression_compiler.py # Persona 2
-│   ├── crowd_scout.py         # Persona 4
-│   ├── performance_surgeon.py # Persona 5
-│   └── gatekeeper.py          # Persona 6
 └── orchestration/
-    └── pipeline.py            # Full DAG
 ```
 ## Changelog v0.2.0
@@ -191,16 +195,29 @@ alpha_factory/
 - **Fixed**: Expression compiler sign logic — now per-component, no global blind negation
 - **Fixed**: LLM client stops error amplification (no more 3x API calls on auth/network failures)
 - **Fixed**: LLM client enforces token budget (was declared but never checked)
-- **Fixed**: LLM client adds retry logic with exponential backoff for transient failures (429, 502, 503, 504, timeout)
-- **Fixed**: Removed dead `enable_local_sim` config field and `--local-sim` CLI flag (local sim exists but is not wired)
 - **Fixed**: Removed orphan `rag.py` (arXiv retrieval not wired, will be re-added when integrated)
-- **Fixed**: Added missing `local/__init__.py` for proper package structure
-- **Fixed**: Added GitHub Actions CI workflow (`.github/workflows/ci.yml`)
 - **New**: Proven template mode (`--proven`) generates expressions without any LLM
 - **New**: Winner memory integration in pipeline (records winners/failures/iterations)
 - **New**: Expression mutator integration (auto-generates decay/horizon/group/sign variants)
 - **New**: Parallel batch processing with `max_parallel_candidates` semaphore
-- **New**: 32 comprehensive tests covering templates, lint, mutations, config, fitness, fields, groups
 - **Updated**: Honest README that accurately describes what works and what doesn't
 ## License
@@ -210,19 +227,7 @@ MIT — use at your own risk. This is not financial advice. BRAIN simulations ar
 <!-- ml-intern-provenance -->
 ## Generated by ML Intern
-This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
 - Try ML Intern: https://smolagents-ml-intern.hf.space
 - Source code: https://github.com/huggingface/ml-intern
-## Usage
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-model_id = 'gaurv007/alpha-factory'
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
-```
-For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.

 3. **Static lint** — validates BRAIN syntax (operator arity, look-ahead, parentheses)
 4. **Deduplication** — SHA256 hash to avoid duplicates
 5. **Store** — persists to DuckDB for review
+6. **Local sim** — lightweight numpy backtest as triage (lenient thresholds, never blocks)
+7. **Acceptance checklist** — 14-point pre-submission gate
+8. **Crowd Scout** — novelty assessment (LLM or heuristic)
 9. **BRAIN submission** — live submission (requires `BRAIN_SESSION_TOKEN`)
+10. **Performance Surgeon** — diagnoses weak alphas, suggests mutations
+11. **Gatekeeper** — final go/no-go (LLM)
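The deduplication step (step 4) can be illustrated with a minimal sketch. The README only states that a SHA256 hash is used; the whitespace normalization and the helper names `expression_key` and `deduplicate` below are assumptions made for illustration, not the repository's actual code.

```python
import hashlib


def expression_key(expression: str) -> str:
    """Return the SHA-256 hex digest of a normalized expression.

    Assumption: removing all whitespace is an acceptable normalization,
    so that `rank(a / b)` and `rank(a/b)` hash to the same key.
    """
    normalized = "".join(expression.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(expressions):
    """Keep the first occurrence of each expression, preserving order."""
    seen = set()
    unique = []
    for expr in expressions:
        key = expression_key(expr)
        if key not in seen:
            seen.add(key)
            unique.append(expr)
    return unique


candidates = ["rank(close / open)", "rank(close/open)", "rank(volume)"]
unique_candidates = deduplicate(candidates)
```

Hashing a normalized string keeps the lookup cheap, but it only catches textual duplicates; two differently written but mathematically equivalent expressions would still both pass.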
 ## Quick Start
 ## Architecture
 ```
+Theme Sampler → Expression Generation → Static Lint → Dedup → Store → Local Sim → Checklist
+     ↓              (Templates or LLM)     ↓           ↓             ↓             ↓
 Crowd Scout → Performance Surgeon → Gatekeeper → BRAIN Submit
                    ↓ (iteration queue)
 Winner Memory ← Mutator ← Performance Surgeon
 | `performance_surgeon.py` | Diagnose failures, suggest mutations | ✅ Working |
 | `gatekeeper.py` | Final go/no-go memo | ✅ Working |
 | `wq_client.py` | BRAIN API submission | ⚠️ Needs `BRAIN_SESSION_TOKEN` |
+| `brain_sim.py` | Local numpy backtest (triage, lenient) | ✅ Wired (never blocks) |
+| `regime_tagger.py` | Vol/trend/rate/style regimes | ✅ Wired via Performance Surgeon |
 ## Key Features
 - **Winner memory**: Tracks which field/archetype combinations work, feeds back to generation.
 - **Expression mutator**: Auto-generates decay, horizon, neutralization, and sign-flip variants.
 - **DuckDB store**: Persistent history of all alphas, metrics, and verdicts.
+- **Retry logic**: LLM client retries transient failures (429, 502, 503, 504, timeout) with exponential backoff. Non-retryable errors (401, 400, OOM) abort immediately.
+- **Unified pipeline**: Both proven and LLM paths flow through `_process_candidate()` — no code duplication.
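The retry behaviour described above can be sketched as follows. This is a hedged illustration, not the contents of `llm_client.py`: `HTTPStatusError`, `is_retryable`, `call_with_retry`, and the default delays are hypothetical names and values chosen for the example.

```python
import time

# Status codes the README lists as transient.
RETRYABLE_STATUS = {429, 502, 503, 504}


class HTTPStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(exc: Exception) -> bool:
    """Timeouts and the listed status codes are retried; everything else aborts."""
    if isinstance(exc, TimeoutError):
        return True
    return isinstance(exc, HTTPStatusError) and exc.status in RETRYABLE_STATUS


def call_with_retry(func, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying transient failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            # Non-retryable errors (e.g. 401, 400) and the final attempt re-raise.
            if not is_retryable(exc) or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...


# Demo: two transient failures followed by a success.
delays = []
responses = iter([HTTPStatusError(503), HTTPStatusError(429), "ok"])


def flaky_call():
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item


result = call_with_retry(flaky_call, sleep=delays.append)
```

Passing `sleep` as a parameter makes the backoff schedule observable in tests without actually waiting.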
 ## Known Limitations
 1. **BRAIN auth is session-based**: Token expires. No automatic refresh. You must re-copy from browser.
+2. **Local simulation is triage-only**: `brain_sim.py` runs with lenient thresholds (min_sharpe=0.3) and prints warnings but **never blocks** a candidate. It's for sanity checking, not filtering.
+3. **LLM generation can hallucinate fields**: Static lint catches most errors, but field names from LLMs may not exist on BRAIN.
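+4. **Weights inside `rank()` are largely decorative**: `rank()` is a monotonic transform of the weighted sum, so in `rank(0.6*a + 0.4*b)` only the ratio of the coefficients can affect the ordering, and the output is not a linear blend of `a` and `b`. The signal comes mainly from which fields are combined.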
+5. **Not a guarantee of profitable alphas**: This generates candidates. BRAIN's simulation is the ground truth.
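The point about weights inside `rank()` can be checked with a small pure-Python sketch. The `rank` function here is a simplified stand-in (it assumes no ties and scales ranks to [0, 1]); it is not BRAIN's operator, and the sample values are made up.

```python
def rank(values):
    """Cross-sectional rank scaled to [0, 1]; assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position / (len(values) - 1)
    return ranks


# Made-up field values for four instruments.
a = [0.10, 0.40, 0.25, 0.90]
b = [0.80, 0.20, 0.50, 0.30]

# Rank of the weighted sum.
blend = rank([0.6 * x + 0.4 * y for x, y in zip(a, b)])

# Scaling both weights by the same positive factor leaves the ranks unchanged.
rescaled = rank([6 * x + 4 * y for x, y in zip(a, b)])

# A linear blend of the individual ranks is a different quantity.
linear = [0.6 * ra + 0.4 * rb for ra, rb in zip(rank(a), rank(b))]

same_after_rescaling = blend == rescaled
differs_from_linear_blend = blend != linear
```

In this example the ranking is identical after rescaling the weights, while the weighted average of the individual ranks gives different numbers, which is the distinction the limitation describes.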
 ## Configuration
 ```
 alpha_factory/
 ├── config.py                  # All settings (Pydantic v2)
+├── run.py                     # Entry point (single asyncio.run)
 ├── schemas/                   # Typed Pydantic contracts
 ├── deterministic/
 │   ├── lint.py                # Static pre-flight (Layer 2)
 │   ├── fitness.py             # Composite scoring
 │   ├── proven_templates.py    # Deterministic generation
 │   ├── expression_mutator.py  # Evolutionary variants
+│   ├── acceptance_checklist.py # 14-point pre-submission gate
+│   ├── brain_sim.py           # Local numpy backtest (triage)
+│   └── regime_tagger.py       # IQR-based regime detection
 ├── infra/
 │   ├── model_manager.py       # Ollama + HF auto-detection
 │   ├── llm_client.py          # Unified LLM interface (token budget + retry)
+│   ├── factor_store.py        # DuckDB persistence (parameterized SQL)
+│   ├── wq_client.py           # BRAIN API wrapper (session auth, circuit breaker)
 │   └── winner_memory.py       # Feedback loop
 ├── local/
+│   └── brain_sim.py           # Identical copy of deterministic/brain_sim.py
 ├── personas/
+│   ├── hypothesis_hunter.py   # Persona 1 (LLM)
+│   ├── expression_compiler.py # Persona 2 (templates + LLM fallback)
+│   ├── crowd_scout.py         # Persona 4 (heuristic + LLM)
+│   ├── performance_surgeon.py # Persona 5 (heuristic + LLM)
+│   └── gatekeeper.py          # Persona 6 (LLM)
 └── orchestration/
+    └── pipeline.py            # Full DAG (unified _process_candidate)
 ```
 ## Changelog v0.2.0
 - **Fixed**: Expression compiler sign logic — now per-component, no global blind negation
 - **Fixed**: LLM client stops error amplification (no more 3x API calls on auth/network failures)
 - **Fixed**: LLM client enforces token budget (was declared but never checked)
+- **Fixed**: LLM client adds retry logic with exponential backoff for transient failures
+- **Fixed**: LLM client JSON parsing regex no longer strips all whitespace (was mangling responses)
+- **Fixed**: `pipeline.py` `NameError: max_corr` — correlation is now computed before checklist call
+- **Fixed**: `pipeline.py` `_submit_or_dryrun` reuses `self.brain` instead of creating new clients
+- **Fixed**: `run.py` uses single `asyncio.run()` — no more session leak
+- **Fixed**: `acceptance_checklist.py` RETURNS-CORR check no longer always fails (threshold raised from 0.05 to 0.95)
+- **Fixed**: `factor_store.py` uses DuckDB transaction context manager instead of string-based BEGIN/COMMIT
+- **Fixed**: `ui.py` SQL uses parameterized LIMIT instead of f-string injection
+- **Fixed**: `expression_compiler.py` `_validate_expression` is now called, issues logged
+- **Fixed**: `expression_mutator.py` regex now handles uppercase field IDs (e.g., `mdl77_2GlobalDev...`)
+- **Fixed**: `proven_templates.py` decay parameter is now passed through (was hardcoded to 5)
+- **Fixed**: `theme_sampler.py` `pick_theme()` has alive-theme fallback when all themes exhausted
+- **Fixed**: Removed dead `enable_local_sim` config field and `--local-sim` CLI flag
 - **Fixed**: Removed orphan `rag.py` (arXiv retrieval not wired, will be re-added when integrated)
+- **Fixed**: Added missing `local/__init__.py` and `orchestration/__init__.py`
+- **Fixed**: `pyproject.toml` version bumped to 0.2.0, removed unused `scipy` dependency
 - **New**: Proven template mode (`--proven`) generates expressions without any LLM
 - **New**: Winner memory integration in pipeline (records winners/failures/iterations)
 - **New**: Expression mutator integration (auto-generates decay/horizon/group/sign variants)
+- **New**: Acceptance checklist (14 checks, wired before BRAIN submission)
 - **New**: Parallel batch processing with `max_parallel_candidates` semaphore
+- **New**: 64+ comprehensive tests covering templates, lint, mutations, config, fitness, fields, groups
+- **New**: `_process_candidate()` unified path — both proven and LLM candidates flow through same pipeline
 - **Updated**: Honest README that accurately describes what works and what doesn't
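The parallel batch processing entry mentions a `max_parallel_candidates` semaphore. A minimal sketch of that pattern with `asyncio.Semaphore` is shown below; `process_batch`, `process_one`, and the simulated work are hypothetical and only illustrate the concurrency cap, not the real `_process_candidate()`.

```python
import asyncio


async def process_batch(candidates, max_parallel_candidates=3):
    """Process candidates concurrently, with at most N in flight at once."""
    semaphore = asyncio.Semaphore(max_parallel_candidates)
    in_flight = 0
    peak = 0

    async def process_one(candidate):
        nonlocal in_flight, peak
        async with semaphore:  # blocks while N candidates are already running
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for lint, sim, checklist, ...
            in_flight -= 1
            return candidate.upper()

    # gather() returns results in the same order as the inputs.
    results = await asyncio.gather(*(process_one(c) for c in candidates))
    return results, peak


names = [f"alpha_{i}" for i in range(8)]
results, peak = asyncio.run(process_batch(names, max_parallel_candidates=3))
```

The semaphore bounds concurrency without changing result order, so downstream steps can still match outputs to inputs by position.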
 ## License
 <!-- ml-intern-provenance -->
 ## Generated by ML Intern
+This repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
 - Try ML Intern: https://smolagents-ml-intern.hf.space
 - Source code: https://github.com/huggingface/ml-intern