Update README: use uv instead of pip

README.md (changed)
Removed:

````diff
@@ -1,7 +1,3 @@
----
-tags:
-- ml-intern
----
@@ -9,19 +5,41 @@
-pip install -e .
-python -m alpha_factory.run --dry-run --batch-size 5
-python -m alpha_factory.run --batch-size 10
@@ -34,23 +52,33 @@
-| # | Persona | Model | Job |
-|---|---------|-------|-----|
-| 1 | Hypothesis Hunter | 1.5B
-| 2 | Expression Compiler |
-| 4 | Crowd Scout |
-| 5 | Performance Surgeon |
-| 6 | Production Gatekeeper |
@@ -58,32 +86,39 @@
-## Setup
-1. Install
@@ -91,24 +126,5 @@
-<!-- ml-intern-provenance -->
-## Generated by ML Intern
-
-This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
-
-- Try ML Intern: https://smolagents-ml-intern.hf.space
-- Source code: https://github.com/huggingface/ml-intern
-
-## Usage
-
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-model_id = "gaurv007/alpha-factory"
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
-```
-
-For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.
````
# Alpha Factory – Open-Source LLM-Driven Pipeline for WorldQuant BRAIN

Autonomous alpha generation system using multi-LLM agents with 7-layer acceptance engineering.
## Quick Start

```bash
# Install uv (if not already installed)
# Windows:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone
git clone https://huggingface.co/gaurv007/alpha-factory
cd alpha-factory

# Install (uv handles everything – venv, deps, lockfile)
uv sync

# With optional RAG support
uv sync --extra rag

# With all optional deps
uv sync --extra all

# Start Ollama (local LLM server)
ollama pull qwen2.5:1.5b
ollama pull qwen2.5:7b
ollama serve

# Dry run (no BRAIN credits spent)
uv run python -m alpha_factory.run --dry-run --batch-size 5

# Interactive model selection
uv run python -m alpha_factory.run --interactive --dry-run

# With HuggingFace cloud models
uv run python -m alpha_factory.run --hf-token hf_your_token --batch-size 10

# Run tests
uv run pytest tests/ -v
```
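The `--dry-run` flag above means candidates are generated and triaged locally without anything being submitted to BRAIN. A minimal sketch of how such a flag can gate the submission step (the `Candidate` shape and `run_batch` function here are illustrative, not the project's actual API):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """A generated alpha expression awaiting simulation."""
    expression: str

def run_batch(candidates, dry_run=True):
    """Triage runs either way; submission happens only when dry_run is False."""
    submitted = []
    for cand in candidates:
        if not dry_run:
            submitted.append(cand.expression)  # stand-in for a real submission call
    return submitted

batch = [Candidate("rank(-ts_delta(close, 5))")]
assert run_batch(batch, dry_run=True) == []   # dry run: nothing submitted, no credits spent
```

The same pattern lets the pipeline's upstream layers run at full fidelity while the only credit-spending step is switched off.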
## Architecture

Theme Sampler → Hypothesis Hunter (Microfish) → Expression Compiler (Jinja/T…
## 6 LLM Personas

| # | Persona | Model Tier | Job |
|---|---------|------------|-----|
| 1 | Hypothesis Hunter | Microfish (1.5B) | Generate novel factor blueprints |
| 2 | Expression Compiler | Tinyfish (3B) / Jinja | Convert blueprint to BRAIN expression |
| 3 | Look-Ahead Sniffer | Deterministic | Static analysis for future leakage |
| 4 | Crowd Scout | Mediumfish (7B) | Novelty + correlation check |
| 5 | Performance Surgeon | Mediumfish (7B) | Diagnose failures, suggest fixes |
| 6 | Production Gatekeeper | Bigfish (14-72B) | Final go/no-go memo |
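Each persona's tier resolves to a concrete local model tag. A hedged sketch of one plausible routing table: the `qwen2.5:1.5b` and `qwen2.5:7b` tags come from the Quick Start's `ollama pull` commands, while the Tinyfish and Bigfish tags are assumptions, not the project's actual defaults:

```python
# Hypothetical tier -> Ollama model-tag routing table.
# 1.5b and 7b match the Quick Start pulls; 3b and 14b are assumed tags.
TIER_MODELS = {
    "microfish": "qwen2.5:1.5b",   # Persona 1: Hypothesis Hunter
    "tinyfish": "qwen2.5:3b",      # Persona 2: Expression Compiler
    "mediumfish": "qwen2.5:7b",    # Personas 4 and 5
    "bigfish": "qwen2.5:14b",      # Persona 6: Production Gatekeeper
}

def model_for(tier: str) -> str:
    """Resolve a persona's model tier to a concrete tag (case-insensitive)."""
    return TIER_MODELS[tier.lower()]

assert model_for("Microfish") == "qwen2.5:1.5b"
```

Persona 3 never appears in the table because it is deterministic code, not an LLM call.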
|
## Model Support

Automatically detects and uses:

- **Ollama (local)** – auto-detected at localhost:11434
- **HuggingFace Inference API (cloud)** – set the HF_TOKEN env var
- **vLLM (local/remote)** – any OpenAI-compatible endpoint

Use the `--interactive` flag to manually pick models for each tier from a dropdown.
## Key Features

- Zero recurring cost – all LLMs run locally via Ollama
- Schema-constrained generation – no hallucinated operators
- 7-layer acceptance engineering – saves 60%+ BRAIN credits
- Deterministic kill switches – circuit breakers for runaway pipelines
- Factor store – DuckDB persistence for all alpha history
- Dead theme registry – avoids re-exploring failed themes
- Local BRAIN simulator – triage alphas before spending credits
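"No hallucinated operators" means every function name in a generated expression must appear in the operator whitelist (loaded from `operators.csv` in this project). A toy sketch of such a lint pass, with a made-up four-operator whitelist standing in for the real list:

```python
import re

# Toy whitelist; the real set comes from operators.csv.
KNOWN_OPS = {"rank", "ts_delta", "ts_mean", "zscore"}

def hallucinated_ops(expression: str) -> set[str]:
    """Return operator names used in `expression` that are not whitelisted."""
    called = set(re.findall(r"([A-Za-z_]\w*)\s*\(", expression))
    return called - KNOWN_OPS

assert hallucinated_ops("rank(-ts_delta(close, 5))") == set()
assert hallucinated_ops("super_alpha(close)") == {"super_alpha"}
```

Rejecting such expressions before submission is one of the acceptance layers that keeps BRAIN credits from being spent on guaranteed failures.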
## File Structure

```
alpha_factory/
├── config.py                    # All settings (Pydantic)
├── run.py                       # Entry point
├── schemas/                     # Typed contracts
├── deterministic/
│   ├── lint.py                  # Static pre-flight (Layer 2)
│   ├── theme_sampler.py         # Gap analysis (Layer 1)
│   ├── fitness.py               # Composite scoring
│   ├── regime_tagger.py         # Vol/trend/rate/style regimes
│   └── acceptance_checklist.py  # 14-point checklist
├── infra/
│   ├── model_manager.py         # Ollama + HF auto-detection
│   ├── llm_client.py            # Unified LLM interface
│   ├── factor_store.py          # DuckDB persistence
│   ├── wq_client.py             # BRAIN API wrapper
│   └── rag.py                   # ChromaDB + arXiv
├── local/
│   └── brain_sim.py             # Local BRAIN simulator (Layer 4)
├── personas/
│   ├── hypothesis_hunter.py     # Persona 1
│   ├── expression_compiler.py   # Persona 2
│   ├── crowd_scout.py           # Persona 4
│   ├── performance_surgeon.py   # Persona 5
│   └── gatekeeper.py            # Persona 6
└── orchestration/
    └── pipeline.py              # Full DAG
```
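The factor store and dead-theme registry work together: every attempt is persisted, and a theme with too many recorded failures is skipped on later runs. A sketch of that idea using stdlib `sqlite3` as a stand-in for DuckDB (the schema, the `FactorStore` class, and the failure threshold are all illustrative, not `infra/factor_store.py`'s actual interface):

```python
import sqlite3

class FactorStore:
    """Persist alpha attempts; used to avoid re-exploring failed themes."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS alphas "
            "(expression TEXT PRIMARY KEY, theme TEXT, passed INTEGER)"
        )

    def record(self, expression: str, theme: str, passed: bool) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO alphas VALUES (?, ?, ?)",
            (expression, theme, int(passed)),
        )

    def theme_is_dead(self, theme: str, max_failures: int = 3) -> bool:
        """A theme counts as 'dead' once it accumulates too many failures."""
        (fails,) = self.db.execute(
            "SELECT COUNT(*) FROM alphas WHERE theme = ? AND passed = 0",
            (theme,),
        ).fetchone()
        return fails >= max_failures

store = FactorStore()
for i in range(3):
    store.record(f"expr_{i}", "momentum", passed=False)
assert store.theme_is_dead("momentum")  # three failures: stop sampling this theme
```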
## Setup

1. Install uv: https://docs.astral.sh/uv/getting-started/installation/
2. `uv sync`
3. Install Ollama: https://ollama.ai
4. Pull models: `ollama pull qwen2.5:1.5b && ollama pull qwen2.5:7b`
5. Place your `operators.csv` and `fields_USA_TOP3000_D1.csv` in `data/`
6. Run: `uv run python -m alpha_factory.run --dry-run --interactive`
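Step 5 can be verified with a quick preflight check before the first run; a small sketch (`missing_data_files` is a hypothetical helper, with the file names taken from the Setup list):

```python
from pathlib import Path
import tempfile

# File names from Setup step 5; the data/ layout is the assumed default.
REQUIRED = ("operators.csv", "fields_USA_TOP3000_D1.csv")

def missing_data_files(data_dir: str = "data") -> list[str]:
    """Return the required input files not yet present in data_dir."""
    root = Path(data_dir)
    return [name for name in REQUIRED if not (root / name).is_file()]

# In an empty directory, both files are reported missing:
with tempfile.TemporaryDirectory() as tmp:
    assert missing_data_files(tmp) == list(REQUIRED)
```

Failing fast here is cheaper than discovering a missing operator list halfway through a batch.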
|
## Cost

| Item | Cost |
|------|------|
| Local GPU (RTX 3090/4090) | $0 (already owned) |
| BRAIN account | $0 (existing) |
| uv + Ollama + all deps | $0 |
| Monthly running cost | **$0** |