# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

```bash
# Run the server locally (port 7860)
python3 server.py

# Run tests
python3 tests/test_env.py

# Validate OpenEnv compliance
openenv validate

# Run pre-submission validation (requires Docker Desktop running)
/tmp/validate-submission.sh https://mihir1107-dateselectenv.hf.space .
```

## Architecture

This is an OpenEnv-compliant RL environment simulating active learning / data curation under budget constraints.

**Entry point:** `server.py` — FastAPI app exposing all endpoints. Run directly with `python3 server.py`. The `server/` package (`server/__init__.py` + `server/app.py`) is a shim that exists solely for `openenv validate` compliance (`[project.scripts]` entry point); it loads `server.py` via `importlib` to avoid the naming conflict between `server.py` and `server/`.

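The importlib trick behind that shim can be sketched as follows. This is an illustrative stand-in, not the repo's actual shim: `load_module_from_path` and the temp-file demo are invented here to show the mechanism (in the real layout, `server/app.py` would load the sibling `server.py` and re-export its `app`).

```python
# Sketch: load a module from an explicit file path so a same-named
# package cannot shadow it. Names here are illustrative.
import importlib.util
import tempfile
from pathlib import Path

def load_module_from_path(name, path):
    """Load `path` as a module named `name`, bypassing normal package lookup."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Demo: a stand-in for the root server.py that a server/ package would shadow.
with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "server.py"
    target.write_text("app = 'fastapi-app-object'\n")
    mod = load_module_from_path("_root_server", target)

print(mod.app)  # the shim would re-export this as `app`
```
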
**Core environment:** `env.py`
- `DataSelectEnv.reset()` — generates the dataset via `make_classification`, injects noise (guaranteed label flip via `1 - y[mask]`), and warms up an `SGDClassifier` for 10 passes on 100 seed samples
- `DataSelectEnv.step(action)` — selects a batch using a weighted combination of uncertainty/diversity/random sampling, incrementally trains the model via `partial_fit`, and computes a shaped reward
- Data split: `X[:100]` seed | `X[200:400]` validation | `X[400:]` pool (~1100 samples)
- Performance metric: `1 / (1 + log_loss)` on the validation set

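The performance metric maps log-loss from `[0, ∞)` onto a score in `(0, 1]`, higher is better. A worked example, with log-loss computed by hand (mean negative log-probability of the true class) so the snippet stands alone; the probabilities are illustrative:

```python
# Worked example of the metric above: perf = 1 / (1 + log_loss).
import math

y_true = [0, 1, 1, 0]
p_class1 = [0.1, 0.8, 0.7, 0.2]          # model's P(y=1) for each sample

# probability assigned to the *true* label of each sample
p_true = [p if y == 1 else 1.0 - p for y, p in zip(y_true, p_class1)]
loss = -sum(math.log(p) for p in p_true) / len(p_true)

perf = 1.0 / (1.0 + loss)                # zero loss -> 1.0; large loss -> near 0
print(round(perf, 3))
```
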
**Reward design** (`env.py` `step()`):
- `gain = perf_delta * 5.0 + mean_distance_from_centroid * 0.05`
- `noise_penalty = (1.0 + flip_y * 2.0) * noise_ratio` — scales with task difficulty
- `reward = gain - 0.01 * batch_size - 0.3 * redundancy - noise_penalty + 0.15`
- Noise trap: the entropy of noisy pool samples is boosted by `min(0.1 + flip_y * 2.0, 0.55)`, so uncertainty sampling is attracted to noisy samples (this hurts uncertainty-only strategies on medium/hard)

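The bullets above can be combined into one function. A hedged sketch, not the repo's exact code: the parameter names mirror the formulas, and the example values are invented for illustration.

```python
# Sketch of the shaped reward described above; signature is illustrative.
def shaped_reward(perf_delta, mean_dist_from_centroid, batch_size,
                  redundancy, noise_ratio, flip_y):
    gain = perf_delta * 5.0 + mean_dist_from_centroid * 0.05
    noise_penalty = (1.0 + flip_y * 2.0) * noise_ratio
    return gain - 0.01 * batch_size - 0.3 * redundancy - noise_penalty + 0.15

# Example: small performance gain, 10-sample batch, some redundancy and noise.
print(round(shaped_reward(0.02, 1.5, 10, 0.2, 0.1, 0.25), 4))
```
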
**Sampling:** `sampling.py` — `sample_uncertainty`, `sample_diversity`, `sample_random`. Weights are normalized internally. `min_batch=5` is enforced unless the action is `stop`.

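One way the weight normalization and proportional batch split can work, sketched with a toy helper (`split_batch` is invented for illustration; the real functions in `sampling.py` score a model's pool):

```python
# Sketch: normalize strategy weights, then allocate batch slots proportionally.
def split_batch(batch_size, weights):
    """Normalize `weights` and allocate batch_size slots proportionally."""
    total = sum(weights.values())
    norm = {k: w / total for k, w in weights.items()}
    counts = {k: int(batch_size * w) for k, w in norm.items()}
    # hand leftover slots (from flooring) to the heaviest strategy
    leftover = batch_size - sum(counts.values())
    counts[max(norm, key=norm.get)] += leftover
    return counts

counts = split_batch(20, {"uncertainty": 2.0, "diversity": 1.0, "random": 1.0})
print(counts)
```
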
**Tasks** (defined in `server.py` TASKS dict):
- `easy`: `flip_y=0.05`, budget=300, max_steps=15, grader range [0.55, 0.75]
- `medium`: `flip_y=0.25`, budget=150, max_steps=12, grader: `0.6*perf + 0.4*noise_score`
- `hard`: `flip_y=0.30`, budget=100, max_steps=8, grader: `0.65*perf + 0.35*efficiency`

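The task table above might look roughly like this as a dict. The grader callables are stand-ins: the weighted formulas follow the bullets, and the easy grader is shown as a clamp into the stated range, which is an assumption; the real dict lives in `server.py`.

```python
# Illustrative sketch of the TASKS dict; graders are invented stand-ins.
TASKS = {
    "easy":   {"flip_y": 0.05, "budget": 300, "max_steps": 15,
               "grader": lambda m: min(max(m["perf"], 0.55), 0.75)},
    "medium": {"flip_y": 0.25, "budget": 150, "max_steps": 12,
               "grader": lambda m: 0.6 * m["perf"] + 0.4 * m["noise_score"]},
    "hard":   {"flip_y": 0.30, "budget": 100, "max_steps": 8,
               "grader": lambda m: 0.65 * m["perf"] + 0.35 * m["efficiency"]},
}

print(round(TASKS["medium"]["grader"]({"perf": 0.5, "noise_score": 0.8}), 2))
```
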
**Episode persistence:** `/grader` works after a subsequent `/reset` because completed episodes are stored in a `_completed` dict keyed by `episode_id`. The `/ws` WebSocket endpoint has fully isolated per-connection state (it does not share `store`).

**Deployment:**
- Dockerfile: `CMD ["python", "server.py"]` — uses the root `server.py` directly
- HF Space: `https://huggingface.co/spaces/Mihir1107/DateSelectEnv`
- GitHub: `https://github.com/Mihir1107/DataSelectEnv`
- `uvicorn.run(app, ...)` passes the app object directly (not the `"server:app"` string) so the `server/` package cannot shadow `server.py` at import time