Enhance README layout and metadata
README.md
@@ -1,57 +1,92 @@
# 🧹 OpenEnv: Data Clean Environment
### A Real-World Benchmark for Agentic Data Engineering

[OpenEnv](https://github.com/meta-pytorch/OpenEnv)
[Hugging Face Space](https://huggingface.co/spaces/anugrah55/data_clean_env)
[License](LICENSE)

---

## Overview
**Data Clean Env** is a production-grade [OpenEnv](https://github.com/meta-pytorch/OpenEnv) implementation designed to train and evaluate Reinforcement Learning (RL) agents on the messy reality of **data cleaning**.

Unlike "toy" environments, this project simulates the workflow of a data engineer: identifying schema inconsistencies, handling missing values, casting types, and pruning noise from real-world datasets with `pandas`.

---

## 🛠️ Environment Architecture

### Action Space
The agent interacts with the environment through atomic, high-level data operations defined in `models.py`:

| Action | Parameters | Description |
| :--- | :--- | :--- |
| `fill_na` | `column_name`, `value` | Replaces missing values with a specific constant. |
| `drop_na` | `column_name` | Removes rows containing missing data in the target column. |
| `drop_column` | `column_name` | Deletes irrelevant or noisy features from the dataset. |
| `rename_column` | `column_name`, `value` | Fixes naming inconsistencies to match target schemas. |
| `change_type` | `column_name`, `value` | Casts columns to `int`, `float`, or `str` for downstream compatibility. |
| `submit` | - | Finalizes the cleaning process and triggers the programmatic grader. |

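The table above can be read as a small dispatch over `pandas` operations. The following is a minimal sketch of that idea, not the code in `models.py`: the function name `apply_action`, the `CASTS` mapping, and the choice to raise on unknown columns are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical mapping from the type names in the table to Python types.
CASTS = {"int": int, "float": float, "str": str}


def apply_action(df, action_type, column_name=None, value=None):
    """Return a new DataFrame with one cleaning action applied (sketch only)."""
    if action_type == "submit":
        return df  # nothing to change; grading would happen elsewhere
    if column_name not in df.columns:
        raise KeyError(f"unknown column: {column_name!r}")
    if action_type == "fill_na":
        return df.assign(**{column_name: df[column_name].fillna(value)})
    if action_type == "drop_na":
        return df.dropna(subset=[column_name])
    if action_type == "drop_column":
        return df.drop(columns=[column_name])
    if action_type == "rename_column":
        return df.rename(columns={column_name: value})
    if action_type == "change_type":
        return df.astype({column_name: CASTS[value]})
    raise ValueError(f"unknown action: {action_type!r}")


# Toy data, invented for this example.
df = pd.DataFrame({"age": [31.0, None, 45.0], "junk": ["x", "y", "z"]})
df = apply_action(df, "fill_na", "age", 0)
df = apply_action(df, "change_type", "age", "int")
df = apply_action(df, "drop_column", "junk")
df = apply_action(df, "rename_column", "age", "age_years")
```

Each call returns a new frame instead of mutating the input, which keeps a single step easy to undo or compare; the real environment may handle state differently.
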
### Observation Space
The agent perceives the state of the data through four observation fields:
- **`df_schema`**: Dictionary mapping each column to its current data type.
- **`missing_values`**: Current count of `NaN` values per column.
- **`head`**: A preview of the first 5 rows, useful for spotting formatting patterns.
- **`feedback`**: A description of the effect of the last action.

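As a rough illustration, these fields could be derived from a `pandas` DataFrame as follows. This is a sketch under assumptions: the helper name `build_observation` and the use of a plain dictionary are hypothetical, and the project's typed `Observation` model may be structured differently.

```python
import pandas as pd


def build_observation(df, feedback=""):
    """Summarize a DataFrame into the four observation fields (sketch only)."""
    return {
        # Column name -> dtype name, e.g. "float64".
        "df_schema": {col: str(dtype) for col, dtype in df.dtypes.items()},
        # Column name -> number of missing entries.
        "missing_values": {col: int(n) for col, n in df.isna().sum().items()},
        # First five rows as a list of row dictionaries.
        "head": df.head(5).to_dict(orient="records"),
        "feedback": feedback,
    }


# Toy data, invented for this example.
df = pd.DataFrame({"age": [31.0, None, 45.0], "name": ["Ann", "Bob", None]})
obs = build_observation(df, feedback="environment reset")
```
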
---

## Task Progression & Grading

Each task is evaluated by a **deterministic programmatic grader** that compares the agent's output against a "Gold Standard" target, producing a score strictly between **0.0 and 1.0**.

1. **🟢 Easy (`easy_clean`)**:
   - **Goal**: Basic imputation.
   - **Challenge**: Fill missing 'age' values.
2. **🟡 Medium (`medium_clean`)**:
   - **Goal**: Noise reduction.
   - **Challenge**: Handle missing values across multiple columns and remove "junk" features.
3. **🔴 Hard (`hard_clean`)**:
   - **Goal**: Full schema alignment.
   - **Challenge**: Rename columns, perform safe type casting on dirty strings, and handle complex missing value fallbacks.

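The README does not spell out the grading rule, so the following is only one plausible shape for a deterministic grader. It assumes the score is the share of columns that match the target schema with no missing values, with extra columns counted against the agent, and it clamps the result so it never reaches exactly 0.0 or 1.0. The function name `grade` and the dictionary inputs are hypothetical.

```python
def grade(result_schema, result_missing, target_schema):
    """Score a cleaned dataset against a target schema (illustrative rule only)."""
    if not target_schema:
        raise ValueError("target schema must not be empty")
    # A column counts as matched if it exists, has the target dtype,
    # and contains no missing values.
    matched = sum(
        1
        for col, dtype in target_schema.items()
        if result_schema.get(col) == dtype and result_missing.get(col, 0) == 0
    )
    # Leftover columns enlarge the denominator, so "junk" lowers the score.
    total = len(set(target_schema) | set(result_schema))
    raw = matched / total
    # Clamp into [0.01, 0.99] so the score stays strictly inside (0.0, 1.0).
    return min(0.99, max(0.01, raw))


target = {"age": "int64", "name": "object"}
perfect = grade({"age": "int64", "name": "object"}, {"age": 0, "name": 0}, target)
with_junk = grade(
    {"age": "int64", "name": "object", "junk": "object"},
    {"age": 0, "name": 0, "junk": 0},
    target,
)
untouched = grade({"age": "float64", "name": "object"}, {"age": 1, "name": 2}, target)
```

Because the function depends only on its inputs, repeated runs on the same output always give the same score, which is the property the grader description relies on.
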
---

## Quick Start

### 🐳 Run with Docker
```bash
# Build the production image
docker build -t openenv_data_clean:latest -f server/Dockerfile .

# Start the environment server
docker run -p 8000:8000 openenv_data_clean:latest
```

### 🧪 Baseline Inference
We provide a deterministic, zero-temperature baseline script using the OpenAI client:
```bash
export HF_TOKEN="your_huggingface_token"
export IMAGE_NAME="openenv_data_clean:latest"
python inference.py
```

---

## ⚖️ Reward Shaping
Our reward function is designed for efficient RL convergence:
- **Incremental Progress**: `+0.1` for every valid schema improvement.
- **Penalization**: `-0.05` for invalid operations (e.g., targeting non-existent columns).
- **Completion Bonus**: A final reward that scales with the total grader score, which lies in the range `[0.01, 0.99]`.

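The three rules above can be sketched as a single per-step function. The constants come from the list; everything else is an assumption for illustration, in particular the name `step_reward`, the zero reward for a valid step that improves nothing, and the choice to make the completion bonus equal to the grader score.

```python
STEP_BONUS = 0.1         # valid action that improves the schema
INVALID_PENALTY = -0.05  # invalid operation, e.g. a non-existent column


def step_reward(valid, improved, submitted=False, grader_score=0.0):
    """Reward for one environment step (sketch; bonus scaling is assumed)."""
    if not valid:
        return INVALID_PENALTY
    if submitted:
        # Assumption: the completion bonus is the grader score itself.
        return grader_score
    return STEP_BONUS if improved else 0.0


# A short hypothetical episode: one good step, one invalid step, then submit.
episode = [
    step_reward(valid=True, improved=True),
    step_reward(valid=False, improved=False),
    step_reward(valid=True, improved=False, submitted=True, grader_score=0.99),
]
total = sum(episode)  # roughly 1.04
```
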
---

## 🎯 Meta Hackathon Compliance
- ✅ **Typed Models**: Fully Pydantic-powered `Observation` and `Action`.
- ✅ **API Standard**: Implements `step()`, `reset()`, and `state()`.
- ✅ **Strict Logs**: Emits `[START]`, `[STEP]`, and `[END]` traces exactly as required.
- ✅ **Robustness**: Handles network timeouts and invalid JSON gracefully.

---
Built with ❤️ for the Meta & Hugging Face OpenEnv Hackathon.