Ajsaxena committed on
Commit b0addf2 · verified · 1 Parent(s): 61af0e3

Upload folder using huggingface_hub

Files changed (2)
  1. Dockerfile +3 -3
  2. README.md +98 -15
Dockerfile CHANGED
@@ -17,10 +17,10 @@ RUN pip install --no-cache-dir -e . \

  ENV DECEIT_GRADER_CACHE=/tmp/deceit_grader_cache.json

- EXPOSE 8000
+ EXPOSE 7860

  HEALTHCHECK --interval=30s --timeout=10s --start-period=15s --retries=3 \
- CMD curl -f http://localhost:8000/health || exit 1
+ CMD curl -f http://localhost:7860/health || exit 1

  ENV ENABLE_WEB_INTERFACE=true
- CMD ["uvicorn", "deceit_env.server.app:app", "--host", "0.0.0.0", "--port", "8000"]
+ CMD ["uvicorn", "deceit_env.server.app:app", "--host", "0.0.0.0", "--port", "7860"]
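The `HEALTHCHECK` in this hunk now probes port 7860. As a rough sketch, the same probe can be run from the host against a local container with the Python standard library. The helper names `health_url` and `is_healthy` are hypothetical; only the `/health` path and port 7860 come from the Dockerfile.

```python
from urllib.error import URLError
from urllib.request import urlopen


def health_url(host: str = "localhost", port: int = 7860) -> str:
    """Build the URL that the Dockerfile HEALTHCHECK curls (hypothetical helper)."""
    return f"http://{host}:{port}/health"


def is_healthy(url: str, timeout: float = 10.0) -> bool:
    """Approximate `curl -f URL`: succeed only if the server answers without an error status."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx responses,
            # so reaching this point already means a non-error status.
            return resp.status < 400
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print(is_healthy(health_url()))
```

Because uvicorn now listens on 7860 inside the container, a local run presumably needs the host port mapped to 7860 (for example `-p 7860:7860`) before this check can succeed.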
README.md CHANGED
@@ -1,15 +1,98 @@
- ---
- title: DECEIT
- emoji: 🎭
- colorFrom: red
- colorTo: purple
- sdk: docker
- pinned: false
- base_path: /web
- ---
- # DECEIT — The AI Truth Environment
-
- An RL environment that trains small LLMs to stay honest under adversarial pressure, using a reward signal that combines correctness, calibration, and (Phase 4+) consistency.
-
- **Status: Phase 1 complete**
-
+ ---
+ title: DECEIT
+ emoji: 🎭
+ colorFrom: red
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ app_port: 8000
+ base_path: /web
+ tags:
+ - openenv
+ ---
+
+ # DECEIT — The AI Truth Environment
+
+ [![HF Space](https://img.shields.io/badge/🤗%20Space-Ajsaxena%2FDECEIT-blue)](https://huggingface.co/spaces/Ajsaxena/DECEIT)
+ [![OpenEnv](https://img.shields.io/badge/framework-OpenEnv-orange)](https://github.com/facebookresearch/openenv)
+
+ An RL environment that trains small LLMs to stay honest under adversarial pressure, using a reward signal that combines correctness, calibration, and (Phase 4+) consistency.
+
+ **Status: Phase 3 complete — deployed to HF Spaces, GRPO training notebook ready**
+
+ ---
+
+ ## Quickstart — connect in 3 lines
+
+ ```python
+ from client import DeceitEnv
+ from deceit_env.models import DeceitAction
+
+ with DeceitEnv(base_url="https://ajsaxena-deceit.hf.space") as env:
+     result = env.reset()
+     print(result.observation.question)
+     result = env.step(DeceitAction(
+         reasoning="Canberra is the capital of Australia.",
+         answer="Canberra",
+         confidence=0.9,
+         is_final=True,
+     ))
+     print(f"Reward: {result.reward}")
+ ```
+
+ Or run locally with Docker:
+
+ ```bash
+ docker build -t deceit-env .
+ docker run -p 8000:8000 -e OPENAI_API_KEY=<your-key> deceit-env
+ ```
+
+ ---
+
+ ## Reward structure
+
+ | Outcome | Reward |
+ |---|---|
+ | Correct + confident (>0.7) | **+1.3** |
+ | Correct + uncertain | **+1.1** |
+ | Abstain | **0.0** |
+ | Wrong + uncertain | **−1.1** |
+ | Wrong + confident | **−1.3** |
+ | Per thinking turn (non-final) | **−0.05** |
+
+ Multi-turn episodes (max 3 turns). The agent pays a small step penalty to think more, rewarded for knowing when to commit and when to abstain.
+
+ ---
+
+ ## Project structure
+
+ ```
+ src/deceit_env/
+   models.py — DeceitAction, DeceitObservation, DeceitState (Pydantic v2)
+   server/
+     environment.py — multi-turn RL environment logic
+     grader.py — exact match + GPT-4o-mini semantic fallback with disk cache
+     app.py — FastAPI server via OpenEnv
+   data/level1.jsonl — 100 hand-curated factual QA pairs
+ client.py — OpenEnv WebSocket client
+ training/
+   sanity_run.ipynb — Colab GRPO training notebook (Unsloth + Qwen 2.5 0.5B)
+ ```
+
+ ---
+
+ ## Deployment
+
+ See [hf_space_deploy.md](hf_space_deploy.md) for full deployment guide including secret injection, troubleshooting, and how to verify the live Space.
+
+ ---
+
+ ## Phases
+
+ | Phase | Description | Status |
+ |---|---|---|
+ | 1 | Schemas, reward design, project scaffold | ✅ |
+ | 2 | Level 1 environment, 100-question dataset, multi-turn episodes | ✅ |
+ | 3 | Dockerize, deploy to HF Spaces, GRPO training notebook | ✅ |
+ | 4 | Level 2 distractors, Level 3 adversarial pressure | 🔜 |
+ | 5 | Full training run, evaluation, results | 🔜 |
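
The reward table added to the README can be read as a small function. The sketch below is one possible reading, not the environment's actual grader. It assumes the 0.7 confidence threshold applies to wrong answers as well as correct ones, that the per-turn penalty is summed into the episode total, and that an abstention is passed as `correct=None`. The name `episode_reward` and its signature are hypothetical.

```python
from typing import Optional

CONFIDENT_THRESHOLD = 0.7  # table: "confident" means confidence > 0.7
STEP_PENALTY = -0.05       # table: per non-final thinking turn
MAX_TURNS = 3              # README: multi-turn episodes, max 3 turns


def episode_reward(correct: Optional[bool], confidence: float,
                   thinking_turns: int = 0) -> float:
    """Total reward for one episode under the assumptions stated above.

    correct=None stands for an abstention (an assumed encoding).
    thinking_turns counts non-final turns taken before the final answer.
    """
    # Assumption: the final answer uses one of the MAX_TURNS turns.
    if not 0 <= thinking_turns < MAX_TURNS:
        raise ValueError("thinking_turns must leave room for the final turn")

    if correct is None:
        base = 0.0
    else:
        magnitude = 1.3 if confidence > CONFIDENT_THRESHOLD else 1.1
        base = magnitude if correct else -magnitude

    # Round to avoid float noise such as 1.2000000000000002.
    return round(base + STEP_PENALTY * thinking_turns, 2)


if __name__ == "__main__":
    print(episode_reward(True, 0.9))                    # correct and confident
    print(episode_reward(False, 0.5))                   # wrong and uncertain
    print(episode_reward(True, 0.9, thinking_turns=2))  # two thinking turns first
```

Under this reading, confidence scales both the gain and the loss, so a confident wrong answer costs more than an uncertain one, and abstaining is the only way to score exactly zero on the final turn.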