---
title: CGAE Server
emoji: πŸ”
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# CGAE: Comprehension-Gated Agent Economy

**A robustness-first architecture where AI agents earn economic permissions through verified comprehension, not capability benchmarks.**

Built for [ETH OpenAgents Hackathon](https://ethglobal.com/events/openagents) · [arXiv Paper](https://arxiv.org/abs/2603.15639)

---

## What it does

CGAE is a protocol where AI agents must prove they are **robust**, not just capable, before they can participate in an on-chain economy. Each agent's economic permissions are upper-bounded by verified scores across three orthogonal dimensions:

| Dimension | Framework | What it measures |
|-----------|-----------|-----------------|
| **CC** (Constraint Compliance) | [CDCT](https://arxiv.org/abs/2512.17920) | Can the agent follow precise instructions under compression? |
| **ER** (Epistemic Robustness) | [DDFT](https://arxiv.org/abs/2512.23850) | Does the agent resist fabricated authority claims? |
| **AS** (Behavioral Alignment) | AGT | Does the agent maintain ethical boundaries under pressure? |

A **weakest-link gate function** (`min(CC, ER, AS)`) assigns agents to tiers T0–T5. No dimension can compensate for another: an agent with perfect CC but zero ER is stuck at T0.
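
The weakest-link rule can be illustrated with a minimal Python sketch. The tier thresholds below are hypothetical placeholders chosen only for illustration (the real cut-offs are not stated here), and `gate_tier` is an assumed name, not necessarily the signature used in `cgae_engine/gate.py`.

```python
from bisect import bisect_right

# Hypothetical lower bounds for tiers T1..T5 (assumption, not the protocol's values).
TIER_THRESHOLDS = [0.2, 0.4, 0.6, 0.8, 0.9]


def gate_tier(cc: float, er: float, as_: float) -> int:
    """Return the tier index (0..5) implied by the weakest of the three scores."""
    weakest = min(cc, er, as_)
    # bisect_right counts how many thresholds the weakest score meets or exceeds.
    return bisect_right(TIER_THRESHOLDS, weakest)


if __name__ == "__main__":
    # Perfect CC and AS cannot compensate for zero ER.
    print(f"T{gate_tier(1.0, 0.0, 1.0)}")
    print(f"T{gate_tier(0.95, 0.92, 0.91)}")
```

Because only the minimum is looked up, raising a strong dimension never changes the tier; only improving the weakest score does.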

## Architecture

```
Agent registers
  → ETH wallet created (unique keypair)
  → ENS subname created on Sepolia (e.g., gpt-5-4.cgaeprotocol.eth)
  → CDCT/DDFT/AGT scores fetched → robustness vector computed
  → Audit certificate JSON → uploaded to 0G Storage → Merkle root hash
  → CGAERegistry.certify() on 0G Chain (scores + root hash on-chain)
  → ENS text records updated (tier + scores + wallet)
  → Agent accepts contract → ENS tier resolved and verified → assigned
  → Task executed by LLM → verified (algorithmic + jury)
  → ETH disbursed from treasury to agent wallet on 0G Chain
```
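
The certificate step in the flow above (audit JSON reduced to a single root hash) can be sketched as follows. This is an illustration only: it uses SHA-256 from the standard library and a simple binary Merkle tree as stand-ins, whereas the 0G Storage SDK computes its own root with its own chunking and hashing. The function names, the chunk size, and the certificate fields are all hypothetical.

```python
import hashlib
import json


def merkle_root(chunks: list[bytes]) -> bytes:
    """Binary Merkle root over chunks (SHA-256 stand-in; odd levels duplicate the last node)."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [hashlib.sha256(c).digest() for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def certificate_root(cert: dict, chunk_size: int = 256) -> str:
    """Serialize a certificate canonically and return a hex root hash."""
    # Sorted keys and fixed separators make the bytes independent of dict ordering.
    data = json.dumps(cert, sort_keys=True, separators=(",", ":")).encode("utf-8")
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return "0x" + merkle_root(chunks).hex()


if __name__ == "__main__":
    # Hypothetical certificate fields, for illustration only.
    cert = {"agent": "gpt-5-4.cgaeprotocol.eth", "cc": 0.91, "er": 0.88, "as": 0.93}
    print(certificate_root(cert))
```

Storing only the root on-chain keeps the registry entry small while still letting anyone who downloads the full JSON check that it matches what was certified.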

## Contestant Models (11)

| Model | Provider | Family |
|-------|----------|--------|
| gpt-5.4 | Azure OpenAI | OpenAI |
| DeepSeek-V3.2 | Azure AI Foundry | DeepSeek |
| Mistral-Large-3 | Azure AI Foundry | Mistral |
| grok-4-20-reasoning | Azure AI Foundry | xAI |
| Phi-4 | Azure AI Foundry | Microsoft |
| Llama-4-Maverick-17B-128E | Azure AI Foundry | Meta |
| Kimi-K2.5 | Azure AI Foundry | Moonshot |
| gemma-4-27b-it | Modal (self-hosted) | Google |
| nova-pro | AWS Bedrock | Amazon |
| claude-sonnet-4.6 | AWS Bedrock | Anthropic |
| MiniMax-M2.5 | AWS Bedrock | MiniMax |

## Jury Models (3, zero family overlap)

| Model | Provider | Family |
|-------|----------|--------|
| Qwen3-32B | AWS Bedrock | Alibaba |
| GLM-5 | AWS Bedrock | Zhipu |
| Nemotron-Super-3-120B | AWS Bedrock | NVIDIA |

---

## 0G Integration

| Layer | What | How |
|-------|------|-----|
| **On-chain registry** | Agent identity, robustness certification, tier assignment, escrow | `CGAERegistry.sol` + `CGAEEscrow.sol` on 0G Chain |
| **Decentralized storage** | Immutable audit certificate JSON | 0G TypeScript SDK; Merkle root hash stored on-chain |

**Deployed contracts (0G Galileo testnet):**

| Contract | Address |
|----------|---------|
| CGAERegistry | [`0xc4Ff2BC9855483eE3806eE08112cdC30dBf6b27A`](https://chainscan-galileo.0g.ai/address/0xc4Ff2BC9855483eE3806eE08112cdC30dBf6b27A) |
| CGAEEscrow | [`0xA236106DE28FE9480509e06d1750dcfA4474bcfB`](https://chainscan-galileo.0g.ai/address/0xA236106DE28FE9480509e06d1750dcfA4474bcfB) |

## ENS Integration

ENS is the identity and access control layer, not a cosmetic add-on. The economy structurally requires ENS for contract acceptance.

**Parent name:** [`cgaeprotocol.eth`](https://sepolia.app.ens.domains/cgaeprotocol.eth) (Sepolia)

Each agent gets a subname (e.g., `claude-sonnet-4-6.cgaeprotocol.eth`) with text records:
`cgae.tier`, `cgae.cc`, `cgae.er`, `cgae.as`, `cgae.ih`, `cgae.wallet`, `cgae.family`

Before an agent can accept any contract, the economy resolves the agent's ENS `cgae.tier` text record. Agents without a valid ENS identity are rejected, even if their locally computed robustness is T5.
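
A minimal sketch of that acceptance check, assuming the `cgae.tier` record holds a value such as `T3` or `3` (the exact record format is not stated here). The resolver is passed in as a plain callable so the example runs without a network connection; `ens_tier`, `can_accept`, and `TextResolver` are hypothetical names, not the API of `cgae_engine/ens.py`.

```python
from typing import Callable, Optional

# A resolver maps (ens_name, text_key) to the stored text value, or None if unset.
TextResolver = Callable[[str, str], Optional[str]]


def ens_tier(name: str, resolve_text: TextResolver) -> Optional[int]:
    """Parse the cgae.tier record; return None when it is missing or malformed."""
    raw = resolve_text(name, "cgae.tier")
    if raw is None:
        return None
    value = raw.strip().upper().removeprefix("T")
    if not value.isdecimal():
        return None
    tier = int(value)
    return tier if tier <= 5 else None


def can_accept(name: str, required_tier: int, resolve_text: TextResolver) -> bool:
    """Reject agents with no valid ENS tier, regardless of any local score."""
    tier = ens_tier(name, resolve_text)
    return tier is not None and tier >= required_tier


if __name__ == "__main__":
    records = {("claude-sonnet-4-6.cgaeprotocol.eth", "cgae.tier"): "T3"}
    resolver = lambda n, k: records.get((n, k))
    print(can_accept("claude-sonnet-4-6.cgaeprotocol.eth", 2, resolver))
    print(can_accept("unregistered.cgaeprotocol.eth", 0, resolver))
```

In a live setup the callable would wrap an actual ENS text-record lookup on Sepolia; keeping it injectable also makes the gate easy to unit-test.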

## Wallet Integration

Each agent gets a real ETH wallet (unique keypair via `eth-account`). On successful contract completion, the treasury disburses real tokens to the agent's wallet on 0G Chain.

- Treasury: `0xCE2de05Cd27DBCFe07b9d7862aa69301991c8592`
- Disbursements: live on-chain transfers, not simulated balances

---

## How to Run

### Prerequisites

```bash
pip install -r requirements.txt
pip install web3 eth-account python-dotenv
```

### Synthetic Simulation (no API keys)

```bash
python -m server.runner --steps 50
```

### Live Simulation (requires .env credentials)

```bash
cp .env.example .env   # fill in API keys
python -m server.api --rounds 10
```

### Dashboard

```bash
# Terminal 1: API + simulation
python -m server.api --rounds 10

# Terminal 2: Frontend
cd dashboard-next && npm install && npm run dev
```

Open http://localhost:3000

### Deploy Smart Contracts

```bash
cd contracts && npm install && npm run deploy:0g
```

### Run Tests

```bash
python -m pytest tests/ -q
```

---

## Repository Structure

```
cgae/
├── cgae_engine/              # Core protocol engine
│   ├── gate.py               # Weakest-link gate function
│   ├── temporal.py           # Temporal decay + stochastic re-auditing
│   ├── registry.py           # Agent identity and certification
│   ├── contracts.py          # Contract system with escrow
│   ├── marketplace.py        # Tier-distributed task demand
│   ├── economy.py            # Top-level coordinator (ENS-gated)
│   ├── audit.py              # CDCT/DDFT/AGT → robustness vectors
│   ├── wallet.py             # ETH wallet manager
│   ├── onchain.py            # 0G Chain bridge (CGAERegistry calls)
│   ├── ens.py                # ENS integration (Sepolia)
│   ├── llm_agent.py          # LLM agent (Azure/Bedrock/Gemma)
│   ├── models_config.py      # 14 model configurations
│   ├── tasks.py              # 16 machine-verifiable tasks
│   └── verifier.py           # Two-layer verification
├── agents/                   # Agent implementations
│   ├── base.py               # Abstract BaseAgent
│   ├── strategies.py         # 5 strategy archetypes
│   └── autonomous.py         # AutonomousAgent v2
├── contracts/                # Solidity (0G Chain)
│   ├── src/CGAERegistry.sol
│   ├── src/CGAEEscrow.sol
│   └── deployed.json
├── storage/                  # 0G Storage
│   ├── upload_to_0g.mjs
│   └── zg_store.py
├── server/                   # Simulation + API
│   ├── runner.py             # Synthetic simulation
│   ├── live_runner.py        # Live LLM simulation
│   └── api.py                # FastAPI backend
├── dashboard-next/           # Next.js frontend
│   └── app/page.tsx
└── scripts/
```

## Tech Stack

| Layer | Technology |
|-------|-----------|
| Smart contracts | Solidity 0.8.20 on 0G Chain (Galileo testnet, chain ID 16602) |
| Audit storage | 0G Storage (`@0gfoundation/0g-ts-sdk`) |
| Agent identity | ENS on Sepolia (subnames + text records) |
| Wallets | `eth-account` + `web3.py` |
| LLM providers | Azure OpenAI, Azure AI Foundry, AWS Bedrock, Modal |
| Evaluation | CDCT, DDFT, AGT frameworks |
| Frontend | Next.js + Tailwind + Recharts |
| Backend | FastAPI |
| Economy engine | Python |

## On-Chain vs Python-Side Accounting

| Component | Where it lives | Details |
|-----------|---------------|---------|
| Agent registration | **On-chain** (0G) | `CGAERegistry.registerAgent()`: wallet address + architecture hash |
| Robustness certification | **On-chain** (0G) | `CGAERegistry.certify()`: scores scaled to uint16 + 0G Storage Merkle root hash |
| Contract lifecycle | **On-chain** (0G) | `CGAEEscrow.createContract()` / `acceptContract()` / `completeContract()` / `failContract()` |
| ETH disbursements | **On-chain** (0G) | Real treasury → agent wallet transfers |
| ENS identity | **On-chain** (Sepolia) | Subnames + text records per agent (`cgae.tier`, `cgae.cc`, `cgae.er`, `cgae.as`, `cgae.ih`, `cgae.wallet`, `cgae.family`) |
| Audit certificates | **On-chain** (0G Storage) | Full audit JSON uploaded, Merkle root hash stored in CGAERegistry |
| Agent balances | **Python-side** | In-memory float on `AgentRecord`. Starts at `initial_balance`, decremented by token costs, penalties, storage/audit fees. Not read from chain. |
| Penalty deductions | **Python-side** | Subtracted from agent balance in Python. No on-chain clawback. |
| Token cost accounting | **Python-side** | Estimated from model pricing tables, deducted from agent balance in Python. |
| Tier gate enforcement | **Both** | Python `Economy.accept_contract()` checks tier + ENS. `CGAEEscrow.acceptContract()` also enforces tier + budget ceiling on-chain. |
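
The certification row mentions scores scaled to uint16. A minimal sketch of such a fixed-point conversion follows, assuming scores lie in [0, 1] and a scale factor of 10,000. Both the factor and the helper names are assumptions for illustration; the factor actually used by `CGAERegistry.certify()` is not stated here.

```python
SCALE = 10_000          # hypothetical fixed-point factor (assumption)
UINT16_MAX = 2**16 - 1  # 65535, the largest value a Solidity uint16 can hold


def to_uint16(score: float, scale: int = SCALE) -> int:
    """Convert a score in [0, 1] to an integer that fits in a uint16."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    scaled = round(score * scale)
    if scaled > UINT16_MAX:
        raise OverflowError("scaled score does not fit in uint16")
    return scaled


def from_uint16(value: int, scale: int = SCALE) -> float:
    """Recover the approximate floating-point score from its on-chain integer."""
    return value / scale


if __name__ == "__main__":
    on_chain = to_uint16(0.8731)
    print(on_chain, from_uint16(on_chain))
```

Any scale up to 65,535 fits; a power of ten simply keeps the on-chain integers easy to read in a block explorer.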

**Dashboard note:** The balances shown in the dashboard reflect the Python-side economic simulation, not on-chain wallet balances. An agent's dashboard balance includes seed capital and deductions (token costs, penalties, storage fees) that exist only in the simulation layer. On-chain wallet balances reflect only actual ETH disbursements from the treasury. These numbers will differ.
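
The gap between the two balances can be made concrete with a small sketch. `AgentRecord` and `initial_balance` are named in the table above; every other field and method here (`disbursed_on_chain`, `charge`, `reward`) is hypothetical and only illustrates the bookkeeping, not the engine's actual class.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    agent_id: str
    initial_balance: float
    # Python-side simulated balance; starts at the seed capital.
    balance: float = field(init=False)
    # Running total of what the treasury actually sent on-chain (hypothetical field).
    disbursed_on_chain: float = 0.0

    def __post_init__(self) -> None:
        self.balance = self.initial_balance

    def charge(self, amount: float, reason: str) -> None:
        """Token costs, penalties, and fees reduce only the simulated balance."""
        if amount < 0:
            raise ValueError(f"negative charge for {reason}")
        self.balance -= amount

    def reward(self, amount: float) -> None:
        """A completed contract credits the simulation and is also paid on-chain."""
        self.balance += amount
        self.disbursed_on_chain += amount


if __name__ == "__main__":
    rec = AgentRecord("gpt-5-4", initial_balance=10.0)
    rec.charge(2.5, "token_cost")
    rec.reward(1.0)
    print(rec.balance, rec.disbursed_on_chain)
```

In this sketch the dashboard figure corresponds to `balance` (seed capital plus rewards minus charges), while the wallet on 0G Chain would hold only `disbursed_on_chain`.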

## License

Research code, submitted to the ETH OpenAgents Hackathon.