GovOn R1 Preview — EXAONE 4.0 + Multi-LoRA Agentic Shell for Korean Public Sector
What problem does GovOn solve?
Local government officers in South Korea handle dozens of civil complaints daily — road damage reports, parking disputes, noise complaints. Each response requires searching relevant laws, finding similar cases, and drafting formal documents.
GovOn automates this workflow with an AI agent.
Project Background
GovOn is developed at Dong-A University (Department of Computer Science) as an industry-collaboration capstone project, partnering with public sector organizations.
How It Works
1. ReAct Autonomous Agent
- LangGraph-based v4 architecture
- 7 tools: civil complaint lookup, issue detection, statistics, keyword analysis, demographics, civil-response adapter, legal adapter
- Human-in-the-loop approval before tool execution
2. Domain-Specific Multi-LoRA
- Base: EXAONE 4.0-32B-AWQ
- civil-adapter (this model): 74K civil complaint Q&A pairs → formal response drafting
- legal-adapter: 270K legal documents → law citation and evidence
- vLLM Multi-LoRA serving: per-request adapter switching
3. Production-Grade Engineering
- E2E 27/27 scenarios passing (6-Phase verification)
- DORA Elite grade (30 deploys/week, 0.9h lead time)
- 6-Layer context management for multi-turn conversations
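The human-in-the-loop approval step above can be sketched as a gate that holds the agent's proposed tool call until an officer approves or rejects it. This is a minimal pure-Python illustration, not GovOn's actual LangGraph implementation; all class and function names here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    name: str
    args: dict

@dataclass
class ApprovalGate:
    """Holds proposed tool calls until a human approves or rejects them."""
    pending: list = field(default_factory=list)

    def propose(self, call: ToolCall) -> int:
        # The agent proposes; nothing executes yet.
        self.pending.append(call)
        return len(self.pending) - 1  # ticket id shown to the officer

    def resolve(self, ticket: int, approved: bool, tools: dict) -> dict:
        call = self.pending[ticket]
        if not approved:
            return {"status": "rejected", "tool": call.name}
        # Execute only after explicit approval.
        result = tools[call.name](**call.args)
        return {"status": "executed", "tool": call.name, "result": result}

# Usage: the agent proposes, the officer approves, then the tool runs.
gate = ApprovalGate()
tools = {"complaint_lookup": lambda query: f"3 cases matching {query!r}"}
ticket = gate.propose(ToolCall("complaint_lookup", {"query": "road damage"}))
print(gate.resolve(ticket, approved=True, tools=tools))
```

The same gate object also records rejections, which is what the E2E "rejection" scenarios exercise.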
Architecture
User Terminal (govon CLI)
|
v
HF Space — A100 80GB
+-------------------------+
| FastAPI :7860 |
| v4 ReAct Agent |
| + 6-Layer Context Mgmt |
| |
| vLLM :8000 |
| EXAONE 4.0-32B-AWQ |
| + civil LoRA (r16) |
| + legal LoRA (r16) |
+-------------------------+
Links
| Resource | URL |
|---|---|
| GitHub | GovOn-Org/GovOn |
| Civil Adapter | umyunsang/govon-civil-adapter |
| Legal Adapter | siwo/govon-legal-adapter |
| Docs Portal | govon-org.github.io/GovOn |
We want your feedback!
- What features would be most useful for government officers?
- Thoughts on the Multi-LoRA agentic architecture?
- Similar domain applications you would like to explore?
This project is open source. Stars, forks, and PRs are welcome!
Deep Dive: How GovOn's ReAct Agent Actually Works
The Problem We Observed
During field research at a local government office in Busan, South Korea, we observed civil complaint officers spending 20-30 minutes per response:
- Search — Finding relevant laws and regulations (the Korean legal system is amended frequently)
- Lookup — Searching similar past cases across fragmented databases
- Draft — Writing formal government-style responses with correct formatting
- Review — Cross-checking citations and department references
This is not a creative task. It is a structured, repeatable workflow — exactly the kind AI agents excel at.
Architecture Deep Dive
GovOn runs two agent graphs simultaneously:
| Graph | Endpoint | Use Case | Approval |
|---|---|---|---|
| v4 | /v2/agent/* | Production: human-in-the-loop | Required before tool execution |
| v3 | /v3/agent/* | Development/testing: auto-execute | None (fully autonomous) |
Both use the same ReAct loop pattern: the LLM observes the current state, reasons about what tool to call, acts by executing the tool, and then observes the result to decide the next step.
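That loop can be sketched in a few lines of Python. This is a schematic of the ReAct pattern, not GovOn's LangGraph code; the `llm_decide` stub stands in for the real model call.

```python
def react_loop(question, tools, llm_decide, max_iterations=5):
    """Observe -> reason -> act until the model emits a final answer."""
    observations = [f"question: {question}"]
    for _ in range(max_iterations):
        step = llm_decide(observations)               # reason: pick a tool or finish
        if step["action"] == "final_answer":
            return step["answer"]
        result = tools[step["action"]](**step.get("args", {}))  # act
        observations.append(f"{step['action']} -> {result}")    # observe
    return "iteration limit reached"

# Stubbed decision function: call the lookup tool once, then answer.
def llm_decide(observations):
    if len(observations) == 1:
        return {"action": "lookup", "args": {"query": "noise complaint"}}
    return {"action": "final_answer", "answer": observations[-1]}

tools = {"lookup": lambda query: f"statute hits for {query!r}"}
print(react_loop("How do I answer a noise complaint?", tools, llm_decide))
```

The `max_iterations` cap is the same mechanism the Phase 3 "iteration limits" scenarios verify.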
Multi-LoRA Serving — One Model, Multiple Experts
The key architectural decision was not to deploy a separate model for each domain. Instead, vLLM handles per-request LoRA switching with near-zero overhead. This means:
- One A100 GPU serves all capabilities
- Adapter switching takes milliseconds, not seconds
- Adding new domain adapters only requires training a new LoRA (~32MB each)
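With vLLM's OpenAI-compatible server, the adapter is selected per request simply by setting the `model` field to a registered LoRA name (adapters are registered at launch via `--enable-lora` / `--lora-modules`). A sketch of the two request payloads; the adapter names shown are illustrative, not necessarily the names GovOn registers:

```python
import json

def agent_request(adapter: str, prompt: str) -> str:
    """Build a vLLM OpenAI-compatible completion payload for one adapter."""
    payload = {
        "model": adapter,     # e.g. "govon-civil-adapter" — this alone picks the LoRA
        "prompt": prompt,
        "max_tokens": 512,
    }
    return json.dumps(payload)

civil = agent_request("govon-civil-adapter", "Draft a reply to complaint #123")
legal = agent_request("govon-legal-adapter", "Cite the governing statute")
```

Two consecutive requests can hit different adapters with the base weights loaded only once, which is what makes a single A100 sufficient.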
6-Layer Context Management
Long conversations are the Achilles heel of agent systems. After 3-5 turns, context windows overflow. We built a 6-layer defense:
| Layer | Stage | Mechanism |
|---|---|---|
| L1 | Tool execution | Truncate tool outputs to 3000 chars (head+tail) |
| L2 | Agent input | Clear old tool results with placeholder after iteration 2+ |
| L3 | Agent input | Reverse token-budget trim (4500 token budget) |
| L4 | Agent input | Hard cap — force remove from front if still over |
| L5 | Session load | Rule-based extractive summary of older messages |
| L6 | Session load | Permanent message removal via RemoveMessage |
This was inspired by production patterns from Claude API and Codex. The result: GovOn maintains coherent 5+ turn conversations without hallucination from context overflow.
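The L1 head+tail truncation, for example, is simple to sketch; the exact marker string below is an assumption, not GovOn's implementation:

```python
def truncate_head_tail(text: str, limit: int = 3000,
                       marker: str = "\n...[truncated]...\n") -> str:
    """Layer L1: keep the head and tail of an oversized tool output."""
    if len(text) <= limit:
        return text
    keep = limit - len(marker)
    head = keep // 2
    tail = keep - head
    # Head and tail survive; the middle is replaced by the marker.
    return text[:head] + marker + text[-tail:]

out = truncate_head_tail("x" * 10_000)
```

Keeping both ends matters for tool outputs: status codes and summaries tend to sit at the top, while totals and error trailers sit at the bottom.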
Training Details
| | Civil Adapter | Legal Adapter |
|---|---|---|
| Base | EXAONE 4.0-32B | EXAONE 4.0-32B |
| Method | Unsloth QLoRA 4-bit NF4 | Unsloth QLoRA 4-bit NF4 |
| LoRA Config | r=16, alpha=32, 7 target modules | r=16, alpha=32, 7 target modules |
| Dataset | 74K civil Q&A pairs | 270K legal documents |
| Hardware | HF Spaces A100 80GB | HF Spaces A100 80GB |
| Final Loss | 0.889 | 0.889 |
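The table's LoRA hyperparameters map onto a PEFT-style config. A sketch as a plain dict; the seven target-module names are an assumption based on common LLaMA-style QLoRA setups and are not confirmed by the repo:

```python
# Hypothetical LoRA hyperparameters matching the table above.
lora_config = {
    "r": 16,             # LoRA rank
    "lora_alpha": 32,    # scaling factor (alpha/r = 2)
    # Assumed 7 target modules: attention + MLP projections.
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "lora_dropout": 0.0,
    "bias": "none",
}
```

At r=16 over these modules, the resulting adapter weights stay in the tens of megabytes, consistent with the ~32MB per adapter cited above.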
Data sources: AI Hub (Korean government open datasets) — civil service QA, administrative law QA, civil/IP/criminal cases, and court precedents.
E2E Verification: 27 Scenarios, 6 Phases
We don't ship without evidence. Our E2E test suite covers:
| Phase | Scenarios | What It Tests |
|---|---|---|
| 1. Infrastructure | 3 | Health, base model, vLLM connection |
| 2. v2 Pipeline | 6 | Approval flow, rejection, multi-turn, concurrency |
| 3. v3 ReAct | 10 | Direct answer, tool execution, SSE streaming, iteration limits |
| 4. Cross-version | 2 | v2→v3 consistency, long query handling |
| 5. Multi-turn | 3 | Context retention, session isolation, 3-turn workflow |
| 6. Context Mgmt | 3 | Tool clearing, long query + clearing, 5-turn summarization |
Current status: 27/27 passing.
Want to try it?
The runtime is deployed on this HF Space. If the Space is active (A100 GPU), you can hit the API directly; if it is paused, ask a maintainer to restart it.
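A request would look roughly like this. The host and the exact subpath under `/v2/agent/*` are placeholders, not documented routes, so check the Docs Portal for the real ones:

```python
import json
from urllib.request import Request

# Placeholder host; substitute your Space's URL.
base = "https://YOUR-SPACE.hf.space"
body = json.dumps({"message": "A streetlight on Haeundae-ro is broken."}).encode()

# "/v2/agent/chat" is an assumed subpath under the documented /v2/agent/* prefix.
req = Request(
    f"{base}/v2/agent/chat",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
```

With the v4 graph, the response to this call would be a proposed tool call awaiting approval rather than an immediate answer.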
We welcome technical feedback on architecture, training methodology, or deployment strategies!