# GovOn Civil Adapter — EXAONE 4.0 QLoRA for Korean Government Civil Response Drafting
## What is this adapter?
This LoRA adapter fine-tunes EXAONE 4.0-32B-AWQ to draft formal civil complaint responses for Korean local government officers.
## Training Details
| Item | Value |
|---|---|
| Base Model | EXAONE 4.0-32B-AWQ |
| Method | Unsloth QLoRA (4-bit NF4) |
| LoRA Rank | r=16, alpha=32 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj (7) |
| Dataset | 74K civil complaint Q&A pairs |
| Hardware | HF Spaces A100 80GB |
| Final Loss | 0.889 (365 steps) |
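For reference, the table above might map onto a LoRA/quantization configuration along these lines (a minimal sketch; values not listed in the table, such as dropout, are placeholders rather than the project's actual settings):

```python
# Hypothetical QLoRA hyperparameter sketch mirroring the table above.
lora_config = {
    "r": 16,                  # LoRA rank
    "lora_alpha": 32,         # scaling factor (alpha / r = 2.0)
    "target_modules": [       # 7 modules: attention + MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "lora_dropout": 0.0,      # placeholder; not stated in this card
}
quant_config = {
    "load_in_4bit": True,     # QLoRA: quantized base weights
    "bnb_4bit_quant_type": "nf4",
}
```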
## Data Sources
- AI Hub 71852: Public Civil Service QA (29K)
- AI Hub 71847: Administrative Law QA (37K)
- Custom preprocessing via `parsers.py`
## How It Works in GovOn
GovOn uses vLLM Multi-LoRA serving. When the ReAct agent decides to call the public_admin_adapter tool, this LoRA is dynamically attached per request:
```
User: 도로 파손 민원에 대한 답변을 작성해줘
      ("Please draft a response to the road-damage complaint")
→ Agent: tool_call(public_admin_adapter)
→ vLLM: attach civil-adapter LoRA
→ Formal response draft generated
```
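A request against vLLM's OpenAI-compatible server could look roughly like this. vLLM routes to a served LoRA adapter by the registered name passed in the `model` field; the name `civil-adapter` and the system prompt wording here are assumptions for illustration:

```python
# Sketch of the request the runtime might send to vLLM's OpenAI-compatible
# server. The adapter name and system prompt are assumed, not confirmed.
def build_civil_request(user_msg: str) -> dict:
    return {
        "model": "civil-adapter",  # served LoRA name registered at startup
        "messages": [
            {"role": "system",
             "content": "You are a civil servant drafting formal complaint responses."},
            {"role": "user", "content": user_msg},
        ],
        "max_tokens": 512,
    }

payload = build_civil_request("도로 파손 민원에 대한 답변을 작성해줘")
```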
A companion legal-adapter (siwo/govon-legal-adapter) handles law citation and evidence.
## Project Context
Developed at Dong-A University, Dept. of Computer Science as an industry-collaboration capstone project for Korean public sector AI assistance.
| Resource | Link |
|---|---|
| GitHub | GovOn-Org/GovOn |
| Runtime Space | umyunsang/govon-runtime |
| Legal Adapter | siwo/govon-legal-adapter |
Feedback on training methodology, evaluation metrics, or potential improvements is very welcome!
## Detailed Training Report & Lessons Learned
### Why LoRA instead of full fine-tuning?
We considered three approaches for domain adaptation:
| Approach | GPU Memory | Training Time | Switching Cost | Quality |
|---|---|---|---|---|
| Full fine-tune | 160GB+ (multi-GPU) | Days | Must load separate model | Highest |
| QLoRA (our choice) | 48GB (single GPU) | ~7 hours | ~32MB swap, milliseconds | High |
| Prompt engineering | 0 (inference only) | None | None | Moderate |
For a university project deploying on a single A100 80GB, QLoRA was the clear winner. The key insight: adapter quality with 74K domain-specific pairs exceeded prompt engineering quality, while keeping the deployment footprint minimal.
### Data Pipeline: From Raw Public Data to Training Pairs
The most underappreciated part of any fine-tuning project is data preparation. Our civil adapter data came from:
#### Source 1: AI Hub 71852 — Public Civil Service QA (29K)
- Raw format: XML with nested complaint/response pairs
- Challenge: Many responses were template-based with placeholder text
- Solution: `parsers.py` filters responses shorter than 50 chars and strips boilerplate headers
#### Source 2: AI Hub 71847 — Administrative Law QA (37K)
- Raw format: JSON with legal terminology
- Challenge: Answers often cited laws without explaining them
- Solution: Paired with the legal dataset for cross-referencing
**Final dataset composition:**
- 74K training pairs after deduplication and quality filtering
- Average input length: ~150 tokens
- Average output length: ~300 tokens
- Format: EXAONE chat template with system prompt for civil servant role
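Mapping one Q&A pair into that format might look like this (a sketch; the system prompt wording is an assumption, and the EXAONE special tokens would be added by the tokenizer's chat template rather than written by hand):

```python
# Hypothetical mapping of one Q&A pair into chat messages. In training,
# tokenizer.apply_chat_template(...) would render these into EXAONE's
# special-token format; the system prompt text here is assumed.
CIVIL_SERVANT_SYSTEM = (
    "You are a Korean local-government officer. "
    "Draft formal, polite civil complaint responses."
)

def to_chat_example(question: str, answer: str) -> list[dict]:
    return [
        {"role": "system", "content": CIVIL_SERVANT_SYSTEM},
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]
```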
### Training Configuration Decisions
**Why r=16?** We tested r=8, r=16, and r=32:
- r=8: Faster training but noticeably lower response formality
- r=16: Sweet spot — formal government style retained with good generalization
- r=32: Marginal improvement over r=16, 2x parameter count
**Why 7 target modules?** Including gate/up/down_proj (MLP layers) in addition to attention (q/k/v/o) improved the model's ability to generate structured, formal text. Attention-only LoRA tended to produce more conversational responses.
**Why 1 epoch?** With 74K packed samples and an effective batch size of 64, the model converged by step 365. Validation loss plateaued at 0.889. A second epoch showed early signs of overfitting on template phrases.
### What Could Be Improved
- Evaluation gap: We lack automated evaluation metrics (BLEU, BERTScore) comparing adapter vs base model responses. This is planned for the next iteration.
- Data diversity: Current data skews toward road/parking/noise complaints. Underrepresented categories (welfare, education) could benefit from targeted data collection.
- 2nd epoch on remaining data: Only 74K of the available ~110K civil pairs were used. A continued training run on the remainder could improve coverage.
### Integration with GovOn Runtime
This adapter doesn't run standalone — it's served through vLLM Multi-LoRA:
```
POST /v3/agent/run
→ LLM decides to call public_admin_adapter tool
→ vLLM attaches this LoRA for the generation request
→ Response formatted as formal civil complaint answer
→ Agent returns to user (with approval in v4 mode)
```
The companion legal-adapter handles law citation when the agent calls the legal_adapter tool.
Questions about training methodology, data preprocessing, or LoRA configuration are very welcome!