GovOn Civil Adapter — EXAONE 4.0 QLoRA for Korean Government Civil Response Drafting

#2
by umyunsang - opened

What is this adapter?

This LoRA adapter fine-tunes EXAONE 4.0-32B-AWQ to draft formal civil complaint responses for Korean local government officers.


Training Details

| Item | Value |
|---|---|
| Base Model | EXAONE 4.0-32B-AWQ |
| Method | Unsloth QLoRA (4-bit NF4) |
| LoRA Rank | r=16, alpha=32 |
| Target Modules | q, k, v, o, gate, up, down_proj (7) |
| Dataset | 74K civil complaint Q&A pairs |
| Hardware | HF Spaces A100 80GB |
| Final Loss | 0.889 (365 steps) |

Data Sources

  • AI Hub 71852: Public Civil Service QA (29K)
  • AI Hub 71847: Administrative Law QA (37K)
  • Custom preprocessing via parsers.py

How It Works in GovOn

GovOn uses vLLM Multi-LoRA serving. When the ReAct agent decides to call the public_admin_adapter tool, this LoRA is dynamically attached per-request:

User: "Please draft a response to a road damage complaint" (도로 파손 민원에 대한 답변을 작성해줘)
  → Agent: tool_call(public_admin_adapter)
    → vLLM: attach civil-adapter LoRA
      → Formal response draft generated
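The per-request attachment above maps directly onto vLLM's OpenAI-compatible API, where the `model` field of a chat-completions request names the LoRA adapter to use. A minimal client-side sketch (the adapter name `civil-adapter`, the system prompt, and the sampling parameters are illustrative assumptions, not the exact GovOn values):

```python
import json

# Hypothetical system prompt for the civil-servant role; the real GovOn
# prompt is not published in this post.
SYSTEM_PROMPT = (
    "You are a Korean local-government civil servant drafting formal, "
    "polite responses to citizen complaints."
)

def build_request(user_query: str, adapter: str = "civil-adapter") -> dict:
    """Build a /v1/chat/completions payload that targets one LoRA adapter.

    With vLLM Multi-LoRA serving, "model" selects which registered
    adapter is attached for this single generation request.
    """
    return {
        "model": adapter,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_query},
        ],
        "temperature": 0.3,
        "max_tokens": 512,
    }

payload = build_request("Please draft a response to a road damage complaint")
body = json.dumps(payload, ensure_ascii=False)  # ready to POST to vLLM
```

Because the adapter is selected per request, the same vLLM process can serve the base model, this civil adapter, and the legal adapter concurrently.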

A companion legal-adapter (siwo/govon-legal-adapter) handles law citation and evidence.


Project Context

Developed at Dong-A University, Dept. of Computer Science as an industry-collaboration capstone project for Korean public sector AI assistance.

| Resource | Link |
|---|---|
| GitHub | GovOn-Org/GovOn |
| Runtime Space | umyunsang/govon-runtime |
| Legal Adapter | siwo/govon-legal-adapter |

Feedback on training methodology, evaluation metrics, or potential improvements is very welcome!

Detailed Training Report & Lessons Learned

Why LoRA instead of full fine-tuning?

We considered three approaches for domain adaptation:

| Approach | GPU Memory | Training Time | Switching Cost | Quality |
|---|---|---|---|---|
| Full fine-tune | 160GB+ (multi-GPU) | Days | Must load separate model | Highest |
| QLoRA (our choice) | 48GB (single GPU) | ~7 hours | ~32MB swap, milliseconds | High |
| Prompt engineering | 0 (inference only) | None | None | Moderate |

For a university project deploying on a single A100 80GB, QLoRA was the clear winner. The key insight: an adapter trained on 74K domain-specific pairs produced noticeably better responses than prompt engineering alone, while keeping the deployment footprint minimal.

Data Pipeline: From Raw Public Data to Training Pairs

The most underappreciated part of any fine-tuning project is data preparation. Our civil adapter data came from:

Source 1: AI Hub 71852 — Public Civil Service QA (29K)

  • Raw format: XML with nested complaint/response pairs
  • Challenge: Many responses were template-based with placeholder text
  • Solution: parsers.py filters responses shorter than 50 chars and strips boilerplate headers

Source 2: AI Hub 71847 — Administrative Law QA (37K)

  • Raw format: JSON with legal terminology
  • Challenge: Answers often cited laws without explaining them
  • Solution: Paired with the legal dataset for cross-referencing

Final dataset composition:

  • 74K training pairs after deduplication and quality filtering
  • Average input length: ~150 tokens
  • Average output length: ~300 tokens
  • Format: EXAONE chat template with system prompt for civil servant role
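Turning one Q&A pair into a chat-format record might look like the sketch below. The system prompt wording is an assumption (the real one is not published), and in practice the EXAONE tokenizer's chat template inserts the model-specific special tokens around these messages:

```python
# Assumed civil-servant system prompt; illustrative only.
CIVIL_SERVANT_SYSTEM = (
    "You are a civil servant at a Korean local government. "
    "Draft formal, polite responses to citizen complaints."
)

def to_chat_record(question: str, answer: str) -> dict:
    """Wrap one complaint/response pair in the messages format that a
    chat template consumes during supervised fine-tuning."""
    return {
        "messages": [
            {"role": "system", "content": CIVIL_SERVANT_SYSTEM},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }

record = to_chat_record(
    "How do I report a broken streetlight?",
    "Thank you for your inquiry. The streetlight you reported ...",
)
```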

Training Configuration Decisions

Why r=16? We tested r=8, r=16, and r=32:

  • r=8: Faster training but noticeably lower response formality
  • r=16: Sweet spot — formal government style retained with good generalization
  • r=32: Marginal improvement over r=16, 2x parameter count

Why 7 target modules? Including gate/up/down_proj (MLP layers) in addition to attention (q/k/v/o) improved the model's ability to generate structured, formal text. Attention-only LoRA tended to produce more conversational responses.

Why 1 epoch? With 74K packed samples and effective batch size 64, the model converged by step 365. Validation loss plateaued at 0.889. A second epoch showed early signs of overfitting on template phrases.
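Collecting the choices above, the adapter hyperparameters can be summarized as a plain config dict. The real run used Unsloth's QLoRA path; argument names below follow the common PEFT convention, and the dropout value is an assumption since the post does not state it:

```python
LORA_CONFIG = {
    "r": 16,            # tested against r=8 and r=32; 16 was the sweet spot
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
    "lora_dropout": 0.0,   # assumption; not stated in the report
    "bias": "none",
}

# LoRA updates are scaled by alpha / r, so alpha=32 with r=16 gives 2.0 —
# the update magnitude stays comparable if r changes but alpha tracks 2*r.
scaling = LORA_CONFIG["lora_alpha"] / LORA_CONFIG["r"]
```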

What Could Be Improved

  1. Evaluation gap: We lack automated evaluation metrics (BLEU, BERTScore) comparing adapter vs base model responses. This is planned for the next iteration.
  2. Data diversity: Current data skews toward road/parking/noise complaints. Underrepresented categories (welfare, education) could benefit from targeted data collection.
  3. 2nd epoch on remaining data: Only 74K of the available ~110K civil pairs were used. A continued training run on the remainder could improve coverage.
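On point 1, even before BLEU or BERTScore are wired in, a stdlib token-overlap F1 gives a rough adapter-vs-base comparison against reference answers. This is a placeholder metric sketch, not something the project ships:

```python
from collections import Counter

def token_f1(candidate: str, reference: str) -> float:
    """F1 over whitespace tokens: a crude stand-in for BLEU/BERTScore."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())  # multiset intersection size
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For Korean, whitespace tokenization is especially crude (particles attach to nouns), so a real evaluation would tokenize with the model's own tokenizer first.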

Integration with GovOn Runtime

This adapter doesn't run standalone — it's served through vLLM Multi-LoRA:

POST /v3/agent/run
  → LLM decides to call public_admin_adapter tool
    → vLLM attaches this LoRA for the generation request
      → Response formatted as formal civil complaint answer
        → Agent returns to user (with approval in v4 mode)

The companion legal-adapter handles law citation when the agent calls the legal_adapter tool.
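In GovOn the routing decision is made by the agent's LLM, not by rules; the keyword heuristic below only illustrates the two-adapter split, and the keyword list is invented for the example:

```python
# Invented keyword list; the real agent routes via LLM tool selection.
LEGAL_KEYWORDS = ("law", "statute", "ordinance", "article")

def pick_tool(query: str) -> str:
    """Illustrate which adapter tool a request would invoke."""
    q = query.lower()
    if any(k in q for k in LEGAL_KEYWORDS):
        return "legal_adapter"        # law citation and evidence
    return "public_admin_adapter"     # formal civil response drafting
```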


Questions about training methodology, data preprocessing, or LoRA configuration are very welcome!
