# Intent Router (ONNX int8)

A 7-class intent classifier for code query routing. Classifies natural language queries (English and Chinese) into structured intents for code intelligence tools.

## Intents

| Label | Description |
|---|---|
| `locate_symbol` | Find symbol definitions |
| `find_references` | Trace reverse references / impact |
| `trace_dependencies` | Trace forward dependencies / call chains |
| `semantic_search` | Semantic search over code and docs |
| `browse_structure` | Browse package / module structure |
| `cross_layer_trace` | Map between code and business docs |
| `ambiguous` | Query cannot be classified |

## Files

| File | Required | Description |
|---|---|---|
| `onnx/model.onnx` | Yes | ONNX model graph |
| `onnx/model.onnx_data` | Yes | Model weights (int8 quantized) |
| `model_head.json` | Yes | Classification head (weights + bias) |
| `tokenizer.json` | Yes | Tokenizer |
| `tokenizer_config.json` | Yes | Tokenizer configuration |
| `labels.json` | Yes | Intent label list |
| `config.json` | Yes | Model configuration |

## Inference

Requires ONNX Runtime. The model takes tokenized text input and outputs sentence embeddings. The classification head (model_head.json) maps embeddings to intent logits.

input text → tokenizer → ONNX model → embedding → classification head → intent + confidence
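The classification-head step of the pipeline above can be sketched in NumPy. This is a hedged illustration, not the card's reference implementation: it assumes `model_head.json` stores a weight matrix of shape `(num_intents, hidden_dim)` and a bias vector, and it skips the ONNX Runtime call that produces the embedding.

```python
import numpy as np

def classify(embedding, weights, bias, labels):
    """Map a sentence embedding to (intent, confidence).

    embedding: 1-D vector produced by the ONNX model (via onnxruntime,
    not shown here). weights/bias: the linear head from model_head.json
    (assumed schema). labels: the list from labels.json.
    """
    logits = weights @ embedding + bias      # shape: (num_intents,)
    exps = np.exp(logits - logits.max())     # numerically stable softmax
    probs = exps / exps.sum()
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])

# Toy demonstration with a hypothetical 4-dim embedding and 2 intents.
labels = ["locate_symbol", "ambiguous"]
W = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
b = np.zeros(2)
intent, conf = classify(np.array([2.0, 0.5, 0.0, 0.0]), W, b, labels)
# intent == "locate_symbol", conf ≈ 0.82
```

In a real run the embedding would come from `onnxruntime.InferenceSession("onnx/model.onnx").run(...)` on the tokenized query; only the head and softmax are shown here.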

## Benchmark

Evaluated on a held-out test set of 221 bilingual (Chinese + English) code queries.

| Metric | Value |
|---|---|
| Overall accuracy | 96.8% (214/221) |
| Inference latency (CPU, ONNX Runtime) | ~3 ms p50 |

Per-intent performance:

| Intent | Precision | Recall | F1 |
|---|---|---|---|
| `locate_symbol` | 98.4% | 96.8% | 0.976 |
| `find_references` | 95.1% | 97.5% | 0.963 |
| `trace_dependencies` | 90.2% | 95.1% | 0.926 |
| `semantic_search` | 100.0% | 91.5% | 0.956 |
| `browse_structure` | 91.7% | 100.0% | 0.957 |
| `cross_layer_trace` | 100.0% | 100.0% | 1.000 |
| `ambiguous` | 100.0% | 100.0% | 1.000 |

Training set: 284 samples. Test set: 221 samples.
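The F1 column follows directly from the precision and recall columns (F1 is their harmonic mean); a quick check for the `locate_symbol` row:

```python
# F1 = harmonic mean of precision and recall.
p, r = 0.984, 0.968            # locate_symbol precision / recall from the table
f1 = 2 * p * r / (p + r)
print(round(f1, 3))            # 0.976, matching the reported value
```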

## Quantization

Weights are quantized to int8 via dynamic quantization. Total model size is ~559 MB.

## Related Project

This model is fine-tuned for C4A (Context For AI), a knowledge modeling service that indexes code repositories and business documents for developer teams and AI agents.

## License

MIT
