OmniCX Qwen2.5-3B LoRA (Research Preview)
Table of Contents
- Model Description
- Model Details
- Training Data
- Training Procedure
- Evaluation
- Intended Uses
- Out-of-Scope Uses
- Limitations
- Bias, Risks, and Safety
- How to Use
- Versioning
- Citation
Model Description
This model is a QLoRA fine-tune of Qwen/Qwen2.5-3B-Instruct for extracting structured logistics and customer-experience analytics from support transcripts.
The target output is a strict JSON object compatible with LogisticsCXMetrics (behavioral_analytics, operational_analytics, diagnostic_reasoning).
The output schema and taxonomy are derived from curated reference files, which serve as the canonical taxonomy and rubric reference:
- Transcript-Only CX Difficulty Score_ Standards, Methods, and a Rigorous MVP Design.pdf: deep-research document (ChatGPT-generated) on transcript-only CX friction signals and effort-scoring methodology.
- Logistics CX Data Schema Development.docx: NotebookLM-assisted intent and schema research used to shape the intent taxonomy and extraction field design.
These definitions are operationalized in src/schema.py and reflected in the training labels.
This release is a research preview, not a production-certified model.
Project repository: OmniCX-Extractor
Model Details
- Base model: Qwen/Qwen2.5-3B-Instruct
- Fine-tuning method: QLoRA (4-bit) via Unsloth
- Adapter format: LoRA adapter
- Primary use case: structured extraction for logistics CX research workflows
Training Data
- Main training artifact: data/processed/golden_training_dataset.jsonl
- Sample size (iteration shown): 486 examples
- Data format: ChatML-style messages with assistant JSON labels
- Label space source: docs/knowledge/references (field/taxonomy source), mapped to LogisticsCXMetrics
- Synthetic data pipeline model usage:
  - Transcript generation: gpt-4o-mini (src/data_factory.py)
  - Schema-constrained labeling: gpt-4o-mini (src/extractor.py)
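Each training example follows the ChatML-style messages layout with a JSON label in the assistant turn. A minimal sketch of one record (the system prompt wording is an illustrative placeholder, not the exact prompt used in training; the label sections mirror LogisticsCXMetrics as described in this card):

```python
import json

# Sketch of one ChatML-style training record with an assistant JSON label.
# The system prompt text is an assumed placeholder; the label sections
# mirror the LogisticsCXMetrics schema described in this card.
record = {
    "messages": [
        {"role": "system", "content": "Extract LogisticsCXMetrics as strict JSON."},
        {"role": "user", "content": "Agent: Hi! Customer: Where is my order?"},
        {"role": "assistant", "content": json.dumps({
            "behavioral_analytics": {
                "customer_intent": "WISMO_Standard",
                "customer_effort_score": 2,
            },
        })},
    ]
}
print([m["role"] for m in record["messages"]])  # ['system', 'user', 'assistant']
```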
Training Procedure
- Max sequence length: 2048
- Total steps: 150
- Effective batch size: 8
- Learning rate: 2e-4 (linear schedule)
- Optimizer: adamw_8bit
- Environment: single 8 GB VRAM GPU setup (see training logs)
Detailed run record:
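The hyperparameters above imply roughly 2.5 passes over the 486-example dataset. A quick sanity check (only the effective batch size of 8 is stated; the per-device/accumulation split below is an assumption):

```python
# Sanity check: relate the stated training hyperparameters to data coverage.
max_steps = 150
effective_batch_size = 8   # e.g. per-device batch 2 x grad accumulation 4 (assumed split)
dataset_size = 486         # golden_training_dataset.jsonl (iteration shown)

examples_seen = max_steps * effective_batch_size
approx_epochs = examples_seen / dataset_size
print(examples_seen, round(approx_epochs, 2))  # 1200 examples, ~2.47 epochs
```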
Evaluation
Current evaluation (research preview):
- Eval examples: 32
- Runtime errors: 0
- Strict exact-match accuracy: 0.0% (0/32)
- Mean latency: 29.84s/sample
- Min / max latency: 16.89s / 45.72s
- Total latency: 954.94s
Selected per-field accuracy:
- customer_intent: 56.2%
- sentiment_trajectory: 65.6%
- address_change_requested: 100.0%
- escalation_requested: 100.0%
Detailed report:
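The gap between 0% strict exact-match and high per-field accuracy is expected under strict JSON equality: a single divergent field fails the entire example. A minimal sketch of the two metrics (the gold/prediction pair is hypothetical; field names follow this card's schema):

```python
# Sketch: strict exact-match vs. per-field accuracy on a hypothetical pair.

def strict_match(gold, pred):
    # Every field must match exactly for the example to count.
    return gold == pred

def field_accuracy(pairs, path):
    # Accuracy on one field, addressed by its key path into the JSON.
    hits = 0
    for gold, pred in pairs:
        g, p = gold, pred
        for key in path:
            g, p = g[key], p[key]
        hits += (g == p)
    return hits / len(pairs)

pairs = [
    ({"behavioral_analytics": {"customer_intent": "WISMO_Standard", "customer_effort_score": 2}},
     {"behavioral_analytics": {"customer_intent": "WISMO_Standard", "customer_effort_score": 3}}),
]
print(strict_match(*pairs[0]))   # False: one field differs
print(field_accuracy(pairs, ["behavioral_analytics", "customer_intent"]))  # 1.0
```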
Intended Uses
- Research and prototyping for logistics transcript understanding
- Structured extraction experiments under human review
- Error analysis and taxonomy tuning
Out-of-Scope Uses
- Autonomous production decisioning without human review
- Legal, financial, or regulatory adjudication
- High-risk customer-impacting automation
Limitations
- Small current eval set and strict metric sensitivity
- Potential mismatch to real-world transcript distribution
- Schema-conformant generation is not guaranteed in all cases
Bias, Risks, and Safety
- Synthetic or rubric-driven labels can encode design bias
- Output confidence is not calibrated for risk-critical decisions
- Use human oversight for escalations and customer-impacting actions
How to Use
Load adapter and run extraction (project-local)
```python
from src.inference import load_model, extract_with_finetuned

# Load the LoRA adapter from the local project path
model, tokenizer = load_model(model_path="models/qwen-logistics-lora")

result = extract_with_finetuned(
    transcript="Agent: ... Customer: ...",
    model=model,
    tokenizer=tokenizer,
    return_dict=True,
)
print(result)
```
Download from Hugging Face and run locally
```python
from huggingface_hub import snapshot_download
from src.inference import load_model, extract_with_finetuned

# Download the adapter repository from the Hugging Face Hub
local_model_dir = snapshot_download("mangesh-ux/omnicx-logistics-cx-extractor-qwen25-3b-lora")

model, tokenizer = load_model(model_path=local_model_dir)
result = extract_with_finetuned(
    transcript="Agent: ... Customer: ...",
    model=model,
    tokenizer=tokenizer,
    return_dict=True,
)
print(result)
```
Input and Output Contract
Input (single transcript):
```json
{
  "transcript": "Agent: ... Customer: ..."
}
```
Output (schema-aligned JSON):
```json
{
  "behavioral_analytics": {
    "customer_intent": "WISMO_Standard",
    "customer_effort_score": 2
  },
  "operational_analytics": {
    "delivery_exception_type": "Unknown / Not Explicitly Stated",
    "root_cause_category": "Unknown / Not Applicable",
    "agent_explicitly_confirmed_resolution": true
  },
  "diagnostic_reasoning": {
    "recommended_routing_queue": "Tier 1 Support"
  }
}
```
The full field contract and enums are defined in src/schema.py.
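Since schema-conformant generation is not guaranteed, a lightweight structural check before downstream use is prudent. A minimal sketch (the required-field lists below are illustrative; the authoritative contract lives in src/schema.py):

```python
import json

# Sketch: minimal structural check on the model's JSON output.
# The three top-level sections come from this card; the required-field
# lists are assumptions for illustration -- see src/schema.py for the
# real contract and enums.
REQUIRED = {
    "behavioral_analytics": ["customer_intent", "customer_effort_score"],
    "operational_analytics": ["delivery_exception_type", "root_cause_category"],
    "diagnostic_reasoning": ["recommended_routing_queue"],
}

def validate(raw: str) -> list[str]:
    """Return a list of missing-field errors (empty means it passed)."""
    obj = json.loads(raw)
    errors = []
    for section, fields in REQUIRED.items():
        if section not in obj:
            errors.append(f"missing section: {section}")
            continue
        for field in fields:
            if field not in obj[section]:
                errors.append(f"missing field: {section}.{field}")
    return errors

sample = '{"behavioral_analytics": {"customer_intent": "WISMO_Standard", "customer_effort_score": 2}}'
print(validate(sample))  # two missing sections reported
```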
Versioning
Recommended release naming:
- v0.1.0 - initial research preview
- v0.1.1+ - format, eval, and quality refinements
Citation
```bibtex
@misc{omnicx_qwen25_lora_preview,
  title     = {OmniCX Qwen2.5-3B LoRA (Research Preview)},
  author    = {Mangesh Gupta},
  year      = {2026},
  publisher = {Hugging Face},
  note      = {QLoRA fine-tune for logistics CX structured extraction}
}
```