Kerala Crime Detective: Malayalam + English + Manglish AI
Solve crimes the Kerala way: comedy, Manglish, and serious detective work, all in one model!
A fine-tuned Gemma 3 1B that understands Kerala crime reports, FIR details, and cyber fraud cases, and responds in Malayalam, English, or Manglish with comedy and serious investigation steps.
What This Model Does
| Mode | Description |
|---|---|
| Malayalam Comedy | Solves crimes with Manglish humor, Kerala cultural references, local jokes |
| Serious English | Professional CID-style investigation: evidence, suspects, legal sections |
| Cyber Crime Expert | Specialized in UPI fraud, SIM swap, sextortion, fake jobs, investment scams |
| Mixed Style | Comedy and serious advice combined; the most popular mode |
Try the Live Demo
Open in HuggingFace Spaces
Model Details
| Property | Value |
|---|---|
| Base Model | google/gemma-3-1b-it |
| Fine-tuning Method | Supervised Fine-Tuning (SFT) with TRL |
| Training Framework | HuggingFace TRL + Transformers |
| Hardware | Kaggle T4 GPU |
| Languages | Malayalam, English, Manglish |
| Parameters | ~1 Billion |
| Precision | bfloat16 |
| License | Apache 2.0 |
Quick Start
Installation
pip install transformers accelerate torch
Basic Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "wincode/kerala-crime-detective-gemma"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="eager",
)
model.eval()

def solve_crime(crime_report: str, mode: str = "comedy") -> str:
    """Generate a response to a crime report in the requested mode."""
    system_prompts = {
        "comedy": (
            "You are Kerala Nadodikattu Detective, a comedy crime solver. "
            "Solve crimes using Malayalam, Manglish and English mix. "
            "Use humor, local references, Kerala culture."
        ),
        "serious": (
            "You are a senior Kerala Police CID officer. "
            "Analyze crime reports professionally with evidence analysis, "
            "suspect profiling, investigation steps and legal sections."
        ),
        "cyber": (
            "You are Kerala Cyber Cell's top investigator. "
            "Specialize in UPI fraud, SIM swap, sextortion, fake jobs. "
            "Provide immediate victim steps, recovery options, legal recourse."
        ),
    }
    # Unknown modes fall back to the comedy prompt.
    messages = [
        {"role": "system", "content": system_prompts.get(mode, system_prompts["comedy"])},
        {"role": "user", "content": crime_report},
    ]
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    # Gemma's chat template already inserts the BOS token, so skip special
    # tokens here to avoid a duplicate BOS at the start of the prompt.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=400,
            do_sample=True,
            temperature=0.8,
            top_p=0.9,
            repetition_penalty=1.1,
            pad_token_id=tokenizer.pad_token_id,
        )
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# --- Example 1: Malayalam Comedy Mode ---
report1 = """
Crime Report:
Location: Thrissur Pooram grounds
Time: 11 PM
Crime: 2kg gold ornaments missing from elephant caparison
Evidence: Footprints, torn dhoti piece
FIR: THR/2024/445
"""
print(solve_crime(report1, mode="comedy"))
# --- Example 2: Cyber Crime Mode ---
report2 = """
Cyber Crime FIR:
Victim: Anitha, Homemaker, Palakkad
Crime: Received WhatsApp message saying I won Rs 50 lakh lottery.
They asked Rs 15,000 processing fee. I paid. Number now switched off.
Amount lost: Rs 15,000
"""
print(solve_crime(report2, mode="cyber"))
# --- Example 3: Serious Investigation Mode ---
report3 = """
Hit and Run:
Location: NH-66 Kannur Bypass
Time: 11:30 PM
Victim: Biker, critical, ICU
Evidence: White paint transfer on victim bike
Witness: Trucker saw partial plate KL-13
"""
print(solve_crime(report3, mode="serious"))
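One detail of `solve_crime` worth knowing: an unrecognized `mode` does not raise an error, it falls back to the comedy prompt. That includes `"mixed"`, which the mode table lists but the quick-start dictionary does not define. The sketch below reproduces only that lookup, with placeholder prompt strings. `resolve_mode` and `resolve_mode_strict` are hypothetical helper names used for illustration, not part of the model's code.

```python
# Placeholder prompts; the real strings live inside solve_crime().
SYSTEM_PROMPTS = {
    "comedy": "<comedy prompt>",
    "serious": "<serious prompt>",
    "cyber": "<cyber prompt>",
}


def resolve_mode(mode: str) -> str:
    """Mirror solve_crime(): unknown modes silently fall back to comedy."""
    return mode if mode in SYSTEM_PROMPTS else "comedy"


def resolve_mode_strict(mode: str) -> str:
    """Hypothetical stricter variant: reject unknown modes instead."""
    if mode not in SYSTEM_PROMPTS:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(SYSTEM_PROMPTS)}"
        )
    return mode


print(resolve_mode("cyber"))  # cyber
print(resolve_mode("mixed"))  # comedy
```

If a silent fallback is not what you want in an application, a strict check like the second helper makes typos in the mode name visible early.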
Training Dataset
Trained on the Kerala Crime Comedy Dataset, a custom dataset covering:
| Category | Examples |
|---|---|
| Malayalam Comedy Crime | Gold theft, car theft, chain snatching, bicycle theft |
| Cyber Crimes | Investment fraud, SIM swap, sextortion, OLX scam, fake jobs |
| Serious Crime Solving | Murder investigation, drug bust, hit and run |
| Property Crime | House breaking, land disputes |
| Support Cases | Domestic violence, senior citizen fraud, missing persons |
| Environmental Crime | Illegal sand mining |
| Financial Fraud | Microfinance harassment, lottery fraud |
| Comedy Light Cases | Christmas star theft, goat eating crops, road rage |
Dataset stats:
- Total examples: 21+ (growing)
- Languages: Malayalam, English, Manglish
- Format: Chat (system + user + assistant)
- Average response length: 300-500 tokens
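To make the chat format concrete, here is a sketch of what one training record might look like. It assumes the conversational layout that TRL's SFT tooling accepts (a `messages` list of `role`/`content` pairs); the actual column name in the dataset and the text of the record are assumptions for illustration, not copied from the dataset. `validate_record` is a hypothetical helper.

```python
import json

# Hypothetical record: the wording is illustrative, not taken from the dataset.
record = {
    "messages": [
        {"role": "system", "content": "You are Kerala Cyber Cell's top investigator."},
        {"role": "user", "content": "Cyber Crime FIR: Rs 15,000 lost to a fake lottery message."},
        {"role": "assistant", "content": "Call 1930 immediately, then report on cybercrime.gov.in."},
    ]
}


def validate_record(rec: dict) -> bool:
    """Check the system + user + assistant ordering and non-empty content."""
    messages = rec["messages"]
    roles = [m["role"] for m in messages]
    has_text = all(m["content"].strip() for m in messages)
    return roles == ["system", "user", "assistant"] and has_text


# One record per line (JSONL); ensure_ascii=False keeps Malayalam script readable.
line = json.dumps(record, ensure_ascii=False)
print(validate_record(json.loads(line)))  # True
```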
Example Outputs
Malayalam Comedy Mode
Input:
Crime: Gold chain snatching at Calicut beach. Victim: teacher. Suspect: young male on motorcycle.
Output:
Chain Snatching Investigation Report!
Chechi! Chain poyi aano?
Bike-il vannu, chain eduthu, bike-il poyi!
Classic Calicut evening crime!
Evidence Analysis:
- Skid marks: Direction analysis possible
- Time: Evening = witnesses available
- Victim is teacher: Respected community member
Serious Steps:
- Alert gold shops in 20km radius IMMEDIATELY
- Deploy two-wheeler patrol on beach road
- CCTV from SM Street main junction
- Recovery chance: 60% if acted within 6 hours!
Chechi, we will get justice!
Cyber Crime Expert Mode
Input:
UPI fraud: Someone sent QR code saying it will send me money. I scanned and entered PIN. Rs 35,000 deducted.
Output:
UPI QR Scam - Classic Trap!
CALL 1930 RIGHT NOW (Cyber Crime Helpline - Free, 24/7)
Every minute counts for account freeze!
How the scam worked:
In UPI there are TWO types of QR codes:
- COLLECT request = You are PAYING (needs PIN = money GOES OUT)
- PAYMENT = You are RECEIVING (NO PIN needed ever)
GOLDEN RULE: You NEVER enter PIN to RECEIVE money!
If someone asks PIN to send you money = 100% SCAM always!
Recovery steps:
1. Call 1930 immediately
2. Report to your bank fraud line
3. Screenshot the UPI ID and report on cybercrime.gov.in
Recovery probability: 50-60% if reported within 4 hours!
Limitations
- Model is fine-tuned on a small dataset (21 examples), so responses may not always be perfectly formatted
- Malayalam script quality depends on base model's multilingual capability
- For real emergencies, always contact actual Kerala Police: 100 or Cyber Crime: 1930
- Model provides educational and entertainment value; it is not a substitute for real legal advice
- Responses may vary due to sampling temperature
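The last point follows from the sampling settings in the quick-start code (`do_sample=True`, `temperature=0.8`, `top_p=0.9`). A minimal sketch, assuming you want repeatable output for testing or comparison: switch to greedy decoding with `do_sample=False`. `generation_kwargs` is a hypothetical helper that only builds the keyword arguments; the values mirror the quick-start call.

```python
def generation_kwargs(deterministic: bool = False) -> dict:
    """Build keyword arguments for model.generate().

    deterministic=True selects greedy decoding, which should return the same
    text for the same input on the same software and hardware setup.
    """
    kwargs = {"max_new_tokens": 400, "repetition_penalty": 1.1}
    if deterministic:
        # Greedy decoding: temperature and top_p are left out on purpose.
        kwargs["do_sample"] = False
    else:
        # Same sampling settings as the quick-start example.
        kwargs.update(do_sample=True, temperature=0.8, top_p=0.9)
    return kwargs


print(generation_kwargs(deterministic=True))
```

These arguments would be passed as `model.generate(**inputs, **generation_kwargs(deterministic=True), pad_token_id=tokenizer.pad_token_id)`. Greedy output tends to be flatter, so the comedy mode may lose some of its variety.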
Important Disclaimer
This model is for educational and entertainment purposes only.
For real crimes and emergencies:
- Police Emergency: 100
- Cyber Crime Helpline: 1930
- Women's Helpline: 1091
- Child Helpline: 1098
- Cybercrime Portal: cybercrime.gov.in
Training Details
# Fine-tuning configuration used
from trl import SFTConfig

sft_config = SFTConfig(
    max_length=1024,
    num_train_epochs=5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch = 2 * 8 = 16
    gradient_checkpointing=True,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    weight_decay=0.01,
    bf16=True,
    optim="adamw_torch_fused",
)
Hardware used: Kaggle T4 GPU (15GB VRAM)
Training time: ~25 minutes for 5 epochs
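For a sense of scale, the configuration above implies very few optimizer updates. The arithmetic below is a rough sketch that assumes exactly 21 training examples (the figure given under Limitations), a single GPU, and no held-out split. The exact step count depends on how the trainer version handles the final partial accumulation window, so both a low and a high estimate are shown.

```python
import math

num_examples = 21      # assumption: all 21 examples are used for training
per_device_batch = 2
grad_accum = 8
epochs = 5
warmup_ratio = 0.1

effective_batch = per_device_batch * grad_accum                        # 16
micro_batches_per_epoch = math.ceil(num_examples / per_device_batch)   # 11

# Low estimate: the trailing partial accumulation window is dropped.
steps_low = max(micro_batches_per_epoch // grad_accum, 1) * epochs     # 5
# High estimate: the trailing partial window still triggers an update.
steps_high = math.ceil(micro_batches_per_epoch / grad_accum) * epochs  # 10

# With warmup_ratio=0.1, warmup covers about one step in either case.
warmup_low = math.ceil(steps_low * warmup_ratio)                       # 1
warmup_high = math.ceil(steps_high * warmup_ratio)                     # 1

print(effective_batch, micro_batches_per_epoch)  # 16 11
print(steps_low, steps_high)                     # 5 10
print(warmup_low, warmup_high)                   # 1 1
```

With only 5 to 10 updates in total, the cosine schedule and warmup have little room to act, which is consistent with the small-dataset caveat in the Limitations section.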
Roadmap
- Expand dataset to 500+ examples
- Add more Malayalam script examples
- Add Manglish-only mode
- Support for audio input (voice crime reports)
- Add more cyber crime patterns (2024-2025 new scams)
- Quantized version (GGUF) for local deployment
- API endpoint for police department integration
Related Resources
| Resource | Link |
|---|---|
| Model | wincode/kerala-crime-detective-gemma |
| Dataset | wincode/kerala-crime-comedy-dataset |
| Live Demo | Spaces: kerala-crime-detective |
| Base Model | google/gemma-3-1b-it |
Credits
- Base Model: Google Gemma 3 (thank you, Google DeepMind)
- Fine-tuning: HuggingFace TRL (SFTTrainer)
- Training Platform: Kaggle (free T4 GPU)
- Demo Framework: Gradio
- Inspiration: Kerala Police, Kerala comedy films, and every aunty who knows everything
License
This model is released under the Apache 2.0 License.
The base model Gemma 3 is subject to Google's Gemma Terms of Use.
Made with ❤️ in Kerala | Nammude Kerala, Nammude Detective!