Malaysian-Turn-Detector-Qwen3-0.6B

Fine-tuned Qwen3-0.6B for real-time turn-end detection in Malaysian multilingual call center conversations.

The model predicts P(<|im_end|>) — the probability that a speaker has finished their turn. Designed for low-latency voice agent pipelines (e.g. LiveKit) to determine when to respond.

How It Works

Given a conversation so far, the model outputs the probability of <|im_end|> as the next token:

P(im_end) > 0.5 → speaker is done talking (turn complete)
P(im_end) < 0.5 → speaker is still talking (turn incomplete)

Usage

import torch
import math
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Scicom-intl/Malaysian-Turn-Detector-Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).cuda().eval()

IM_END_ID = tokenizer.convert_tokens_to_ids("<|im_end|>")

def get_turn_end_prob(text):
    """Returns probability that the speaker's turn is complete."""
    # Strip trailing <|im_end|> so the model predicts whether to emit it
    if text.endswith("<|im_end|>"):
        text = text[:-len("<|im_end|>")]
    inputs = tokenizer(text, return_tensors="pt").to("cuda")
    with torch.no_grad():
        logits = model(**inputs).logits
    prob = F.softmax(logits[0, -1], dim=-1)[IM_END_ID].item()
    return prob

# Complete turn - should be high probability
text = "<|im_start|>user\nHello, saya nak tanya pasal bil saya.<|im_end|>\n<|im_start|>assistant\nBoleh, sila berikan nombor akaun anda."
prob = get_turn_end_prob(text)
print(f"P(turn complete) = {prob:.4f}")  # ~0.74

# Incomplete turn - should be low probability
text = "<|im_start|>user\nHello, saya nak tanya pasal bil saya.<|im_end|>\n<|im_start|>assistant\nBoleh, sila berikan nombor"
prob = get_turn_end_prob(text)
print(f"P(turn complete) = {prob:.4f}")  # ~0.00

Eval Results

Test set: 1200 samples (600 positive + 600 negative), 50 conversations per language pair.

Overall (threshold = 0.5)

Metric	Score
Accuracy	97.00%
Precision	100.00%
Recall	94.00%
F1	96.91%

Per Language

Language Pair	Overall	Positive	Negative
chinese-english	97.00%	94.00%	100.00%
chinese-malay	97.00%	94.00%	100.00%
chinese-tamil	97.00%	94.00%	100.00%
english-chinese	100.00%	100.00%	100.00%
english-malay	97.00%	94.00%	100.00%
english-tamil	94.00%	88.00%	100.00%
malay-chinese	97.00%	94.00%	100.00%
malay-english	93.00%	86.00%	100.00%
malay-tamil	95.00%	90.00%	100.00%
tamil-chinese	100.00%	100.00%	100.00%
tamil-english	98.00%	96.00%	100.00%
tamil-malay	99.00%	98.00%	100.00%

Threshold Sweep

Threshold	Accuracy	Precision	Recall	F1
0.1	99.33%	99.66%	99.00%	99.33%
0.2	99.00%	99.66%	98.33%	98.99%
0.3	98.83%	100.00%	97.67%	98.82%
0.4	98.50%	100.00%	97.00%	98.48%
0.5	97.00%	100.00%	94.00%	96.91%
0.6	96.08%	100.00%	92.17%	95.92%
0.7	94.67%	100.00%	89.33%	94.37%
0.8	90.92%	100.00%	81.83%	90.01%
0.9	83.92%	100.00%	67.83%	80.83%

Probability Distribution

Class	Mean	Median	Min	Max
Positive (turn complete)	0.8817	0.9569	0.0046	0.9997
Negative (turn incomplete)	0.0010	0.0000	0.0000	0.2509

Training

Base model: Qwen/Qwen3-0.6B
Training data: Positive samples only (complete conversations ending with <|im_end|>)
Loss: Liger Fused Linear Cross Entropy
Attention: Flash Attention 2
Precision: bfloat16
Block size: 8192 (multipacked)
Batch size: 4 x 8 gradient accumulation
Learning rate: 2e-5 (constant)
Epochs: 1

Training Data Sources

Dataset	Source
Call Center Language Switching	Scicom-intl/Call-Center-Language-Switching
Function Call	Scicom-intl/Function-Call
Malaysian Multiturn Chat Assistant	mesolitica/Malaysian-Multiturn-Chat-Assistant
Malaysian Speech Instructions	mesolitica/Malaysian-Speech-Instructions

WandB

Source code

Source code at https://github.com/Scicom-AI-Enterprise-Organization/small-ablation/tree/main/turn-detector

Downloads last month: 345

Safetensors

Model size

0.6B params

Tensor type

BF16

Model tree for Scicom-intl/Malaysian-Turn-Detector-Qwen3-0.6B

Base model

Qwen/Qwen3-0.6B-Base

Finetuned

Qwen/Qwen3-0.6B

Finetuned

(794)

this model