RAS1981/qwen3-0.6b-turn-detection-v3

This model is a fine-tuned version of RAS1981/qwen3-0.6b-turn-detection-v1. It has been trained on an expanded dataset, incorporating approximately 50,000 additional examples to improve robustness and generalization in conversational turn detection.

Model Details

  • Base Model: RAS1981/qwen3-0.6b-turn-detection-v1
  • Training Data: Original V1 dataset + ~50k new examples.
  • Task: Turn Detection (Binary Classification via Next-Token Prediction).
  • Language: Russian (primary evaluation context), English.
  • Architecture: Qwen3-0.6B (Transformer).

Intended Use

This model is designed for real-time voice bots to detect when a user has finished speaking. It predicts the probability of the <|im_end|> token at the end of a text segment.

  • Input: ASR transcript of the user's speech.
  • Output: Probability of turn completion.
  • Threshold: 0.5 (EOS Probability > 0.5 indicates "Turn Finished").
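The thresholding step above can be sketched as a tiny helper. This is a minimal illustration, assuming a function (here called `is_turn_finished`) that receives the EOS probability produced by the model; the 0.5 cutoff is the one stated in this card.

```python
def is_turn_finished(eos_prob: float, threshold: float = 0.5) -> bool:
    """Return True when the EOS probability indicates a completed turn."""
    return eos_prob > threshold

# A strict > comparison: exactly 0.5 still counts as "keep waiting".
print(is_turn_finished(0.99))  # True
print(is_turn_finished(0.30))  # False
```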

Evaluation Results

The model was evaluated on the same 75-sample test set used for V2, categorized into:

  • G1 (FINISHED): Completed sentences (Expected: END).
  • G2 (UNFINISHED): Incomplete sentences (Expected: WAIT).
  • G3 (PAUSE): Pauses/fillers (Expected: WAIT).

Summary Metrics

  • Total Samples: 75
  • Correct Predictions: 43 (57.3%)
  • Failures: 32 (42.7%)
  • Threshold: 0.5
Metric          Count  Percentage  Description
True Negative    20     26.7%      Correctly identified an incomplete turn (WAIT)
False Positive   32     42.7%      Incomplete turn misclassified as finished (interruption)
False Negative    0      0.0%      Finished turn misclassified as incomplete (added latency)
True Positive    23     30.7%      Correctly identified a finished turn (END)
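As a sanity check, the summary percentages can be re-derived from the raw confusion-matrix counts reported above (TN=20, FP=32, FN=0, TP=23):

```python
# Confusion-matrix counts from the evaluation table above.
tn, fp, fn, tp = 20, 32, 0, 23
total = tn + fp + fn + tp

accuracy = (tp + tn) / total                        # correct predictions / all samples
print(f"Total: {total}")                            # 75
print(f"Accuracy: {accuracy:.1%}")                  # 57.3%
print(f"False positives: {fp / total:.1%}")         # 42.7%
```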

Performance by Group

Group            Total  Correct  Incorrect  Accuracy  Precision  Recall  F1
G1 (Finished)     23      23        0       100.0%      1.00      1.00   1.00
G2 (Unfinished)   42      20       22        47.6%      0.00      0.00   0.00
G3 (Pause)        10       0       10         0.0%      0.00      0.00   0.00

Analysis & Comparison to V2

Despite the addition of 50k training examples, V3 shows slightly lower accuracy (57.3% vs 60.0%) compared to V2 on this specific test set.

  • G1 (Finished): Maintains perfect performance (100%). The model never misses a true end-of-turn.
  • G2 (Unfinished): Accuracy dropped slightly (47.6% vs 50.0%). The model remains overly aggressive in predicting completion.
  • G3 (Pause): Performance dropped to 0%. The model now misclassifies all pauses/fillers as completed turns.

Key Observations

The increased dataset size seems to have biased the model further towards predicting "Complete." This might be due to:

  1. Data Imbalance: The new 50k examples likely contain a high proportion of completed turns or "clean" text, reinforcing the bias against incomplete/messy speech.
  2. Overfitting to Completeness: The model has become extremely confident in predicting EOS, often assigning >99% probability to incomplete sentences (e.g., "Поэтому для начала очень важно, чтобы там находилось это.", roughly "So, to begin with, it is very important that this is there.", scored 99.27%).

Failure Patterns

The failures are identical in nature to V2 but often with higher confidence:

  • Text: "...чтобы там находилось это." ("...that this is there.") (EOS: 0.99)
  • Text: "...какие варианты у вас." ("...what options you have.") (EOS: 0.99)
  • Text: "...для меня слишком." ("...too much for me.") (EOS: 0.99)

The model ignores semantic cues such as "для начала" ("to begin with") and "чтобы" ("so that") that signal continuation, treating almost any syntactically plausible clause end as a turn end.

Recommendations for V4

To correct this "interruption bias":

  1. Resample Training Data: Drastically reduce the number of "Complete" examples or upsample "Incomplete" examples.
  2. Hard Negative Mining: Generate synthetic "Incomplete" examples by cutting off valid sentences at high-probability points (conjunctions, prepositions) and labeling them as WAIT.
  3. Pause-Specific Training: Explicitly fine-tune on a dataset of fillers ("э-э" / "uh", "ну" / "well", "м-м" / "hm") labeled as incomplete.
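Recommendation 2 (hard negative mining) can be sketched as follows. This is a minimal illustration, not the actual data pipeline: the cue list, the `make_hard_negatives` helper, and the WAIT label string are all assumptions for the example.

```python
# Cut complete sentences right after continuation cues (conjunctions,
# prepositions) and label the truncated prefix as an unfinished turn.
CUT_CUES = ["чтобы", "для", "и", "но", "если"]  # illustrative cue list

def make_hard_negatives(sentence: str) -> list[tuple[str, str]]:
    """Truncate a sentence after each cue word, labelling each prefix WAIT."""
    examples = []
    tokens = sentence.rstrip(".").split()
    for i, tok in enumerate(tokens[:-1]):  # never cut at the final word
        if tok.lower() in CUT_CUES:
            prefix = " ".join(tokens[: i + 1])
            examples.append((prefix, "WAIT"))
    return examples

pairs = make_hard_negatives("Я позвонил, чтобы уточнить детали заказа.")
# Produces the hard negative ("Я позвонил, чтобы", "WAIT")
```

Because the prefix ends on a subordinating conjunction, it is syntactically cut off exactly where V3 currently over-predicts completion, which is what makes it a hard negative.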

How to Use (Inference)

from unsloth import FastLanguageModel
import torch

model_name = "RAS1981/qwen3-0.6b-turn-detection-v3"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    load_in_4bit=True,
    max_seq_length=2048,
)
EOS_ID = 151645 # <|im_end|>

def get_turn_probability(text):
    messages = [
        {"role": "system", "content": "Ты определяешь конец реплики пользователя по смыслу."},
        {"role": "user", "content": text}
    ]
    # Important: Disable thinking and strip trailing EOS for prediction
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False, enable_thinking=False)
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")
    
    # Strip the auto-added EOS (and its attention entry) so the model must predict it
    if inputs.input_ids[0][-1] == EOS_ID:
        inputs["input_ids"] = inputs.input_ids[:, :-1]
        inputs["attention_mask"] = inputs.attention_mask[:, :-1]
        
    with torch.no_grad():
        logits = model(**inputs).logits[:, -1, :]
        probs = torch.softmax(logits, dim=-1)
        eos_prob = probs[0, EOS_ID].item()
        
    return eos_prob

text = "Алло, здравствуйте"
print(f"Turn Probability: {get_turn_probability(text):.4f}")
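Given the interruption bias documented above, a voice bot consuming these probabilities may want to debounce the decision rather than act on a single reading. This is a hedged sketch, not part of the model card: the consecutive-readings rule and window size are assumptions, and in practice the probabilities would come from `get_turn_probability` on successive ASR updates.

```python
def should_end_turn(eos_probs: list[float], threshold: float = 0.5,
                    consecutive: int = 2) -> bool:
    """End the turn only after `consecutive` probabilities exceed the threshold."""
    if len(eos_probs) < consecutive:
        return False
    return all(p > threshold for p in eos_probs[-consecutive:])

# A single high spike is ignored; two in a row end the turn.
print(should_end_turn([0.2, 0.9]))         # False
print(should_end_turn([0.2, 0.9, 0.95]))   # True
```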