Pyannote Segmentation 3.0 - GGUF

Native GGUF port of pyannote/segmentation-3.0 for speaker diarization.

Model details

| Property | Value |
|---|---|
| Architecture | SincNet + 4× biLSTM + Linear + LogSoftmax |
| Format | GGUF (F32) |
| Size | 5.7 MB |
| Tensors | 41 |
| Output classes | 7 (powerset mapping → 3 speakers) |
| Input | 10 s mono 16 kHz audio frames |
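
The "7 classes → 3 speakers" row can be illustrated with a small sketch. It assumes the seven classes are all speaker subsets of size 0, 1, or 2 drawn from three speakers, ordered smallest sets first; that ordering and the helper names below are assumptions for illustration and should be checked against the exported model before relying on them.

```python
from itertools import combinations

def powerset_classes(num_speakers=3, max_simultaneous=2):
    """List every speaker subset of size 0..max_simultaneous, smallest sets first."""
    classes = []
    for size in range(max_simultaneous + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes

def decode_frame(log_probs, classes, num_speakers=3):
    """Map one frame of class scores to a 0/1 activity flag per speaker."""
    # Pick the highest-scoring powerset class (argmax over the 7 outputs).
    best = max(range(len(log_probs)), key=log_probs.__getitem__)
    active = [0] * num_speakers
    for spk in classes[best]:
        active[spk] = 1
    return active

classes = powerset_classes()
print(len(classes))  # 1 empty set + 3 single speakers + 3 speaker pairs
print(classes)
```

Under this assumed ordering, a frame whose best class is the pair (0, 1) decodes to speakers 0 and 1 active and speaker 2 silent.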

The model performs joint voice-activity detection, speaker segmentation, and overlapped-speech detection on short audio chunks. Downstream clustering then produces full-file speaker diarization.
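
As a rough sketch of how the three tasks relate, voice activity and overlapped speech can both be read off the per-frame speaker flags once the model output has been decoded. The input layout (one list of 0/1 flags per frame) and the function names are hypothetical; results are reported in frame indices because the frame duration is not stated here.

```python
def summarize_frames(activity):
    """activity: list of frames, each a list of 0/1 flags, one per speaker."""
    # Speech is present when at least one speaker is active.
    speech = [int(sum(frame) >= 1) for frame in activity]
    # Overlap is present when two or more speakers are active at once.
    overlap = [int(sum(frame) >= 2) for frame in activity]
    return speech, overlap

def runs(flags):
    """Return [start, end) frame-index pairs for each run of 1s."""
    out, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(flags)))
    return out

activity = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
speech, overlap = summarize_frames(activity)
print(runs(speech))   # frames where anyone is speaking
print(runs(overlap))  # frames where speakers overlap
```

The per-speaker segments come from applying `runs` to each speaker's column in the same way; clustering across chunks is a separate step not shown here.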

Usage with CrispASR

crispasr \
  --diarize-method pyannote \
  --sherpa-segment-model pyannote-seg-3.0.gguf \
  audio.wav
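
Since the model expects mono 16 kHz input, it can help to inspect a WAV file before running the command above. The sketch below uses only Python's standard `wave` module; `check_wav` is a hypothetical helper, and whether CrispASR resamples or downmixes on its own is not stated here, so treat this as an optional pre-check.

```python
import wave

def check_wav(path, want_rate=16000, want_channels=1):
    """Report whether a PCM WAV file matches the expected rate and channel count."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        channels = w.getnchannels()
        frames = w.getnframes()
    return {
        "ok": rate == want_rate and channels == want_channels,
        "sample_rate": rate,
        "channels": channels,
        "duration_s": frames / rate,
    }
```

Calling `check_wav("audio.wav")` returns a dictionary whose `ok` field is true only for mono 16 kHz files.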

Provenance

Weights were exported directly from the original PyTorch checkpoint (pyannote/segmentation-3.0) into GGUF format, preserving full F32 precision across all 41 tensors.
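
The tensor count can be checked against a downloaded file by reading the fixed-size GGUF header. This is a minimal sketch that assumes GGUF version 2 or later, where the header is the 4-byte magic `GGUF`, a little-endian uint32 version, then uint64 tensor and metadata key/value counts; `read_gguf_header` is a hypothetical helper, not part of any library.

```python
import struct

def read_gguf_header(path):
    """Read magic, version, tensor count, and metadata KV count from a GGUF file."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError("not a GGUF file")
        # Assumes GGUF v2+: both counts are little-endian uint64.
        (version,) = struct.unpack("<I", f.read(4))
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

For the file described here, `read_gguf_header("pyannote-seg-3.0.gguf")["tensor_count"]` would be expected to equal 41.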

License

MIT, the same license as the original pyannote-audio segmentation-3.0 model.
