WavCochCausalV8192-vocoder

WavCoch is a causal waveform-to-cochleagram tokenizer by Greta Tuckute and Klemen Kotar.

Model Details

Parameter	Value
Parameters	~24.42M
Window Size	1001
Hop Length	80
Encoder Dim	512
Vocabulary Size	8192
Includes Vocoder	True
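
The figures in the table imply a token rate and a nominal bitrate, which can be worked out with a little arithmetic. The sketch below is only illustrative: it takes the 16 kHz sampling rate from the usage snippet, and it assumes that Window Size and Hop Length are measured in samples and that the tokenizer emits one code per hop. None of these are stated explicitly in this card. The helper `approx_num_tokens` is a hypothetical name and ignores any padding the model may apply.

```python
import math

SAMPLE_RATE_HZ = 16_000  # taken from sampling_rate=16000 in the usage snippet
HOP_LENGTH = 80          # from the table; assumed to be in samples
WINDOW_SIZE = 1001       # from the table; assumed to be in samples
VOCAB_SIZE = 8192        # from the table

# Assumption: one code is emitted per hop.
tokens_per_second = SAMPLE_RATE_HZ / HOP_LENGTH
bits_per_token = math.log2(VOCAB_SIZE)
bitrate_bps = tokens_per_second * bits_per_token

# Durations of one hop and one analysis window, in milliseconds.
hop_ms = 1000 * HOP_LENGTH / SAMPLE_RATE_HZ
window_ms = 1000 * WINDOW_SIZE / SAMPLE_RATE_HZ


def approx_num_tokens(num_samples: int) -> int:
    """Rough token count for a clip, ignoring any padding (hypothetical helper)."""
    return num_samples // HOP_LENGTH


print(f"tokens per second: {tokens_per_second}")
print(f"bits per token:    {bits_per_token}")
print(f"nominal bitrate:   {bitrate_bps} bit/s")
print(f"hop / window:      {hop_ms} ms / {window_ms} ms")
```

Under these assumptions, one second of audio corresponds to roughly `approx_num_tokens(16000)` codes; the exact count depends on how the model handles clip boundaries.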

Usage

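from transformers import AutoModel

# trust_remote_code=True lets transformers run the custom model code shipped in this repo.
wavcoch = AutoModel.from_pretrained(
    "TuKoResearch/WavCochCausalV8192-vocoder",
    trust_remote_code=True,
)

# waveform_tensor is a tensor of 16 kHz audio that you supply.

# Waveform -> discrete codes (vocabulary size 8192).
codes = wavcoch.quantize(waveform_tensor)

# Codes -> cochleagram.
coch = wavcoch.decode(codes)

# Embeddings for probing: the single hidden-state layer (see Notes).
embeddings = wavcoch(
    input_values=waveform_tensor,
    output_hidden_states=True,
    sampling_rate=16000,
).hidden_states[0]

# Codes -> waveform, using the bundled vocoder.
audio = wavcoch.decode_audio(codes)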

Notes

This repo includes a bundled vocoder and supports decode_audio(...) for end-to-end waveform synthesis.

When called with output_hidden_states=True, WavCoch exposes a single hidden-state layer: the post-FSQ projected embedding sequence used for direct probing.
