metadata
license: other
tags:
  - audio
  - tts
  - vae
  - vocoder
  - safetensors
  - dramabox
  - resembleai
base_model: ResembleAI/Dramabox

Dramabox — Audio VAE + Vocoder

This repository contains a merged safetensors checkpoint extracted from ResembleAI/Dramabox.

It includes only the audio-generation weights:

| Component | Key prefix | Description |
| --- | --- | --- |
| Audio VAE | `audio_vae.*` | Encoder/decoder VAE operating on mel-spectrograms (BF16) |
| Vocoder | `vocoder.vocoder.*` | HiFi-GAN style neural vocoder (BF16) |
| BWE Generator | `vocoder.bwe_generator.*` | Bandwidth extension generator (BF16) |
| Mel STFT | `vocoder.mel_stft.*` | Mel filterbank + STFT forward/inverse basis (BF16) |

All weights are stored in BFloat16.
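To check that a downloaded file matches the component layout described here, the keys can be counted per prefix. The following is a minimal sketch, not part of the original repository: `COMPONENT_PREFIXES` and `count_keys_by_component` are hypothetical names, and the prefixes are copied from the listing above with the trailing `*` dropped.

```python
from collections import Counter
from typing import Dict, Iterable

# Prefixes copied from the component listing (assumption: the trailing "*"
# is a wildcard and every key of a component starts with the rest).
COMPONENT_PREFIXES = {
    "Audio VAE": "audio_vae.",
    "Vocoder": "vocoder.vocoder.",
    "BWE Generator": "vocoder.bwe_generator.",
    "Mel STFT": "vocoder.mel_stft.",
}


def count_keys_by_component(keys: Iterable[str]) -> Dict[str, int]:
    """Count how many key names fall under each component prefix."""
    counts: Counter = Counter()
    for key in keys:
        for name, prefix in COMPONENT_PREFIXES.items():
            if key.startswith(prefix):
                counts[name] += 1
                break
        else:
            # Keys that match none of the listed prefixes are reported separately.
            counts["(unmatched)"] += 1
    return dict(counts)
```

The function only looks at key names, so it can be fed `f.keys()` from an open `safe_open` handle without reading any tensor data. A non-zero `(unmatched)` count would suggest the file contains weights beyond the four listed components.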

File

| File | Contents |
| --- | --- |
| `dramabox-audiovae-vocoder.safetensors` | audio_vae + vocoder (merged) |

Usage

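from safetensors import safe_open

# framework="pt" returns torch tensors, so PyTorch must be installed.
# This reads every tensor in the merged checkpoint into CPU memory.
tensors = {}
with safe_open("dramabox-audiovae-vocoder.safetensors", framework="pt", device="cpu") as f:
    for key in f.keys():
        tensors[key] = f.get_tensor(key)

# Print the first five key names as a quick sanity check.
print(list(tensors.keys())[:5])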
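Because the checkpoint is merged, a single component usually has to be separated out before it can be loaded into a model. The sketch below shows one way to do that by filtering on a key prefix and stripping it. `extract_component` is a hypothetical helper, not something this repository ships, and whether the stripped key names line up with a given model class is an assumption to verify against the original ResembleAI/Dramabox code.

```python
from typing import Dict, TypeVar

T = TypeVar("T")


def extract_component(tensors: Dict[str, T], prefix: str) -> Dict[str, T]:
    """Return the entries whose key starts with `prefix`, with the prefix removed.

    Hypothetical helper: it works on any mapping of key names to values,
    such as the `tensors` dict built in the usage snippet.
    """
    return {
        key[len(prefix):]: value
        for key, value in tensors.items()
        if key.startswith(prefix)
    }
```

For example, `extract_component(tensors, "audio_vae.")` would yield only the Audio VAE weights, which could then be passed to `load_state_dict` on a matching module. If the target device or operator does not support BFloat16, each tensor can be upcast first with `value.float()`.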

Source

Extracted from the original ResembleAI/Dramabox checkpoint. Please refer to the original repository for licensing details.