Add model card and metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ license: mit
+ library_name: transformers
+ pipeline_tag: audio-to-audio
+ ---
+
+ # WavCube
+
+ WavCube is a 128-dimensional, 50 Hz continuous representation that unifies speech understanding, reconstruction, and generation in a single space. It was introduced in the paper [WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling](https://huggingface.co/papers/2605.06407).
+
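To make the two figures in the description concrete: at 50 Hz, each second of speech maps to roughly 50 frames of 128 values. The sketch below estimates the feature shape for a clip. The helper name, the frames-first layout, and the 16 kHz sample rate in the example are assumptions made for illustration, and the actual frame count may differ slightly depending on how the model pads its input.

```python
def approx_feature_shape(num_samples: int, sample_rate: int,
                         frame_rate: int = 50, dim: int = 128) -> tuple[int, int]:
    """Estimate (frames, dim) for a clip, ignoring any padding the model applies."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    # Floor of (duration in seconds) * (frames per second).
    frames = num_samples * frame_rate // sample_rate
    return frames, dim


# Hypothetical 3-second clip sampled at 16 kHz (the sample rate is an assumption).
frames, dim = approx_feature_shape(num_samples=48_000, sample_rate=16_000)
print(frames, dim)  # prints "150 128"
```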
+ - **Code:** [GitHub Repository](https://github.com/yanghaha0908/WavCube)
+ - **Paper:** [arXiv:2605.06407](https://arxiv.org/abs/2605.06407)
+
+ ## Usage
+
+ Before using the model, install the requirements described in the [official repository](https://github.com/yanghaha0908/WavCube).
+
+ ### Extract Representation from Speech
+ To extract continuous representations from a raw WAV file, run:
+
+ ```bash
+ python wav_to_feature.py \
+     --audio 19_198_000000_000002.wav \
+     --config configs/WavCube-stage2.yaml \
+     --ckpt WavCube/checkpoints/vocos_checkpoint_epoch=177_step=195000_val_loss=3.3080.ckpt \
+     --output 19_198_000000_000002.pt
+ ```
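For more than one file, the same command can be generated per input. The following is a minimal sketch that only assembles the argument lists, using the flags shown in the command above; the helper name and the example file names are hypothetical, and naming each output after its input with a `.pt` suffix simply mirrors the single-file example.

```python
from pathlib import Path


def build_extract_commands(wav_paths, config, ckpt):
    """Return one wav_to_feature.py argv list per input file.

    Each output path is the input path with its suffix replaced by .pt,
    mirroring the single-file example.
    """
    commands = []
    for wav in map(Path, wav_paths):
        commands.append([
            "python", "wav_to_feature.py",
            "--audio", str(wav),
            "--config", str(config),
            "--ckpt", str(ckpt),
            "--output", str(wav.with_suffix(".pt")),
        ])
    return commands


# Hypothetical input names; config and checkpoint paths are copied from the README command.
cmds = build_extract_commands(
    ["a.wav", "b.wav"],
    config="configs/WavCube-stage2.yaml",
    ckpt="WavCube/checkpoints/vocos_checkpoint_epoch=177_step=195000_val_loss=3.3080.ckpt",
)
for cmd in cmds:
    print(" ".join(cmd))
```

To actually run the extraction, each list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains `wav_to_feature.py`.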
+
+ ### Reconstruct Speech from Representation
+ To reconstruct a waveform from the extracted representations, run:
+
+ ```bash
+ python feature_to_wav.py \
+     --feature 19_198_000000_000002.pt \
+     --config configs/WavCube-stage2.yaml \
+     --ckpt WavCube/checkpoints/vocos_checkpoint_epoch=177_step=195000_val_loss=3.3080.ckpt
+ ```
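The two commands chain together: the file written by `--output` in the extraction step is what `--feature` reads in the reconstruction step. A small sketch of that round trip, again only assembling the argument lists, is given below. The helper name is hypothetical, and because the reconstruction command is shown without an output flag, none is assumed here; where the waveform is written depends on the script.

```python
from pathlib import Path


def round_trip_commands(wav, config, ckpt):
    """Return the extract and reconstruct argv lists for one WAV file, in run order."""
    feature = Path(wav).with_suffix(".pt")
    extract = [
        "python", "wav_to_feature.py",
        "--audio", str(wav),
        "--config", str(config),
        "--ckpt", str(ckpt),
        "--output", str(feature),
    ]
    # feature_to_wav.py is shown without an output flag, so none is passed here.
    reconstruct = [
        "python", "feature_to_wav.py",
        "--feature", str(feature),
        "--config", str(config),
        "--ckpt", str(ckpt),
    ]
    return extract, reconstruct


extract, reconstruct = round_trip_commands(
    "19_198_000000_000002.wav",
    config="configs/WavCube-stage2.yaml",
    ckpt="WavCube/checkpoints/vocos_checkpoint_epoch=177_step=195000_val_loss=3.3080.ckpt",
)
print(" ".join(extract))
print(" ".join(reconstruct))
```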
+
+ ## Citation
+
+ ```bibtex
+ @misc{yang2025wavcube,
+   title={WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling},
+   author={Haohan Yang and others},
+   year={2025},
+   eprint={2605.06407},
+   archivePrefix={arXiv},
+   primaryClass={cs.SD},
+   url={https://arxiv.org/abs/2605.06407},
+ }
+ ```