diegotg343 committed on
Commit d2e9067 · verified · 1 Parent(s): 0733319

Upload 2 files

Files changed (2)
  1. README.md +61 -3
  2. pitchflower_diagram.png +0 -0
README.md CHANGED
@@ -1,3 +1,61 @@
- ---
- license: cc-by-nc-sa-4.0
- ---
+ ---
+ license: cc-by-nc-sa-4.0
+ ---
+
+ # 🌸 PitchFlower
+
+ <p align="left">
+ <a href="https://arxiv.org/abs/2510.25566">
+ <img src="https://img.shields.io/badge/arXiv-PitchFlower-b31b1b?logo=arxiv&logoColor=white" alt="arXiv">
+ </a>
+ <a href="https://github.com/diegotg2000/PitchFlower">
+ <img src="https://img.shields.io/badge/GitHub-PitchFlower-181717?logo=github" alt="GitHub">
+ </a>
+ </p>
+
+ Official pretrained checkpoint for the paper *PitchFlower: A flow-based neural audio codec with pitch controllability*.
+
+ ## 🧠 Overview
+
+ PitchFlower achieves pitch controllability through a perturbation strategy. During training, pitch information is removed from the input by flattening the pitch contour and applying a random shift, while the true pitch is provided explicitly as conditioning. The model is trained on a reconstruction task, so it learns to take pitch from the conditioning signal.
+
+ <p align="center">
+ <img src="pitchflower_diagram.png" alt="PitchFlower architecture" width="600">
+ </p>
+
+ We use an autoencoder with a residual vector quantization (RVQ) bottleneck and a flow-based decoder to produce high-quality audio. More details can be found in the paper.
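The flatten-and-shift perturbation described in the Overview can be illustrated on a toy F0 contour. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `flatten_and_shift`, the `max_shift_semitones` parameter, the use of the geometric mean as the flat target, and the convention that 0 Hz marks an unvoiced frame are all hypothetical choices made for illustration. How the perturbation is actually applied to audio is described in the paper and the GitHub repository.

```python
import math
import random


def flatten_and_shift(f0_hz, max_shift_semitones=12.0, rng=None):
    """Flatten a per-frame F0 contour and shift it by a random interval.

    Hypothetical illustration: voiced frames (F0 > 0) are replaced by one
    constant value, and unvoiced frames (0 Hz) are left untouched.
    """
    if rng is None:
        rng = random.Random()

    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        # Nothing to perturb when every frame is unvoiced.
        return list(f0_hz)

    # Flatten: average in the log-frequency domain (geometric mean),
    # since pitch is perceived on a logarithmic scale.
    log_mean = sum(math.log(f) for f in voiced) / len(voiced)

    # Shift: draw one random offset in semitones for the whole contour.
    shift = rng.uniform(-max_shift_semitones, max_shift_semitones)
    target = math.exp(log_mean) * 2.0 ** (shift / 12.0)

    return [target if f > 0 else 0.0 for f in f0_hz]


if __name__ == "__main__":
    contour = [0.0, 180.0, 200.0, 220.0, 0.0]
    print(flatten_and_shift(contour, rng=random.Random(0)))
```

After this perturbation the contour no longer carries the original melody or register, which is why a model trained to reconstruct the unperturbed audio has to rely on the pitch conditioning it is given.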
+
+ ## 📦 Installation and Usage
+
+ See our GitHub repository for installation and usage instructions: https://github.com/diegotg2000/PitchFlower
+
+ ## 🙌 Acknowledgements
+
+ We would like to acknowledge the repositories from which we drew inspiration and parts of the code:
+
+ - Vocos: https://github.com/gemelo-ai/vocos
+ - WavTokenizer: https://github.com/jishengpeng/WavTokenizer
+ - Encodec: https://github.com/facebookresearch/encodec
+
+ This work was carried out in the [Analysis/Synthesis team of the STMS laboratory](https://www.stms-lab.fr/team/analyse-et-synthese-des-sons/) at IRCAM and was funded by the [ANR project EVA](https://anr.fr/Project-ANR-23-CE23-0018).
+
+ ## 📫 Contact
+
+ For questions or collaboration opportunities, feel free to reach out: dtorres@ircam.fr
+
+ ## 🧩 Citation
+
+ ```bibtex
+ @misc{pitchflower,
+ title={PitchFlower: A flow-based neural audio codec with pitch controllability},
+ author={Diego Torres and Axel Roebel and Nicolas Obin},
+ year={2025},
+ eprint={2510.25566},
+ archivePrefix={arXiv},
+ url={https://arxiv.org/abs/2510.25566},
+ }
+ ```
+
+ ## 📜 License
+
+ This project is licensed under the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.
pitchflower_diagram.png ADDED