Piano Source Separation Model

This repository contains a 17 MB piano separation model and an inference script for running it.

The model takes an audio track as input and outputs the isolated piano.

Examples

Listen to some examples here: https://tjpurdy.github.io/Piano-Separation-Model-small/

Input and output

  • Supported input formats: wav, flac, mp3
  • Supported output formats: wav, flac (--output_format wav / --output_format flac)
  • --input_dir can point to either a single file or a directory containing multiple files
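
The input rules above can be sketched in Python as follows. This is only an illustration, not the code in inference.py: the helper name collect_audio_files, the case-insensitive extension matching, and the non-recursive directory scan are all assumptions.

```python
from pathlib import Path

# Input formats listed in this README.
SUPPORTED_INPUT_EXTENSIONS = {".wav", ".flac", ".mp3"}


def collect_audio_files(input_path):
    """Return the supported audio files that input_path refers to.

    Hypothetical helper: accepts a single file or a directory,
    mirroring how --input_dir is described above.
    """
    path = Path(input_path)
    if path.is_file():
        candidates = [path]
    elif path.is_dir():
        # Assumption: only files directly inside the directory are considered.
        candidates = sorted(p for p in path.iterdir() if p.is_file())
    else:
        raise FileNotFoundError(f"No such file or directory: {path}")
    # Keep only files whose extension is a supported input format.
    return [p for p in candidates if p.suffix.lower() in SUPPORTED_INPUT_EXTENSIONS]
```

Calling collect_audio_files on a folder returns only its wav, flac, and mp3 files, so unrelated files in the same folder are skipped in this sketch.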

Installation

pip install torch einops rotary-embedding-torch numpy soundfile safetensors
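
To confirm the installation before running inference, a small check along these lines may help. It is a sketch that is not part of this repository; the function name find_missing_modules is hypothetical, and the import names are the usual ones for the packages in the pip command.

```python
import importlib.util

# Import names for the packages in the pip command above.
# Note that rotary-embedding-torch is imported as rotary_embedding_torch.
REQUIRED_MODULES = [
    "torch",
    "einops",
    "rotary_embedding_torch",
    "numpy",
    "soundfile",
    "safetensors",
]


def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```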

Usage

Download inference.py, then run the command below with --input_dir set to your input path. The model and config are downloaded automatically.

python inference.py --input_dir 'Insert path to file or directory containing file(s) here'

Extra options

  • --output_dir: where the outputs are saved. Defaults to the same location as --input_dir. Output filenames end in _piano.
  • --checkpoint_path: path to the model checkpoint. If it is not found there, the script downloads it automatically.
  • --config_path: path to config.json. If it is not found there, the script downloads it automatically.
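
The output naming described above can be illustrated with a short sketch. It does not reproduce inference.py: the function piano_output_path is hypothetical, and it assumes that when the input is a single file, the default output location is that file's folder.

```python
from pathlib import Path

# Output formats listed in this README.
SUPPORTED_OUTPUT_FORMATS = {"wav", "flac"}


def piano_output_path(input_file, output_format, output_dir=None):
    """Build the output path for one input file (hypothetical helper).

    The result keeps the input's base name, appends _piano, and uses
    the requested output format as the extension.
    """
    if output_format not in SUPPORTED_OUTPUT_FORMATS:
        raise ValueError(f"Unsupported output format: {output_format}")
    input_file = Path(input_file)
    # Assumption: with no output_dir, write next to the input file.
    target_dir = Path(output_dir) if output_dir is not None else input_file.parent
    return target_dir / f"{input_file.stem}_piano.{output_format}"
```

For example, piano_output_path("songs/track.mp3", "wav") gives songs/track_piano.wav, and passing output_dir="out" gives out/track_piano.wav.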

Notes

  • This model is trained on the standard piano only; it will not work on variants such as the electric piano.
  • The GPU is used automatically if available (3 GB VRAM required); otherwise the CPU is used.
  • The model is trained on 44.1 kHz audio.
  • Processing takes about 1 second per minute of audio on a Google Colab T4.
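
Because the model is trained on 44.1 kHz audio, it can be useful to check an input's sample rate beforehand. This README does not say whether inference.py resamples other rates, so the sketch below only reports whether a file matches. The helper name wav_matches_model_rate is hypothetical, and it relies on Python's standard wave module, which reads PCM WAV files only (not flac or mp3).

```python
import wave

# Sample rate of the training audio, as stated in the notes above.
MODEL_SAMPLE_RATE = 44100


def wav_matches_model_rate(path):
    """Return True if the PCM WAV file at path is sampled at 44.1 kHz."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate() == MODEL_SAMPLE_RATE
```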

Citation

Please cite this repository if you use this model in research or a project.

Credit

  • Wei-Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, Yun-Ning Hung - https://arxiv.org/abs/2309.02612
  • lucidrains - https://github.com/lucidrains/BS-RoFormer

[Figure: training loss]
