keshavbhandari committed on
Commit
5f1b1af
·
verified ·
1 Parent(s): 310acdd

Update README.md

Files changed (1)
  1. README.md +130 -0
README.md CHANGED
@@ -1,3 +1,133 @@
1
  ---
2
+ language:
3
+ - en
4
  license: apache-2.0
5
+ tags:
6
+ - music
7
+ - text-to-music
8
+ - sheet-music
9
+ - pytorch
10
+ datasets:
11
+ - emotionwave-company/text2score
12
  ---
13
+
14
+ # Text2Score: Generating Sheet Music From Textual Prompts
15
+
16
+ [![GitHub Repo](https://img.shields.io/badge/GitHub-Codebase-blue)](https://github.com/keshavbhandari/text2score)
17
+ [![Demo](https://img.shields.io/badge/Demo-Live-brightgreen)](https://keshavbhandari.github.io/portfolio/text2score)
18
+ [![Dataset](https://img.shields.io/badge/Dataset-HuggingFace-yellow)](https://huggingface.co/datasets/emotionwave-company/text2score)
19
+ [![Paper](https://img.shields.io/badge/Paper-Arxiv_TBD-red)](#)
20
+
21
+ This repository hosts the pre-trained model weights for **Text2Score**, a model designed to generate sheet music directly from text prompts.
22
+
23
+ **Note on Usage:** To use this model, you do not need to download these weights manually. The inference scripts in our primary GitHub repository are configured to automatically download this checkpoint the first time you run them.
24
+
25
+ For the full codebase, issue tracking, and detailed system architecture, please visit our [GitHub Repository](https://github.com/keshavbhandari/text2score).
26
+
27
+ ---
28
+
29
+ ## System Overview
30
+ For a high-level view of the model architecture and pipeline, please see the system overview below:
31
+
32
+ ![System Overview](https://raw.githubusercontent.com/keshavbhandari/text2score/main/text2music/artifacts/system_overview.png)
33
+
34
+ ---
35
+
36
+ ## Quick Start & Installation
37
+
38
+ To run inference or train the model, you will need to clone our GitHub repository and set up the environment.
39
+
40
+ **1. Clone the GitHub repository**
41
+ ```bash
42
+ git clone https://github.com/keshavbhandari/text2score.git
43
+ cd text2score
44
+ ```
45
+
46
+ **2. Create and activate a new Conda environment**
47
+ ```bash
48
+ conda create --name text2score python=3.10
49
+ conda activate text2score
50
+ ```
51
+
52
+ **3. Install PyTorch with CUDA support**
53
+ ```bash
54
+ conda install pytorch=2.3.0 pytorch-cuda=11.8 numpy -c pytorch -c nvidia
55
+ ```
56
+
57
+ **4. Install the project and dependencies**
58
+ ```bash
59
+ pip install -e .
60
+ pip install optimum
61
+ ```
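
Once steps 1 to 4 have finished, a quick check can confirm that the environment looks the way the steps above intend. The following is a minimal sketch and is not part of the repository; `check_environment` is a hypothetical helper name. It reports the Python version and, only if PyTorch is importable, the PyTorch version and whether a CUDA device is visible.

```python
import importlib.util
import sys


def check_environment() -> dict:
    """Summarize the interpreter and (if installed) the PyTorch/CUDA setup."""
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch": None,  # stays None when PyTorch is not installed
        "cuda": None,   # stays None when PyTorch is not installed
    }
    # find_spec returns None instead of raising when the package is absent.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = torch.__version__
        report["cuda"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    print(check_environment())
```

If `cuda` is reported as `False`, the later inference commands may fall back to CPU or fail, depending on how the repository's scripts select a device.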
62
+
63
+ ---
64
+
65
+ ## Inference & Usage
66
+
67
+ *All commands below should be executed from within the root `text2score` directory of the cloned GitHub repository.*
68
+
69
+ ### 1. Generating Plans from Prompts
70
+ Before running batch inference, you can generate execution plans from a JSON file of prompts.
71
+ ```bash
72
+ python text2music/inference/generate_plan.py \
73
+ --api_key "XXXX" \
74
+ --input_json text2music/artifacts/evaluation/prompts_with_ids.json \
75
+ --output_json text2music/artifacts/evaluation/prompts_with_plan.json
76
+ ```
77
+
78
+ ### 2. Single Inference
79
+ To generate a score from a single text prompt:
80
+ ```bash
81
+ python text2music/inference/inference.py \
82
+ --user_prompt "A melancholic solo flute melody settling back into D minor, composed as a slow 6/8 barcarolle at 54 BPM." \
83
+ --api_key "XXXX" \
84
+ --remove_prior_outputs False
85
+ ```
86
+ **Important:** Running the script this way requires an active OpenAI API key, because it generates the execution plan on the fly. The script currently defaults to the `GPT-5.1` model for plan generation; you can edit the script to use any other supported model.
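
To avoid pasting the key into your shell history, the command can be assembled from an environment variable instead. This is a sketch under assumptions: the variable name `OPENAI_API_KEY` and the helper names `read_api_key` and `build_inference_command` are illustrative and do not come from the repository. Only the script path and the three flags are taken from the command above.

```python
import os
import sys
from typing import Mapping


def read_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the key stored in OPENAI_API_KEY (an assumed variable name)."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key


def build_inference_command(user_prompt: str, api_key: str) -> list[str]:
    """Build the single-inference command as an argument list (no shell quoting needed)."""
    return [
        sys.executable,
        "text2music/inference/inference.py",
        "--user_prompt", user_prompt,
        "--api_key", api_key,
        "--remove_prior_outputs", "False",
    ]
```

From the repository root, `subprocess.run(build_inference_command(prompt, read_api_key()), check=True)` would then launch the script with the current interpreter.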
87
+
88
+ Alternatively, if you already have a pre-generated plan text file, run:
89
+ ```bash
90
+ python text2music/inference/inference.py \
91
+ --plan_path text2music/artifacts/example_plans/partial_plan.txt \
92
+ --remove_prior_outputs False
93
+ ```
94
+
95
+ *Tip: To generate a plan manually, copy the system prompt from `text2score/text2music/inference/prompt.py`, replace its placeholder text with your music prompt, and paste it into ChatGPT, Gemini, or any other LLM of your choice. Then replace the entire contents of `text2score/text2music/artifacts/example_plans/partial_plan.txt` with the LLM's output and run the command above.*
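
The manual workflow in the tip can also be scripted. The sketch below assumes you already hold the LLM's output as a string; `save_plan` and `build_plan_command` are hypothetical helper names, while the plan path, script path, and flags are the ones shown above. The plan text is written unchanged, since the expected plan format is defined by the repository and not by this sketch.

```python
import sys
from pathlib import Path

# Path used by the pre-generated-plan command above, relative to the repo root.
PLAN_PATH = Path("text2music/artifacts/example_plans/partial_plan.txt")


def save_plan(plan_text: str, plan_path: Path = PLAN_PATH) -> Path:
    """Overwrite the plan file with the LLM output, creating folders if needed."""
    plan_path.parent.mkdir(parents=True, exist_ok=True)
    plan_path.write_text(plan_text, encoding="utf-8")
    return plan_path


def build_plan_command(plan_path: Path = PLAN_PATH) -> list[str]:
    """Build the plan-based inference command as an argument list."""
    return [
        sys.executable,
        "text2music/inference/inference.py",
        "--plan_path", str(plan_path),
        "--remove_prior_outputs", "False",
    ]
```

Run from the repository root, `subprocess.run(build_plan_command(save_plan(llm_output)), check=True)` would save the plan and start inference in one step.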
96
+
97
+ ### 3. Batch Inference
98
+ Run inference across multiple prompts using `accelerate`:
99
+ ```bash
100
+ accelerate launch --num_processes 1 text2music/inference/run_inference.py \
101
+ --prompt_path_json text2music/artifacts/evaluation/prompts_with_plan.json \
102
+ --output_folder ./outputs/
103
+ ```
104
+
105
+ ---
106
+
107
+ ## Data Conversion (ABC to XML & MIDI)
108
+
109
+ Once you have generated ABC files, you can batch convert them into standard XML and MIDI formats.
110
+
111
+ ```bash
112
+ # Create XML files
113
+ python text2music/data/batch_abci2xml.py --root_folder ./outputs/
114
+
115
+ # Create MIDI files
116
+ python text2music/data/utils/xml2mid.py --data_dir ./outputs/
117
+ ```
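
The two conversion commands can be chained so that MIDI export only runs if the XML step succeeds. This is a minimal sketch, not part of the repository: `conversion_commands` and `convert_outputs` are hypothetical names, and the ordering assumes (as the script name `xml2mid.py` suggests) that the MIDI step reads the XML files produced by the first step.

```python
import subprocess
import sys


def conversion_commands(output_dir: str = "./outputs/") -> list[list[str]]:
    """Return the two conversion commands in the order they should run."""
    return [
        # Step 1: ABC -> XML
        [sys.executable, "text2music/data/batch_abci2xml.py", "--root_folder", output_dir],
        # Step 2: XML -> MIDI
        [sys.executable, "text2music/data/utils/xml2mid.py", "--data_dir", output_dir],
    ]


def convert_outputs(output_dir: str = "./outputs/") -> None:
    """Run both steps from the repo root, stopping at the first failure."""
    for cmd in conversion_commands(output_dir):
        # check=True raises CalledProcessError if a script exits non-zero.
        subprocess.run(cmd, check=True)
```

Calling `convert_outputs("./outputs/")` from the repository root is then equivalent to running the two commands above one after the other.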
118
+
119
+ ---
120
+
121
+ ## Citation
122
+
123
+ If you find this model or repository useful in your research, please consider citing our work:
124
+
125
+ ```bibtex
126
+ @article{bhandari2025text2score,
127
+ title = {Text2Score: Generating Sheet Music from Textual Prompts},
128
+ author = {Bhandari, Keshav and Chang, Sungkyun and Roy, Abhinaba and Ronchini, Francesca and Benetos, Emmanouil and Herremans, Dorien and Colton, Simon},
129
+ journal = {arXiv preprint},
130
+ year = {2025},
131
+ note = {arXiv link coming soon}
132
+ }
133
+ ```