Philly123ez or4cl3ai committed on
Commit 154565d · 0 Parent(s)

Duplicate from or4cl3ai/SoundSlayerAI


Co-authored-by: Dustin Groves <or4cl3ai@users.noreply.huggingface.co>

Files changed (4)
  1. .gitattributes +35 -0
  2. README.md +167 -0
  3. config.json +59 -0
  4. zero-shot_generated_datasets +47 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,167 @@
+ ---
+ license: openrail
+ datasets:
+ - Fhrozen/AudioSet2K22
+ - Chr0my/Epidemic_sounds
+ - ChristophSchuhmann/lyrics-index
+ - Cropinky/rap_lyrics_english
+ - tsterbak/eurovision-lyrics-1956-2023
+ - brunokreiner/genius-lyrics
+ - google/MusicCaps
+ - ccmusic-database/music_genre
+ - Hyeon2/riffusion-musiccaps-dataset
+ - SamAct/autotrain-data-musicprompt
+ - Chr0my/Epidemic_music
+ - juliensimon/autonlp-data-song-lyrics
+ - Datatang/North_American_English_Speech_Data_by_Mobile_Phone_and_PC
+ - Chr0my/freesound.org
+ - teticio/audio-diffusion-256
+ - KELONMYOSA/dusha_emotion_audio
+ - Ar4ikov/iemocap_audio_text_splitted
+ - flexthink/ljspeech
+ - mozilla-foundation/common_voice_13_0
+ - facebook/voxpopuli
+ - SocialGrep/one-million-reddit-jokes
+ - breadlicker45/human-midi-rlhf
+ - breadlicker45/midi-gpt-music-small
+ - projectlosangeles/Los-Angeles-MIDI-Dataset
+ - huggingartists/epic-rap-battles-of-history
+ - SocialGrep/one-million-reddit-confessions
+ - shahules786/prosocial-nsfw-reddit
+ - Thewillonline/reddit-sarcasm
+ - autoevaluate/autoeval-eval-futin__guess-vi-4200fb-2012366606
+ - lmsys/chatbot_arena_conversations
+ - mozilla-foundation/common_voice_11_0
+ - mozilla-foundation/common_voice_4_0
+ - dell-research-harvard/AmericanStories
+ - zZWipeoutZz/insane_style
+ - mu-llama/MusicQA
+ - RaphaelOlivier/whisper_adversarial_examples
+ - huggingartists/metallica
+ - vldsavelyev/guitar_tab
+ - NLPCoreTeam/humaneval_ru
+ - seungheondoh/audioset-music
+ - gary109/onset-singing3_corpora_parliament_processed_MIR-ST500
+ - LDD5522/Rock_Vocals
+ - huggingartists/rage-against-the-machine
+ - huggingartists/chester-bennington
+ - huggingartists/logic
+ - cmsolson75/artist_song_lyric_dataset
+ - BhavyaMuni/artist-lyrics
+ - vjain/emotional_intelligence
+ - mhenrichsen/context-aware-splits
+ metrics:
+ - accuracy
+ - bertscore
+ - bleu
+ - bleurt
+ - brier_score
+ - character
+ - chrf
+ language:
+ - en
+ - es
+ - it
+ - pt
+ - la
+ - fr
+ - ru
+ - zh
+ - ja
+ - el
+ library_name: transformers
+ tags:
+ - music
+ pipeline_tag: text-to-speech
+ ---
+ # SoundSlayerAI
+
+ SoundSlayerAI is an innovative project focused on music-related tasks. It aims to provide a range of functionalities for audio analysis and processing, making it easier to work with music datasets.
+
+ ## Datasets
+
+ SoundSlayerAI makes use of the following datasets:
+
+ - Fhrozen/AudioSet2K22
+ - Chr0my/Epidemic_sounds
+ - ChristophSchuhmann/lyrics-index
+ - Cropinky/rap_lyrics_english
+ - tsterbak/eurovision-lyrics-1956-2023
+ - brunokreiner/genius-lyrics
+ - google/MusicCaps
+ - ccmusic-database/music_genre
+ - Hyeon2/riffusion-musiccaps-dataset
+ - SamAct/autotrain-data-musicprompt
+ - Chr0my/Epidemic_music
+ - juliensimon/autonlp-data-song-lyrics
+ - Datatang/North_American_English_Speech_Data_by_Mobile_Phone_and_PC
+ - Chr0my/freesound.org
+ - teticio/audio-diffusion-256
+ - KELONMYOSA/dusha_emotion_audio
+ - Ar4ikov/iemocap_audio_text_splitted
+ - flexthink/ljspeech
+ - mozilla-foundation/common_voice_13_0
+ - facebook/voxpopuli
+ - SocialGrep/one-million-reddit-jokes
+ - breadlicker45/human-midi-rlhf
+ - breadlicker45/midi-gpt-music-small
+ - projectlosangeles/Los-Angeles-MIDI-Dataset
+ - huggingartists/epic-rap-battles-of-history
+ - SocialGrep/one-million-reddit-confessions
+ - shahules786/prosocial-nsfw-reddit
+ - Thewillonline/reddit-sarcasm
+ - autoevaluate/autoeval-eval-futin__guess-vi-4200fb-2012366606
+ - lmsys/chatbot_arena_conversations
+ - mozilla-foundation/common_voice_11_0
+ - mozilla-foundation/common_voice_4_0
+
+ ## Library
+
+ The core library used in this project is pyannote.audio. It provides a comprehensive set of tools and algorithms for audio analysis and processing tasks such as audio segmentation, speaker diarization, and voice activity detection, which makes it a good fit for working with music datasets.
+
+ ## Metrics
+
+ To evaluate the performance of SoundSlayerAI, several metrics are employed, including:
+
+ - Accuracy
+ - BERTScore
+ - BLEU
+ - BLEURT
+ - Brier score
+ - Character
+
+ These metrics help assess the effectiveness and accuracy of the implemented algorithms and models.
+
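For illustration, the two simplest of these metrics can be computed in a few lines of plain Python. This is a hedged sketch of the metric definitions themselves, not the project's actual evaluation harness:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities
    and the binary outcomes (lower is better)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

# Three binary predictions, two of them correct:
print(accuracy([1, 0, 1], [1, 1, 1]))
# Probabilistic predictions scored against binary outcomes:
print(brier_score([1, 0, 1], [0.9, 0.1, 0.8]))
```

Metrics such as BLEU, BLEURT, and BERTScore are substantially more involved and are best taken from an existing library (for example, the Hugging Face `evaluate` package).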
+ ## Language
+
+ The SoundSlayerAI project primarily targets English: the datasets and models used here are optimized for English audio and text analysis tasks, although the model card metadata lists several additional languages.
+
+ ## Usage
+
+ To use SoundSlayerAI, follow these steps:
+
+ 1. Install the required dependencies by running `pip install pyannote.audio`.
+
+ 2. Import the necessary modules from the `pyannote.audio` package to access the desired functionality.
+
+ 3. Load your own audio data, or use the provided datasets, to perform tasks such as audio segmentation and speaker diarization.
+
+ 4. Apply the appropriate algorithms and models from the `pyannote.audio` library to process and analyze the audio data.
+
+ 5. Evaluate the results using the specified metrics, such as accuracy, BERTScore, BLEU, BLEURT, Brier score, and Character.
+
+ 6. Iterate and refine your approach to achieve the desired outcomes for your music-related tasks.
+
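The steps above can be sketched in code. This is a minimal, illustrative example, not an official SoundSlayerAI API: the pipeline name `pyannote/speaker-diarization`, the file name, and the helper functions are assumptions for the sketch, and a Hugging Face access token may be needed to download the pretrained pipeline.

```python
def run_diarization(audio_path, model="pyannote/speaker-diarization"):
    """Steps 1-4: load a pretrained pyannote.audio pipeline, apply it to one
    audio file, and return the result as (start, end, speaker) tuples."""
    from pyannote.audio import Pipeline  # requires `pip install pyannote.audio`
    pipeline = Pipeline.from_pretrained(model)
    diarization = pipeline(audio_path)
    return [(turn.start, turn.end, label)
            for turn, _, label in diarization.itertracks(yield_label=True)]

def speech_time_per_speaker(segments):
    """Step 5 helper: aggregate segments into total speech time per speaker,
    which can then be compared against reference annotations."""
    totals = {}
    for start, end, speaker in segments:
        totals[speaker] = totals.get(speaker, 0.0) + (end - start)
    return totals
```

For example, `speech_time_per_speaker(run_diarization("audio.wav"))` would return a dictionary mapping each detected speaker label to that speaker's total speech time in seconds.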
+ ## License
+
+ SoundSlayerAI is released under the OpenRAIL license. Please refer to the LICENSE file for more details.
+
+ ## Contributions
+
+ Contributions to SoundSlayerAI are welcome! If you have any ideas, bug fixes, or enhancements, feel free to submit a pull request or open an issue on the GitHub repository.
+
+ ## Contact
+
+ For any inquiries or questions regarding SoundSlayerAI, please reach out to the project maintainer at [insert email address].
+
+ Thank you for your interest in SoundSlayerAI!
config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "name": "SoundSlayerAI",
+   "description": "An innovative project for music-related tasks utilizing the pyannote-audio library",
+   "datasets": [
+     "Fhrozen/AudioSet2K22",
+     "Chr0my/Epidemic_sounds",
+     "ChristophSchuhmann/lyrics-index",
+     "Cropinky/rap_lyrics_english",
+     "tsterbak/eurovision-lyrics-1956-2023",
+     "brunokreiner/genius-lyrics",
+     "google/MusicCaps",
+     "ccmusic-database/music_genre",
+     "Hyeon2/riffusion-musiccaps-dataset",
+     "SamAct/autotrain-data-musicprompt",
+     "Chr0my/Epidemic_music",
+     "juliensimon/autonlp-data-song-lyrics",
+     "Datatang/North_American_English_Speech_Data_by_Mobile_Phone_and_PC",
+     "Chr0my/freesound.org",
+     "teticio/audio-diffusion-256",
+     "KELONMYOSA/dusha_emotion_audio",
+     "Ar4ikov/iemocap_audio_text_splitted",
+     "flexthink/ljspeech",
+     "mozilla-foundation/common_voice_13_0",
+     "facebook/voxpopuli",
+     "SocialGrep/one-million-reddit-jokes",
+     "breadlicker45/human-midi-rlhf",
+     "breadlicker45/midi-gpt-music-small",
+     "projectlosangeles/Los-Angeles-MIDI-Dataset",
+     "huggingartists/epic-rap-battles-of-history",
+     "SocialGrep/one-million-reddit-confessions",
+     "shahules786/prosocial-nsfw-reddit",
+     "Thewillonline/reddit-sarcasm",
+     "autoevaluate/autoeval-eval-futin__guess-vi-4200fb-2012366606",
+     "lmsys/chatbot_arena_conversations",
+     "mozilla-foundation/common_voice_11_0",
+     "mozilla-foundation/common_voice_4_0"
+   ],
+   "library": "pyannote-audio",
+   "metrics": [
+     "accuracy",
+     "bertscore",
+     "BLEU",
+     "BLEURT",
+     "brier_score",
+     "character"
+   ],
+   "language": "English",
+   "usage": [
+     "Install the required dependencies by running pip install pyannote.audio.",
+     "Import the necessary modules from the 'pyannote.audio' package to access the desired functionality.",
+     "Load the audio data or use the provided datasets to perform tasks such as audio segmentation and speaker diarization.",
+     "Apply the appropriate algorithms and models from the 'pyannote.audio' library to process and analyze the audio data.",
+     "Evaluate the results using the specified metrics, such as accuracy, BERTScore, BLEU, BLEURT, Brier score, and Character.",
+     "Iterate and refine your approach to achieve the desired outcomes for your music-related tasks."
+   ],
+   "license": "openrail",
+   "contributions": "Contributions to SoundSlayerAI are welcome! If you have any ideas, bug fixes, or enhancements, feel free to submit a pull request or open an issue on the GitHub repository.",
+   "contact": "or4cl3ai@gmail.com"
+ }
zero-shot_generated_datasets ADDED
@@ -0,0 +1,47 @@
+ type: collective_task
+ dataset_splits: ['train', 'dev']
+ tasks:
+ - name: zero_shot_translation
+   pipeline_labels: [pypeline@tensorflow]
+   task_labels: [translate]
+   inputs:
+   - type: text
+     format: json
+     prompt: Free-form text, no formatting restrictions
+     expected_input_types: ["text"]
+     examples: {"en": "<UNSAFE>Hello world</UNSAFE>", "de": "<UNSAFE>Hallo Welt</UNSAFE>"}
+   outputs:
+   - type: text
+     format: json
+     prompt: Free-form text, no formatting restrictions
+     expected_output_types: ["text"]
+     examples: {"en": "<UNSAFE>I am a large language model.</UNSAFE>", "de": "<UNSAFE>Ich bin ein grosses Sprachmodell.</UNSAFE>"}
+   pipeline_params: {}
+ - name: text_to_speech
+   pipeline_labels: [pypeline@transformerxlsp]
+   task_labels: [tts]
+   inputs:
+   - type: text
+     format: json
+     prompt: Markdown, HTML, Unicode, or LaTeX, but avoid complex math notation
+     example: a Markdown post, e.g. title "Hello World!" with an *italicized* body
+     metadata: {'tags': 'ROCK MUSIC'}
+     expected_input_types: ["text"]
+     examples: {"en": "<UNSAFE><h1>Hello, TTS Engine! It works!</h1></UNSAFE>", "de": "<UNSAFE><h1>Hallo, Synthetische Stimme!</h1></UNSAFE>"}
+   outputs:
+   - type: audio
+     format: wav, opus, m4a
+     bitrate: 64kbps+
+     channel_count: 1
+     sample_rate: 22kHz+
+     rate: monophonic
+     pitch_range: 0.5-4 octaves
+     speed_range: +/- 5%
+     vibrato_depth: maximum of 3 semitones
+     dynamics_range: ppp-fff
+     silence_padding: >=8ms
+     prompt: Melodies, up to two verses per submission, please separate with commas. Monophony encouraged, unless improvisational techniques warrant chord progressions. Examples in EN, DE, ES, FR: {"en": "[0.7, 1, Eb4], 'Mary had a little lamb',[0.9, 1, Ab3,'Twinkle twinkle