suriyagunasekar/midjourney-prompts-v2


Midjourney Text-to-Image Prompts Dataset

This dataset contains cleaned, deduplicated Midjourney text prompts extracted from Discord messages.

Dataset Structure

The dataset contains three splits:

train: 53,183 unique prompts (80%)
validation: 6,648 unique prompts (10%)
test: 6,648 unique prompts (10%)
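
The split sizes above sum to 66,479 unique prompts. The card does not say how the split was computed; the sketch below is one rounding scheme, assumed here for illustration, that happens to reproduce the listed counts from that total. The function name `split_sizes` is hypothetical.

```python
from typing import Tuple


def split_sizes(total: int) -> Tuple[int, int, int]:
    """Return (train, validation, test) sizes for an 80/10/10 split.

    Assumption: train is floored to 80% and the remainder is halved.
    The dataset card does not document the actual procedure.
    """
    train = total * 8 // 10              # floor of 80%, integer arithmetic
    validation = (total - train) // 2    # half of what remains
    test = total - train - validation    # the rest
    return train, validation, test


total = 53_183 + 6_648 + 6_648  # counts listed for the three splits
print(total, split_sizes(total))
```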

Data Format

Each line of a split is a JSON object with a single field:

text: The Midjourney prompt text
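
A minimal sketch of reading this format with the Python standard library. The helper name `read_prompts`, the sample records, and the file name in the comment are illustrative assumptions, not part of the dataset.

```python
import json
from typing import Iterable, Iterator


def read_prompts(lines: Iterable[str]) -> Iterator[str]:
    """Yield the `text` field from each non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        record = json.loads(line)
        yield record["text"]


# Made-up records in the documented shape, for illustration only.
sample = [
    '{"text": "a lighthouse at dusk, oil painting --ar 16:9"}',
    '{"text": "isometric city block, pastel colors"}',
]
prompts = list(read_prompts(sample))
print(len(prompts), prompts[0])

# With a local file (hypothetical name), pass the file object directly:
# with open("train.jsonl", encoding="utf-8") as f:
#     prompts = list(read_prompts(f))
```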

Extraction Logic

Prompts were extracted from Discord messages using the following process:

Filtered for message types 0 (INITIAL_OR_VARIATION) and 19 (UPSCALE)
Extracted text between double asterisks (**)
Removed embedded image URLs (e.g., https://s.mj.run/...)
Removed duplicates to ensure unique prompts
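
The steps above can be sketched roughly as follows. This is not the original extraction script. It assumes each message is a dict with integer `type` and string `content` keys, takes only the first bold span per message, strips any http(s) URL, and collapses whitespace. All function names are hypothetical.

```python
import re
from typing import Iterable, List, Optional

PROMPT_TYPES = {0, 19}  # the two message types named in the card
BOLD_RE = re.compile(r"\*\*(.+?)\*\*", re.DOTALL)  # text between ** and **
URL_RE = re.compile(r"<?https?://\S+")  # assumption: drop every http(s) URL


def extract_prompt(message: dict) -> Optional[str]:
    """Return the cleaned prompt for one message, or None if there is none."""
    if message.get("type") not in PROMPT_TYPES:
        return None
    match = BOLD_RE.search(message.get("content", ""))
    if match is None:
        return None
    text = URL_RE.sub(" ", match.group(1))
    text = " ".join(text.split())  # collapse whitespace left by removed URLs
    return text or None


def extract_unique_prompts(messages: Iterable[dict]) -> List[str]:
    """Extract prompts and drop duplicates, keeping first-seen order."""
    seen = set()
    unique = []
    for message in messages:
        prompt = extract_prompt(message)
        if prompt is not None and prompt not in seen:
            seen.add(prompt)
            unique.append(prompt)
    return unique


# Made-up messages for illustration only.
messages = [
    {"type": 0, "content": "**<https://s.mj.run/abc> a cat --v 5** - @user (fast)"},
    {"type": 19, "content": "**a cat --v 5** - Upscaled by @user"},
    {"type": 20, "content": "**ignored because of its type**"},
    {"type": 0, "content": "no bold text here"},
]
print(extract_unique_prompts(messages))
```

Keeping first-seen order during deduplication makes the output reproducible for a given input order, which a plain `set` would not guarantee.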

Use Case

This dataset is suitable for fine-tuning language models on prompt generation or other NLP tasks related to text-to-image prompts.
