# Qwen3-4B Deforum Prompt LoRA v3

Status: Experimental / Alpha. Known quality issues; see Known Issues below.

A QLoRA fine-tune of Qwen3-4B-Instruct-2507 for generating cinematic video diffusion prompts in the De Forum Art Film aesthetic. Intended for use with ComfyUI, Deforum, LTX-Video, and WanVideo pipelines.

Trained on deforum-prompt-lora-dataset-v3.1 (2,831 rows, a cleaned subset of the v2 data).
## Known Issues
This model produces repetitive, verbose, and sometimes incoherent output.
Root cause is dataset quality, not training configuration. The v3.1 dataset inherits broken template patterns from earlier dataset versions:
| Issue | Description |
|---|---|
| Looping output | Same phrases repeat 3+ times within a single generation |
| Meta-text leakage | Outputs contain "Certainly. Here's...", "Here's a streamlined version..." |
| Parameter leakage | "Technical Parameters:" blocks and negative-prompt lists bleed through from training data |
| Not cinematic | Produces prose descriptions instead of camera/lighting/mood prompts |
| Limited vocabulary | Over-relies on "chiaroscuro", "contemplative", "stark against" |
Training metrics look good (eval_loss 0.075, 97.6% token accuracy), but this only means the model has faithfully memorized the flawed training data.
Next step: a v4 dataset is in development, using Ollama synthesis from diverse source material to generate all training responses from scratch.
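Until a v4 retrain lands, generations can be screened automatically for the failure modes in the table above. A minimal sketch, assuming nothing beyond the standard library (`flag_issues` is a hypothetical helper, not shipped with the model):

```python
import re
from collections import Counter

# Meta-text patterns observed leaking from the training data (see issue table).
META_PATTERNS = [r"^Certainly\.", r"Here's a streamlined version", r"Technical Parameters:"]

def flag_issues(text: str, loop_threshold: int = 3) -> list[str]:
    """Flag meta-text leakage and phrase looping in a single generation (illustrative)."""
    issues = []
    if any(re.search(p, text) for p in META_PATTERNS):
        issues.append("meta-text")
    # Crude loop check: any 4-gram repeated loop_threshold or more times.
    words = text.lower().split()
    ngrams = Counter(tuple(words[i:i + 4]) for i in range(len(words) - 3))
    if ngrams and max(ngrams.values()) >= loop_threshold:
        issues.append("looping")
    return issues
```

A generation that trips either check can be discarded and re-sampled rather than post-edited.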
## Intended Use
Recommended: use via Ollama with a custom Modelfile that constrains the output format and adds stop tokens for meta-text.

Not recommended: the raw transformers pipeline; the adapter requires careful system prompting plus stop tokens to produce usable output.
## Usage with Ollama
```shell
# After GGUF conversion, create a Modelfile:
cat > Modelfile.deforum-v3 << 'MODELFILE'
FROM ./qwen3-4b-deforum-q8-v3.1.gguf

SYSTEM """You are a cinematic video prompt generator specializing in the De Forum Art Film aesthetic.
Generate prompts with: camera movement, subject, lighting, mood.
40-80 words. NO meta-text. NO Technical Parameters. NO negative prompts."""

PARAMETER temperature 0.7
PARAMETER top_p 0.85
PARAMETER num_predict 100
PARAMETER repeat_penalty 1.5
PARAMETER stop "<|im_end|>"
PARAMETER stop "Technical Parameters:"
PARAMETER stop "Certainly."
MODELFILE

# Create and run
ollama create qwen3-deforum-v3 -f Modelfile.deforum-v3
ollama run qwen3-deforum-v3 "Scene 1: Sarah alone in her studio, slow push-in on face, noir contemplative"
```
## Usage with Transformers (not recommended)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "Limbicnation/qwen3-4b-deforum-prompt-lora-v3")
tokenizer = AutoTokenizer.from_pretrained("Limbicnation/qwen3-4b-deforum-prompt-lora-v3")

messages = [
    {"role": "system", "content": "You are a cinematic video prompt generator specializing in the De Forum Art Film aesthetic."},
    {"role": "user", "content": "Generate a cinematic prompt: rain-soaked alleyway at night, slow tracking shot, noir atmosphere"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=150, repetition_penalty=1.5)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
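Because the adapter leaks the same meta-text regardless of backend, the stop tokens from the Ollama Modelfile can be mirrored as post-processing when generating with transformers. A minimal sketch (`clean_prompt` is a hypothetical helper, not part of any library):

```python
# Mirrors the Ollama PARAMETER stop values as plain string truncation.
STOP_STRINGS = ["Technical Parameters:", "Certainly."]

def clean_prompt(raw: str) -> str:
    """Truncate at the first stop string and collapse whitespace (illustrative)."""
    for stop in STOP_STRINGS:
        idx = raw.find(stop)
        if idx != -1:
            raw = raw[:idx]
    return " ".join(raw.split())
```

An empty result (e.g. when the generation opens with "Certainly.") signals the sample should be regenerated.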
## Training Details
### Training Configuration
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3-4B-Instruct-2507 |
| Method | SFT (Supervised Fine-Tuning) via TRL |
| Quantization | QLoRA (NF4 4-bit, bf16 compute) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| LoRA targets | q_proj, k_proj, v_proj, o_proj |
| Learning rate | 2e-4 (cosine schedule) |
| Warmup | 3% of steps |
| Epochs | 3 |
| Batch size | 2 per device × 8 gradient accumulation = 16 effective |
| Sequence length | 512 tokens |
| Packing | Enabled |
| Optimizer | paged_adamw_8bit |
| Precision | bf16 |
| Gradient checkpointing | Enabled |
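The hyperparameters above can be expressed as a configuration sketch. This is an illustrative fragment, not the actual training script: it assumes peft, trl, and bitsandbytes are installed, and some field names (e.g. `max_length` vs. the older `max_seq_length`) vary across TRL versions.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig

# QLoRA quantization: NF4 4-bit weights, bf16 compute.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter on the attention projections only.
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# SFT run matching the reported schedule and batch setup.
sft = SFTConfig(
    output_dir="outputs/deforum-v3",          # hypothetical path
    learning_rate=2e-4, lr_scheduler_type="cosine", warmup_ratio=0.03,
    num_train_epochs=3, per_device_train_batch_size=2,
    gradient_accumulation_steps=8,            # 2 x 8 = 16 effective
    max_length=512, packing=True,
    optim="paged_adamw_8bit", bf16=True, gradient_checkpointing=True,
)
```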
### Training Data
| Dataset | Rows | Description |
|---|---|---|
| deforum-prompt-lora-dataset-v3.1 | 2,831 | Cleaned v2 data (Sarah/chiaroscuro diversified), 3 tiers: short, medium, detailed |
Data lineage: v1 (4,860 rows, single-narrative scene contexts) -> v2 (reformatted with verbose templates, 191-265 words) -> v3.1 (cleaned v2, replaced repetitive patterns). Despite cleaning, v3.1 responses still inherit the broken template structures from v2.
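One way to quantify the residual repetition is to measure what fraction of dataset rows contain each over-used descriptor. A minimal sketch using only the standard library (`phrase_coverage` is a hypothetical helper, not part of the training repo):

```python
from collections import Counter

# Descriptors flagged as over-represented in the Known Issues section.
FLAGGED = ["chiaroscuro", "contemplative", "stark against"]

def phrase_coverage(rows: list[str]) -> dict[str, float]:
    """Fraction of rows containing each flagged phrase (case-insensitive)."""
    counts = Counter()
    for text in rows:
        low = text.lower()
        for phrase in FLAGGED:
            if phrase in low:
                counts[phrase] += 1
    return {p: counts[p] / len(rows) for p in FLAGGED}
```

Running this over a candidate dataset before training gives a cheap regression check that cleaning actually diversified the vocabulary.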
## Evaluation Results
| Metric | Value |
|---|---|
| Final eval_loss | 0.0755 |
| Final train_loss | 0.318 |
| Final token accuracy | 97.6% |
| Training time | ~24 min (1428s) |
| Training steps | ~414 (3 epochs) |
Training curve highlights:
| Epoch | Train Loss | Eval Loss | Token Accuracy |
|---|---|---|---|
| 0.18 | 1.19 | 1.112 | 79.2% |
| 0.36 | 0.42 | 0.468 | 90.2% |
| 0.72 | 0.14 | 0.148 | 96.2% |
| 1.09 | 0.10 | 0.103 | 97.0% |
| 1.45 | 0.08 | 0.088 | 97.4% |
| 1.99 | 0.08 | 0.080 | 97.5% |
| 2.54 | 0.07 | 0.076 | 97.6% |
| 2.90 | 0.07 | 0.076 | 97.6% |
Early stopping was configured (patience=3, eval every 25 steps) but did not trigger: eval loss plateaued around epoch 2 yet continued to decrease marginally, indicating the model had enough capacity to fully fit the training data.
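As a sanity check, the reported numbers are mutually consistent. This is approximate arithmetic only, since packing changes the per-epoch sequence count:

```python
# Consistency check on the reported training numbers (approximate).
steps = 414               # reported total optimizer steps
epochs = 3
effective_batch = 2 * 8   # per-device batch x gradient accumulation
throughput = 0.29         # reported steps/second

# Wall clock implied by throughput; matches the reported ~1428 s run.
wall_clock = steps / throughput

# Packed 512-token sequences consumed per epoch.
seqs_per_epoch = steps / epochs * effective_batch
```

The roughly 2,208 packed sequences per epoch is below the 2,831 raw rows, consistent with packing concatenating short examples into 512-token sequences.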
## Hardware
- GPU: NVIDIA RTX 4090 (24 GB VRAM)
- Training time: ~24 minutes
- Steps/second: 0.29
## Monitoring

Training was tracked with Weights & Biases (run `hld24rpy`).
## Framework Versions
| Component | Version |
|---|---|
| TRL | 0.27.1 |
| Transformers | 4.57.6 |
| PyTorch | 2.6.0+cu124 |
| PEFT | 0.18.1 |
| Datasets | 4.5.0 |
| Tokenizers | 0.22.2 |
## Limitations

- Dataset-driven memorization: the high accuracy reflects memorization of flawed patterns, not genuine cinematic prompt generation capability
- Repetitive output: generations frequently loop the same phrases due to repetitive training data
- Meta-text contamination: the model outputs conversational meta-text ("Certainly", "Here's") and negative-prompt lists that were present in the training data
- Narrow vocabulary: over-reliance on a small set of lighting/mood descriptors inherited from the single-source v1 dataset
- Requires post-processing: an aggressive `repeat_penalty` (1.3-1.5) and stop tokens are needed to get usable output
## Model Card Contact
- Author: Limbicnation
- Repository: prompt-lora-trainer
## Citation

```bibtex
@misc{limbicnation2026deforum,
  title  = {Qwen3-4B Deforum Prompt LoRA v3},
  author = {Limbicnation},
  year   = {2026},
  url    = {https://huggingface.co/Limbicnation/qwen3-4b-deforum-prompt-lora-v3}
}
```