Will prompt adherence include separate character tagging?
I have been testing prompts that contain several characters with different attributes, but most of the time the model puts the attributes on the wrong character and falls back to stereotypes. For instance, if you put "1boy with long pink hair, 1girl with short black hair", the boy gets the short black hair and vice versa. FLUX Klein seems to be able to keep the two apart, but SDXL does not.
Adding quotes around each character's description seems to help, but with anything more than a basic prompt it goes back to the stereotypes.
Here is the prompt that works:
masterpiece, best quality, score_7, safe,
"1girl with black short hair, small_breasts, blue_dress,"
"1boy with long pink hair, blue shirt, black pants,"
(https://cdn-uploads.huggingface.co/production/uploads/6331731e5102481d8dcf0451/cFJ-cgsjt47ngW4Vf7Eg9.png)
But when you start to add details, it starts to mix up the characters' attributes.
My hope is that this will be trained further so we can do things like specify characteristics for separate people, their position on the canvas (girl on the left), etc.
I think that would make it a big improvement over the Illustrious and NoobAI models, which have nothing close to this level of prompt adherence for anime models.
I really hope you consider my suggestion, as I think it would be a huge step forward for anime models. If you need any clarification, feel free to ask; I know I am not the best writer.
If anyone has anything to discuss on this topic that would be great.
Thank you.
Yeah, this problem is similar to the one in this thread: https://huggingface.co/circlestone-labs/Anima/discussions/93
I tried both the advice recommended there and the official docs, but couldn't make it work no matter what. For example:
"masterpiece, best quality, highres, absurdres, anime screenshot, official art, score_7,
beach, upper body, straight-on,
3girls, the image depicts 3 girls at the beach.
left side of the image is cinderella (nikke), blue shirt, green jeans, white hair, red eyes, hair over one eye, smiling, holding microphone, microphone.
middle side of the image is yixuan (zenless zone zero), school uniform, pink skirt, white shirt, white hair, hair ornament, yellow eyes, angry, holding bottle, water bottle.
right side of the image is eri (blue archive), yellow tank top, red short, yellow eyes, white hair, side braid, confused, holding gun, witch hat."
This gives me this kind of result (I tried with Danbooru tags, natural language, and Danbooru tags + natural language).
Once I try characters that have only around 130 Danbooru images, I can't even make them appear anymore with 3+ characters, even with all their tags.
Some tests have shown that line breaks ruin prompt adherence, with the model even forgetting who Hatsune Miku is. Newlines are fine in a pure natural-language section, but whenever tags are involved they ruin adherence. Good luck, maybe that helps.
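If you want to try that on an existing multi-line prompt, here is a minimal sketch of how you could flatten it into a single line before sending it to the model. The function name `flatten_prompt` is made up for illustration and is not part of any tool mentioned here, and whether removing line breaks actually improves adherence is only what the tests above suggest.

```python
def flatten_prompt(prompt: str) -> str:
    """Collapse a multi-line prompt into one line (hypothetical helper)."""
    pieces = []
    for line in prompt.splitlines():
        # Trim whitespace and any trailing comma left at the end of a line.
        piece = line.strip().rstrip(",").strip()
        if piece:  # skip blank lines
            pieces.append(piece)

    out = ""
    for piece in pieces:
        if not out:
            out = piece
        elif out.endswith((".", "!", "?")):
            # Previous line ended a sentence: join with a space only.
            out += " " + piece
        else:
            # Previous line was a tag list: join with a comma.
            out += ", " + piece
    return out


if __name__ == "__main__":
    example = "masterpiece, best quality,\nbeach, upper body,\n\n3girls"
    print(flatten_prompt(example))
```

Tag lines get joined with commas, and a line that ends a sentence is followed by a space instead, so mixed tag and natural-language prompts stay readable.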
Thanks, it did help. It's still bleeding a lot, but that's already somewhat better (though for whatever reason it doesn't want to make Eri appear).
I did some tests with 4 characters, and only added this at the end:
"carlotta (wuthering waves), grey eyes, white hair, long hair, farthest right side of the image is carlotta, she is behind eri, she is wearing a black cheerleader outfit and holding a white microphone, she is dancing," while changing 3girls to 4girls.
I also did another test with a prompt passed through an LLM to fit Qwen 3.
At the end of the day, this is not Nano Banana. Four characters, each with a defined position, a different outfit, and a held object, who also look very similar to one another: in my humble opinion, you are asking too much if you want it not to bleed. That's the kind of thing you want to do with regional prompting, if at all.


