Regional prompting / split-area issues in ComfyUI with Anima

#76

by zison - opened Mar 8

Discussion

zison

Mar 8

I’m testing Anima in ComfyUI and ran into problems when trying to do simple regional prompting / split composition.

What I want to do is very basic:

left side = dog
right side = cat
and ideally stop the regional split in the last 20% of sampling
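To make the "last 20%" requirement concrete, here is a minimal sketch of how the sampler steps would split into a regional phase and a unified phase. The function name `regional_step_split` and its parameters are hypothetical and not part of ComfyUI. It counts plain step indices, which only approximates how a percentage-based timestep range would behave in an actual sampler.

```python
def regional_step_split(total_steps: int, regional_fraction: float = 0.8):
    """Split step indices into (regional_steps, unified_steps).

    Hypothetical helper for illustration only, not a ComfyUI API.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0.0 <= regional_fraction <= 1.0:
        raise ValueError("regional_fraction must be within [0, 1]")
    # First step index that no longer uses the regional prompts.
    cutoff = round(total_steps * regional_fraction)
    return range(0, cutoff), range(cutoff, total_steps)


regional_steps, unified_steps = regional_step_split(20, regional_fraction=0.8)
print(len(regional_steps), list(unified_steps))
```

With 20 steps, the dog/cat regions would apply to the first 16 steps and a single unified prompt to the remaining 4.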

I tried two approaches:

comfyui-prompt-control

using PCLazyTextEncode with regional prompt syntax
this fails with: ValueError: too many values to unpack (expected 2)
the traceback points into comfy/text_encoders/anima.py

ComfyUI core regional conditioning

using Conditioning (Set Area with Percentage) + Conditioning (Combine)
this fails in KSampler with: IndexError: tuple index out of range
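For reference, this is roughly what the left/right areas are meant to describe, assuming the area is given as fractional x, y, width and height values between 0 and 1. The `PixelBox` class and `percent_area_to_box` function are hypothetical names used for illustration. They are not ComfyUI internals and do not reproduce the failing code path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelBox:
    x: int
    y: int
    width: int
    height: int


def percent_area_to_box(image_width, image_height, x, y, width, height):
    """Convert a fractional area (all values in [0, 1]) into a pixel box."""
    for name, value in (("x", x), ("y", y), ("width", width), ("height", height)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0, 1], got {value}")
    left = round(x * image_width)
    top = round(y * image_height)
    # Clamp so the box never extends past the image edge.
    box_width = min(round(width * image_width), image_width - left)
    box_height = min(round(height * image_height), image_height - top)
    return PixelBox(left, top, box_width, box_height)


# Left half = dog, right half = cat, on a 1024x1024 image.
dog_area = percent_area_to_box(1024, 1024, x=0.0, y=0.0, width=0.5, height=1.0)
cat_area = percent_area_to_box(1024, 1024, x=0.5, y=0.0, width=0.5, height=1.0)
print(dog_area, cat_area)
```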

So my question is:

Is regional prompting / area conditioning currently supported with Anima in ComfyUI?
Are these known limitations caused by the Anima text encoder / latent format?
If area-based conditioning is not supported, what is the recommended workflow for simple left-right composition or multi-subject layout?

Environment:

ComfyUI 0.11.1
Windows portable build
Python 3.13.9
RTX 4090

Any guidance would be appreciated.

istpk

Mar 9

The beta nodes Cond Pair Set Props and Cond Pair Combine don't error, but I haven't been able to get a satisfying result out of them. Maybe with some tinkering something could come of it. Here's a blog post with more info: https://blog.comfy.org/p/masking-and-scheduling-lora-and-model-weights

Espamholding

Mar 9

edited Mar 9

Conditioning set mask works just fine for me. (I hope HF doesn't strip metadata?)

Also your comfy seems decently outdated.
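For a simple left/right split, the masks for a mask-based setup could look like the sketch below. It uses plain nested lists to stay dependency-free. In an actual workflow the mask would be an image or tensor fed to the masking node, and `half_masks` is a hypothetical helper, not something ComfyUI provides.

```python
def half_masks(width: int, height: int):
    """Return (left_mask, right_mask) as height x width grids of 0.0/1.0."""
    if width < 2 or height < 1:
        raise ValueError("need width >= 2 and height >= 1")
    split = width // 2
    # 1.0 marks pixels that belong to the region, 0.0 marks everything else.
    left = [[1.0 if col < split else 0.0 for col in range(width)]
            for _ in range(height)]
    # The right mask is the exact complement, so the two never overlap.
    right = [[1.0 - value for value in row] for row in left]
    return left, right


left_mask, right_mask = half_masks(width=8, height=4)
print(left_mask[0])
print(right_mask[0])
```

Because the two masks are complementary, every pixel is covered by exactly one regional prompt.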

synta

Mar 11

Spatial awareness is quite OK with Anima. You don't need regional prompting to tell it to put one thing on the left and another on the right. Just use natural language: "On the left side of the image..."

Espamholding

Mar 16

edited Mar 16

You do need regional prompting in some cases. It minimizes concept bleed, which can get strong without it. Regional prompting and specifying directions in the prompt aren't mutually exclusive either.

I added spaces after every @ to hopefully not email some random person.

Example 1.1 - natural language-ish only, strong concept bleed

@ ootsuka shin'ichirou, 1girl, misaka mikoto on the left, from above, smug, brown hair, brown eyes, short hair, flower hair ornament, sweater vest, tokiwadai school uniform, emblem, school emblem, white shirt, short sleeves, grey skirt, loose white socks, brown loafers, standing, orange tile floor, grass, holding can of soda, in the top right corner is a sketch chibi inset of misaka mikoto in front of a vending machine, angry, fighting, loose socks, skirt, wind, electricity, greyscale, from behind

The prompt bleeds most of the time: it makes the Mikoto on the left angry, introduces a vending machine outside the corner, adds wind and electricity outside the inset, etc. Also, because the artist is an LN artist, it frequently makes the entire right half white.

Repeating to the model what's on the left/top right helps a bit, but I can't see that working super well for every case of concept bleed, like the long white page-like sections.

Example 1.2 - regional prompting, minimal concept bleed

@ ootsuka shin'ichirou, 1girl, misaka mikoto on the left, from above, smug, brown hair, brown eyes, short hair, flower hair ornament, sweater vest, tokiwadai school uniform, emblem, school emblem, white shirt, short sleeves, grey skirt, loose white socks, brown loafers, standing, orange tile floor, grass, holding can of soda

@ ootsuka shin'ichirou, sketch, 1girl, misaka mikoto in the top right, standing in front of vending machine, angry, fighting, inset, loose socks, skirt, wind, electricity, greyscale, from behind

The only concept bleed is that, rarely, a degenerate vending machine appears behind the inset. Otherwise there are no long white strips, no emotions bleeding, etc.

Example 2.1 - regional prompting styles

1girl, capella emerada lugnica, from below, shrug \(clothing\), from behind, looking up, micro shorts, dark interior, stone wall, large interior, bright blue sky, glowing blue sky, clouds, jungle on top, very tall jungle trees, light rays, vine, moss, grass on top, (horror \(theme\):1.2), dark bottom, dirt bottom, cape

The top part of the image receives anime screenshot and the bottom part receives sketch, watercolor \(medium\). In general the result just gets closer to what I want, and I can also separate more concepts, like the horror theme only for the bottom or more glow for the top (though I guess I could always just post-process the top for that, too).

Example 2.2 - natural language-ish, styles dilute

1girl, capella emerada lugnica, from below, shrug \(clothing\), from behind, looking up, micro shorts, dark interior, stone wall, large interior, bright blue sky, glowing blue sky, clouds, jungle on top, very tall jungle trees, light rays, vine, moss, grass on top, anime screenshot on top, (horror \(theme\):1.2), dark bottom, dirt bottom, cape. sketch and watercolor \(medium\) bottom

Most images fail to have the anime screenshot style on the top; some get close to it but aren't quite there. The prompt also has to be ordered carefully to even get that sort of composition: if you naively specify the styles at the end of the prompt, you get a 2koma.

zison

25 days ago

> Conditioning set mask works just fine for me. (I hope HF doesn't strip metadata?)
>
> Also your comfy seems decently outdated.

Yeah, I like the old GUI.
