Phr00t/Qwen-Image-Edit-Rapid-AIO

V22 NSFW Model Feedback: Breakthrough in Human Skin & Text, Yet Some Hurdles Remain

#267

by folkrock - opened Jan 23

Jan 23

Okay, I'll write another review, this time for version 22, the NSFW variation.

The first thing I noticed is the skin. Yes, it has become simply magnificent and seems to have exceeded my expectations. It's better than in version 18.1, which was previously my favorite for character rendering. Secondly, and I discovered this completely by accident, this is the first version of your model that can write in Russian. Any previous versions just butchered Cyrillic letters or words, but now I can generate Russian text or Cyrillic letters just as well as in the standard Qwen Image.

Now, about what hasn't quite worked for me yet, or what I've had to patch with workarounds. First, saturation and contrast: for some reason they've skyrocketed. Of course, this isn't critical. I tried multiplying the latent by a value <1, which brings saturation and contrast closer to normal, but then the model starts hallucinating more, which isn't good. I also tried gradually lowering the cfg norm, down to 0.9, which also reduces the exaggerated saturation and contrast, but character consistency gets worse. For now, I've settled on decoding the latent and correcting the result with additional post-processing nodes, but this is only a workaround. I don't know what to apply at the latent stage to avoid the hyper-contrast.
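For illustration, here is a minimal sketch of the kind of post-decode correction described above. It is not the actual node setup used in the workflow: the function names and the default factors (0.85 and 0.9) are assumptions chosen for the example, and it works on plain RGB tuples in the 0..1 range using only the Python standard library.

```python
import colorsys

def tone_down(pixel, saturation=0.85, contrast=0.9):
    """Reduce saturation and contrast of one RGB pixel (channels in [0, 1]).

    The default factors are illustrative assumptions, not tuned values.
    """
    r, g, b = pixel
    # Scale saturation in HLS space, leaving hue and lightness untouched.
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    r, g, b = colorsys.hls_to_rgb(h, l, s * saturation)
    # Pull each channel toward mid-gray (0.5) to lower contrast, then clamp.
    return tuple(min(1.0, max(0.0, 0.5 + (c - 0.5) * contrast)) for c in (r, g, b))

def tone_down_image(pixels, saturation=0.85, contrast=0.9):
    """Apply tone_down to every pixel of an image given as rows of RGB tuples."""
    return [[tone_down(p, saturation, contrast) for p in row] for row in pixels]
```

Because this runs on the decoded image, it does not change what the model generates, which is why it avoids the extra hallucinations seen when scaling the latent, but also why it remains a workaround.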

The second point is character consistency. Unfortunately, here I'm also starting to hit a wall where the consistence_edit LoRA is needed. I wouldn't mind that, except that applying any LoRA produces a pixelated pattern all over the image. I managed to partially get rid of it using your node. I looked at how it works and made a slightly expanded version (haven't published it, at least not yet), which can also reduce the pattern. However, applying almost any LoRA, including consistence_edit, with a strength above 0.5-0.6 brings the pattern back, which is sad.

Those are my impressions so far, and at the moment, I don't know how to tackle these problems. But I can say for sure that the development direction chosen for your model is the right one, and this is a huge step forward.

P.S. Translated by the same AI, please put away the slipper you prepared to throw at me for my poor English.

Owner Jan 23

I need concrete examples to add to my test cases. Make sure they do not violate community guidelines.

I do not want to include the "consistency" LoRA. I've tried it before. Sometimes, you don't want consistency, like when prompting for a completely different position or location, or when not doing an edit at all. The consistency LoRA fights against all of that.

Jan 23

> Sometimes, you don't want consistency

Yes, actually, what's almost always needed is not to replicate the input image, but to position the character differently, change the angle, or both (the most common case).

I'm attaching examples: the input image of a girl (her appearance is generated); the generation from version 22 without post-processing, with noticeable over-contrast and elevated saturation, where the face has some resemblance but is further from the original than in the generation with 18.1; the 18.1 generation (third image); and, as a fourth image, the K-sampler settings.

Jan 23

Face swap with base Qwen Image Edit 2511 with 4-step Lightning.

Image edit with base Qwen Image Edit 2511 with 4-step Lightning.

I think this is just a problem of the base model.

Jan 23

> Face swap with base Qwen Image Edit 2511 with 4-step Lightning.
>
> Image edit with base Qwen Image Edit 2511 with 4-step Lightning.
>
> I think this is just a problem of the base model.

The FaceSwap LoRA, unfortunately, forces setting strength to 1, but in that case, the model focuses entirely on the face and almost completely ignores the prompt. This forces a two-step generation process: the first pass is to position the character in a new environment, with new lighting, in a new pose and angle, and the second pass is to correct the face. Alternatively, reducing its strength is an option, but then it's a lottery: the face might resemble the original or it might not—it's a matter of luck.

I've noticed that in version 18.1, the face handling was fine, and changes in appearance were quite rare. I'm not saying that version 22 always changes the face completely, but it happens frequently, and distortions of facial features occur much more often than in version 18.

Most likely, this—like the contrast issue—is the result of one of the LoRAs or a combination of several. I don't know which ones are applied here or at what strength, but my personal thought is that if we experiment with the LoRA strengths, these problems could be reduced.
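One way to organize such an experiment is to fix the seed and vary only the LoRA strengths, so any difference between outputs can be attributed to the strengths alone. The sketch below only builds the list of run configurations; the LoRA names are hypothetical placeholders (the LoRAs actually merged into the model are not known here), and feeding each configuration into a workflow is left out.

```python
from itertools import product

def strength_sweep(lora_names, strengths, seed):
    """Return one run config per combination of LoRA strengths, all sharing one seed."""
    runs = []
    for combo in product(strengths, repeat=len(lora_names)):
        runs.append({"seed": seed, "loras": dict(zip(lora_names, combo))})
    return runs

# Hypothetical LoRA names and strength values, used only for illustration.
runs = strength_sweep(["lora_a", "lora_b"], [0.4, 0.6, 0.8], seed=12345)
for run in runs:
    print(run)
```

With two LoRAs and three strength values this gives nine runs, which is still small enough to compare side by side.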

Owner Jan 24

I was struggling with likeness with this particular woman just as you were. The base model couldn't seem to handle it. I did a few things that helped a little, like cropping her face more tightly and adding in the rest of the "BestFaceSwap" LoRA. This is the best I could do with my latest v23 SFW merge:

I think it is her particularly long and skinny head shape, perhaps?

Other faces look better, even without any additional face swap (also v23 SFW):

(note that the crop on the left is much higher resolution than the face on the jogging picture, hence expected loss of detail)

Jan 24

> I think it is her particularly long and skinny head shape, perhaps?

Frankly, I'm at a loss for words. I tried generating other images of her in different styles, with different lighting, slightly varied angles and emotions, and even in different clothing, but using version 18.1, and here's the result. This isn't a complaint: I truly respect your work and consider this model the most successful one. I'm simply at a dead end right now and don't understand how to achieve the same level of facial consistency as in version 18 on version 22, which is more advanced in all other respects.

UPD: I just noticed version 23 is already out. Okay, I'll take a day or two to test it and write my feedback. Thank you for your hard work.

Jan 24

At this point, I can say that version 23, at first glance, produces fewer facial discrepancies than version 22 and generally handles character positioning and scaling relative to environmental objects better than version 18. However, both versions failed with the pose. Of course, I might have just gotten (un)lucky with the seed, but this time the hyper-contrast showed up in version 18, not in 22 or 23.

In version 18, the face is okay. In version 22, the face is somewhat flattened, unpleasantly so. Version 23 — the generation is almost identical to version 22's, but the face is normal.

The prompt from the screenshot: Completely and entirely remove the man from picture 2; in his place should be the woman from picture 1. The woman must fit into picture 2 as naturally as possible, considering the lighting and tones of the new environment. Be sure to preserve all the characteristics of the woman: facial features, body shape, skin color and texture.

Attached images in order: 18.1, 22, 23, screenshot of the main nodes.

Owner Jan 24

I'd say v23 looks the best, although not that much different than v22. v18 tried to just copy her exactly from the original image, which improves consistency but just looks unnatural (also her hand is tiny).

I've heard of people messing with the "AuraFlow" model node, trying a value of 3.1. I feel like I've messed with it and haven't had much luck, but maybe it's worth messing with in certain situations?

Jan 25

> I feel like I've messed with it and haven't had much luck, but maybe it's worth messing with in certain situations?

Actually, I never really understood what benefits changing the AuraFlow sampling offers. I tried adjusting the values one way or another right at the beginning, when I was just getting familiar with the Qwen Image Edit model, but I couldn't figure out how it might affect prompt adherence.

Instead, I fixed the seed and delved into fine-tuning the prompt. Luckily, Qwen 2.5 VL understands my native language, so there's no need to write in broken English or go to an LLM for translations (which was terribly annoying in SD/Flux and slowed down the process).

But a correct and precise prompt really does work wonders. I've attached an example (replacing a tractor driver with a generated girl in a very specific pose). Notably, version 23 follows the prompt very well; I encountered much less resistance from it than from versions 18 and 19.

Anyway, I deleted all the other versions, freeing up about 100 GB on my SSD :) This really is a powerful version.
