Huggingface demo is bad?
I'm getting very bad, low-detail results from the Hugging Face demo. None of the outputs looks as good as the images posted on the model homepage.
The images on the GitHub page are from the HiDream-ai/HiDream-O1-Image model, while the Hugging Face space runs the Dev model. Moreover, the Hugging Face space does not use flash-attn, which may cause different results. We are preparing new Hugging Face spaces, which should be ready today.
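If you want to reproduce the GitHub results locally, enabling flash-attn for the Llama text encoder looks roughly like the sketch below. This uses the standard transformers `attn_implementation` flag; the checkpoint ID and flags are assumptions based on common usage, not the exact code running in the space.

```python
# Minimal sketch (assumption, not the actual space code): loading the Llama
# text encoder with FlashAttention-2 enabled. Requires the flash-attn package,
# a compatible GPU, and gated access to the Llama weights; use "sdpa" instead
# if flash-attn is not installed.
import torch
from transformers import LlamaForCausalLM

text_encoder = LlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",   # assumed text encoder checkpoint
    output_hidden_states=True,                 # the pipeline consumes hidden states, not logits
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",   # the setting the demo reportedly omits
)
```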
Please add image edit examples; they are missing.
I tested HiDream-ai/HiDream-O1-Image at home yesterday on an RTX 3090, using the scripts provided on GitHub and tuning the image generation parameters in different ways.
THE GOOD
The fact that it is a single model makes it a lot easier to use than conventional Stable Diffusion pipelines.
It also seems fast and robust, even on consumer-grade hardware.
Prompt adherence is probably better than in any open-source model I have tried thus far, which means you can reach the specific composition you want in fewer iterations.
It produces high-resolution images with modest resources.
THE BAD
However, image quality is not on par with the best open-source models. There is a general lack of realism and some colour oversaturation. The latter can be mitigated by tweaking some parameters (a sketch follows below), but the lack of photorealism remains. In addition, faces and anatomy are still an issue with this model, to a greater extent than with the top open-source competitors. It is true that most pro models also struggle with crowds, but this one is not SOTA in that respect either.
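For reference, this is roughly the kind of tweak I mean. A minimal sketch assuming the diffusers HiDreamImagePipeline integration and the checkpoint named above; lowering the guidance scale to tame saturation is a general diffusion rule of thumb, not an official recommendation.

```python
# Sketch of the parameter sweep described above (pipeline class and repo IDs
# are assumptions based on the public integration, not verified settings).
import torch
from transformers import LlamaForCausalLM, PreTrainedTokenizerFast
from diffusers import HiDreamImagePipeline

LLAMA_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# The Llama text encoder is loaded separately because its weights are gated.
tokenizer_4 = PreTrainedTokenizerFast.from_pretrained(LLAMA_ID)
text_encoder_4 = LlamaForCausalLM.from_pretrained(
    LLAMA_ID, output_hidden_states=True, torch_dtype=torch.bfloat16
)

pipe = HiDreamImagePipeline.from_pretrained(
    "HiDream-ai/HiDream-O1-Image",   # the checkpoint named in this thread
    tokenizer_4=tokenizer_4,
    text_encoder_4=text_encoder_4,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()      # keeps 24 GB cards like the 3090 within budget

image = pipe(
    "a candid street photo, natural light, muted colours",
    height=1024,
    width=1024,
    guidance_scale=3.0,              # lower than the usual ~5.0 to reduce oversaturation
    num_inference_steps=50,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
image.save("out.png")
```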
Hopefully, community LoRAs will alleviate these issues.
I suspect that the Arena results were obtained with a much bigger pro model (Peanut) and not with the open-source one.
THE BOTTOM LINE
In summary, great progress, promising architecture, but not quite there yet.