Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add OmniGen #10148

Merged
merged 58 commits into from
Feb 11, 2025
Merged

Add OmniGen #10148

merged 58 commits into from
Feb 11, 2025

Conversation

staoxiao
Copy link
Contributor

@staoxiao staoxiao commented Dec 8, 2024

What does this PR do?

Add a new pipeline along with corresponding tests and documentation.

Fixes #9873

Before submitting

@sayakpaul

@hlky
Copy link
Collaborator

hlky commented Dec 8, 2024

Hi @staoxiao! Thanks for your contribution 🤗 Could you run make style?

cc @stevhliu can you take a look at the docs?

cc @asomoza can you help test the pipelines?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@staoxiao
Copy link
Contributor Author

staoxiao commented Dec 8, 2024

Hi, @hlky , I ran 'make style' and modified my code, but it seems there are some errors in the other original files that I haven't changed.
image

@hlky
Copy link
Collaborator

hlky commented Dec 8, 2024

@staoxiao This is due to ruff version, try pip install ruff==0.1.5 or pip install -e ".[dev]" in Diffusers

Copy link
Member

@stevhliu stevhliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contributing such a cool pipeline and complete docs!

Copy link
Collaborator

@yiyixuxu yiyixuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh thanks for your PR!
super excited to have this in diffusers. My main feedbacks is that we cannot import the Phi3Model from transformers and use it as a block in diffusers. I left some comment on how to rewrite and fit into diffusers code. Let us know if you need any help!

Copy link
Member

@a-r-r-o-w a-r-r-o-w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @staoxiao! Apologies for the delay in reviews

Just some nits and refactors required to follow latest design choices in diffusers. A good reference would be the HunyuanVideo PR: #10136

One other remaining requirement is the minimal transformer modeling test. This would be a good example. Happy to help make any of the changes 🤗

@staoxiao
Copy link
Contributor Author

staoxiao commented Feb 8, 2025

Thanks for all suggestions and I have updated the code!

@nitinmukesh
Copy link

nitinmukesh commented Feb 9, 2025

Working good so far. Just observed 1 issue.

With use_input_image_size_as_output as True

In Diffuser version [EDIT: checked HF space the issue is there as well], if I don't select an image and do prompt infer only, it throws an error. Solution is there (in case someone else face this problem NoneType), add use_input_image_size_as_output conditionally

if input_images is not None and len(input_images) > 0:
    inference_params["use_input_image_size_as_output"] = use_input_image_size_as_output

zhang

56087529

The woman in <|image_1|> and boy in <|image_2|> are holding hand and walking on street.
20250209_160150_157648_omnigen

@staoxiao
Copy link
Contributor Author

staoxiao commented Feb 9, 2025

@nitinmukesh, use_input_image_size_as_output means setting the output image size to be consistent with the input image size. When using text-to-image generation, there is no need to set this parameter. To avoid this issue, I have added a check for this parameter in the check_inputs function.

@nitinmukesh
Copy link

Thank you @staoxiao .
I will modify at my end and set it to False for text2image generation.

Copy link
Collaborator

@yiyixuxu yiyixuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@yiyixuxu
Copy link
Collaborator

@staoxiao can you run make style and make fix-copies? so that the quality test would pass on our CI

@a-r-r-o-w, can you take a look again to see if all your comments are addressed? we can merge after that:)

Copy link
Member

@a-r-r-o-w a-r-r-o-w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @staoxiao for addressing the reviews! The PR looks good to merge. There are some things that we could refactor further to make consistent with diffusers-style implementation, but it is mostly just nitpicking -- we can do a follow-up to address this.

Really sorry for the long wait!

There seems to be some tests that are failing. Could you look at them? Happy to help fix them if you're busy

failing tests

https://github.com/huggingface/diffusers/actions/runs/13254834397/job/37014475117?pr=10148

FAILED tests/pipelines/omnigen/test_pipeline_omnigen.py::OmniGenPipelineFastTests::test_callback_cfg - AttributeError: 'OmniGenPipeline' object has no attribute 'num_timesteps'
FAILED tests/pipelines/omnigen/test_pipeline_omnigen.py::OmniGenPipelineFastTests::test_callback_inputs - assert tensor(18.2633) == 0
 +  where tensor(18.2633) = <built-in method sum of Tensor object at 0x7fed4b365490>()
 +    where <built-in method sum of Tensor object at 0x7fed4b365490> = tensor([[[[0.2224, 1.5052],\n          [1.5705, 0.7162]],\n\n         [[0.5437, 0.5603],\n          [0.5271, 1.9422]],\n\n         [[1.4330, 0.6505],\n          [0.3135, 0.6654]],\n\n         [[1.2489, 2.0913],\n          [3.4383, 0.8350]]]]).sum
 +      where tensor([[[[0.2224, 1.5052],\n          [1.5705, 0.7162]],\n\n         [[0.5437, 0.5603],\n          [0.5271, 1.9422]],\n\n         [[1.4330, 0.6505],\n          [0.3135, 0.6654]],\n\n         [[1.2489, 2.0913],\n          [3.4383, 0.8350]]]]) = <built-in method abs of Tensor object at 0x7fed4b364540>()
 +        where <built-in method abs of Tensor object at 0x7fed4b364540> = tensor([[[[-0.2224, -1.5052],\n          [ 1.5705,  0.7162]],\n\n         [[ 0.5437, -0.5603],\n          [ 0.5271,  1.9422]],\n\n         [[-1.4330,  0.6505],\n          [ 0.3135, -0.6654]],\n\n         [[ 1.2489,  2.0913],\n          [-3.4383, -0.8350]]]]).abs
FAILED tests/pipelines/omnigen/test_pipeline_omnigen.py::OmniGenPipelineFastTests::test_inference_batch_single_identical - Failed: Timeout >60.0s
FAILED tests/pipelines/omnigen/test_pipeline_omnigen.py::OmniGenPipelineFastTests::test_loading_with_variants - OSError: Error no file named diffusion_pytorch_model.fp16.bin found in directory /tmp/tmp1u_zu527/transformer.
FAILED tests/pipelines/omnigen/test_pipeline_omnigen.py::OmniGenPipelineFastTests::test_num_images_per_prompt - RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 4 for tensor number 1 in the list.
FAILED tests/pipelines/omnigen/test_pipeline_omnigen.py::OmniGenPipelineFastTests::test_save_load_dduf - Failed: Timeout >60.0s
FAILED tests/pipelines/omnigen/test_pipeline_omnigen.py::OmniGenPipelineFastTests::test_save_load_local - OSError: Error no file named diffusion_pytorch_model.bin found in directory /tmp/tmpwmjdsdca/transformer.
FAILED tests/pipelines/omnigen/test_pipeline_omnigen.py::OmniGenPipelineFastTests::test_save_load_optional_components - OSError: Error no file named diffusion_pytorch_model.bin found in directory /tmp/tmpxrfny9bf/transformer.

@staoxiao
Copy link
Contributor Author

Hi, @a-r-r-o-w , I have fixed some of the failing tests. However, the remaining failing tests need your assistance to resolve, for example, OmniGen does not have an fp16 version for the function test_loading_with_variants(OmniGen cannot support fp16).

@a-r-r-o-w
Copy link
Member

@staoxiao Seems like that might be a local environment error, as it is passing for my environment and on our CI.

Merging since the currently failing tests look unrelated. Thanks Shitao! :)

@a-r-r-o-w a-r-r-o-w merged commit 798e171 into huggingface:main Feb 11, 2025
10 of 12 checks passed
@a-r-r-o-w a-r-r-o-w mentioned this pull request Feb 11, 2025
@nitinmukesh
Copy link

Thank you @staoxiao and @a-r-r-o-w .

@tin2tin
Copy link

tin2tin commented Feb 12, 2025

Yes, thank you @staoxiao and @a-r-r-o-w I've quickly added OmniGen to a free Blender add-on, I'm sometimes working on. And posted about it on Reddit. I hope it's okay, and it'll give your hard work some exposure. Here's the video I did in full res:

OmniGen.mp4

@sayakpaul
Copy link
Member

Keep doing it @tin2tin! We appreciate the good work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
close-to-merge roadmap Add to current release roadmap
Projects
Development

Successfully merging this pull request may close these issues.

Add OmniGen: A Unified Image Generation Model Pipeline
10 participants