
Feature request: Switch off Unet for DiT #11

Open · moiseshorta opened this issue Feb 6, 2025 · 6 comments

@moiseshorta

Hello,

I've been reading a lot of the SOTA papers on audio and video generation using Rectified Flows, and it seems most now use Transformers instead of U-Nets.

Are there any plans to implement such an architecture change? Transformers seem to improve performance greatly, as in this implementation: https://github.com/cloneofsimo/minRF

It would be great to see it here, as this codebase is very clear to understand. Thanks again for open-sourcing it!
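
For context, the rectified flow objective itself is backbone-agnostic, so swapping the U-Net for a Transformer should only change the model, not the loss. A minimal sketch of that loss as I understand it (plain PyTorch, my own paraphrase, not this repo's actual API):

```python
import torch
import torch.nn.functional as F

def rectified_flow_loss(model, x1):
    # x1: a batch of data; x0: matched Gaussian noise
    x0 = torch.randn_like(x1)

    # sample a time t in [0, 1] per example
    t = torch.rand(x1.shape[0], device = x1.device)
    t_pad = t.view(-1, *([1] * (x1.ndim - 1)))

    # linearly interpolate between noise and data
    xt = (1. - t_pad) * x0 + t_pad * x1

    # the model regresses the constant velocity (x1 - x0)
    return F.mse_loss(model(xt, t), x1 - x0)
```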

@lucidrains
Owner

@moiseshorta yes you are correct, pure attention has basically completely taken over

why not use Simo's implementation instead of the one here? is there anything lacking in his? he's a pretty amazing guy in general

@moiseshorta
Author

As I mentioned, your implementation seems a bit clearer to me than Simo's, although I will give his a try as well :)

@lucidrains
Owner

ohh, ok, I've seen him live code before and was very impressed. but yeah, sure, i can add DiT here, or perhaps just make it x-transformers compatible
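
to give a rough idea, here is a sketch of what that swap could look like. plain PyTorch, all names are placeholders, and it conditions on time with a prepended token rather than proper DiT adaLN-zero, just to keep it short (positional embeddings also omitted for brevity):

```python
import math
import torch
from torch import nn

class SinusoidalTimeEmb(nn.Module):
    def __init__(self, dim):
        super().__init__()
        assert dim % 2 == 0
        self.dim = dim

    def forward(self, t):
        # t in [0, 1]; scale up so the sinusoids span a useful frequency range
        t = t * 1000.
        half = self.dim // 2
        freqs = torch.exp(-math.log(10000.) * torch.arange(half, device = t.device).float() / half)
        args = t[:, None] * freqs[None, :]
        return torch.cat((args.sin(), args.cos()), dim = -1)

class FlowTransformer(nn.Module):
    # placeholder DiT-style denoiser: sequence of tokens in, velocity per token out
    def __init__(self, dim_in, dim = 512, depth = 8, heads = 8):
        super().__init__()
        self.proj_in = nn.Linear(dim_in, dim)
        self.to_time_token = nn.Sequential(SinusoidalTimeEmb(dim), nn.Linear(dim, dim), nn.SiLU())
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first = True, norm_first = True)
        self.transformer = nn.TransformerEncoder(layer, depth)
        self.proj_out = nn.Linear(dim, dim_in)

    def forward(self, x, t):
        # x: (batch, seq, dim_in) tokens, t: (batch,) times in [0, 1]
        tokens = self.proj_in(x)
        time_token = self.to_time_token(t).unsqueeze(1)
        tokens = torch.cat((time_token, tokens), dim = 1)   # prepend time as a token
        out = self.transformer(tokens)[:, 1:]               # drop the time token
        return self.proj_out(out)
```

the `rectified_flow_loss` above could then call a `FlowTransformer` directly, with the data flattened into a (batch, seq, dim) sequence of patch tokens first (patchify omitted here)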

@moiseshorta
Author

yeah, that would be great. I'm currently trying to scale up your implementation, but I seem to be either overfitting very quickly or getting NaN/Inf gradients while training on a bigger dataset...

@lucidrains
Owner

@moiseshorta overfitting vs divergence are two very different things, and there are tricks of the trade for both of those issues.

sounds good, give me some time. too many projects, but i do want to flesh out this repo a bit more, or dogfood other libraries

@moiseshorta
Author

Yes, definitely. In some runs there is overfitting, which I've tried mitigating with dropout; in others, divergence, which a lower LR has mitigated. Any other advice is greatly appreciated! Looking forward to the implementation.
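
For anyone else hitting the NaN/Inf issue, the generic stopgaps I'm trying are gradient clipping, a short LR warmup, and skipping non-finite steps. Standard PyTorch, nothing specific to this repo, and `model`, `optimizer`, and `dataloader` are assumed from an ordinary training loop:

```python
import torch

# linear warmup from 1% of the base LR over the first 1000 steps
scheduler = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor = 0.01, total_iters = 1000)

for x1 in dataloader:
    loss = rectified_flow_loss(model, x1)

    # skip the step entirely if the loss has already gone non-finite
    if not torch.isfinite(loss):
        optimizer.zero_grad()
        continue

    loss.backward()

    # clip the global gradient norm, usually the first thing to try against divergence
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm = 1.0)

    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```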
