SDXL TextEncoder 1 & 2 #918
Answered
by
kohya-ss
caniyabanci76
asked this question in
Q&A
Does TE1 = CLIP G and TE2 = CLIP L?
Answered by
kohya-ss
Oct 31, 2023
Answer selected by
caniyabanci76
Sorry for the lack of documentation. Text Encoder 1 = ViT-L (768 dims) and Text Encoder 2 = BiG-G (1280 dims).
This is because the SDXL state dict has `conditioner.embedders.0` keys for ViT-L and `conditioner.embedders.1` keys for BiG-G.
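
To make the mapping concrete, here is a minimal sketch that groups state dict keys by their `conditioner.embedders.N` prefix. The function name `group_keys_by_text_encoder` and the key suffixes in `example_keys` are placeholders invented for illustration; only the two prefixes and the encoder names come from the answer above.

```python
from collections import defaultdict

# Mapping as stated in the answer: embedder index -> text encoder.
EMBEDDER_TO_ENCODER = {
    0: "Text Encoder 1 (ViT-L, 768 dims)",
    1: "Text Encoder 2 (BiG-G, 1280 dims)",
}
PREFIX = "conditioner.embedders."


def group_keys_by_text_encoder(keys):
    """Group state dict keys by the text encoder their prefix points to.

    Hypothetical helper for illustration; keys outside the two known
    embedder prefixes are ignored.
    """
    groups = defaultdict(list)
    for key in keys:
        if not key.startswith(PREFIX):
            continue
        # The first path component after the prefix is the embedder index.
        index_str = key[len(PREFIX):].split(".", 1)[0]
        if index_str.isdigit() and int(index_str) in EMBEDDER_TO_ENCODER:
            groups[EMBEDDER_TO_ENCODER[int(index_str)]].append(key)
    return dict(groups)


# Placeholder keys: the suffixes after the embedder index are made up.
example_keys = [
    "conditioner.embedders.0.some_layer.weight",
    "conditioner.embedders.1.some_layer.weight",
    "conditioner.embedders.1.another_layer.bias",
    "unrelated.module.weight",
]

for encoder, keys in group_keys_by_text_encoder(example_keys).items():
    print(encoder, len(keys))
```

With a real checkpoint, the keys of the loaded state dict could be passed in place of `example_keys`.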