Commit fbf8f48

voletiv (Vikram Voleti) and Vikram Voleti authored
Adds controlnet images, updates README (#21)
* Adds blur and depth images
* Cosmetic changes to README

Co-authored-by: Vikram Voleti <[email protected]>
1 parent 6b8be52 commit fbf8f48


3 files changed, +17 -16 lines changed


Diff for: README.md

+17 -16
@@ -6,13 +6,14 @@ Contains code for the text encoders (OpenAI CLIP-L/14, OpenCLIP bigG, Google T5-

 Note: this repo is a reference library meant to assist partner organizations in implementing SD3.5/SD3. For alternate inference, use [Comfy](https://github.com/comfyanonymous/ComfyUI).

-### Updates
+## Updates

+- Nov 26, 2024 : Released ControlNets for SD3.5-Large.
 - Oct 29, 2024 : Released inference code for SD3.5-Medium.
 - Oct 24, 2024 : Updated code license to MIT License.
 - Oct 22, 2024 : Released inference code for SD3.5-Large, Large-Turbo. Also works on SD3-Medium.

-### Download
+## Download

 Download the following models from HuggingFace into `models` directory:
 1. [Stability AI SD3.5 Large](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/sd3.5_large.safetensors) or [Stability AI SD3.5 Large Turbo](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo/blob/main/sd3.5_large_turbo.safetensors) or [Stability AI SD3.5 Medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium/blob/main/sd3.5_medium.safetensors)
@@ -22,12 +23,12 @@ Download the following models from HuggingFace into `models` directory:

 This code also works for [Stability AI SD3 Medium](https://huggingface.co/stabilityai/stable-diffusion-3-medium/blob/main/sd3_medium.safetensors).

-#### ControlNets
+### ControlNets

 Optionally, download [SD3.5 Large ControlNets](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets):
-(a) [Blur ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/blur_8b.safetensors)
-(b) [Canny ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/canny_8b.safetensors)
-(c) [Depth ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/depth_8b.safetensors)
+- [Blur ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/blur_8b.safetensors)
+- [Canny ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/canny_8b.safetensors)
+- [Depth ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets/resolve/main/depth_8b.safetensors)

 ```py
 from huggingface_hub import hf_hub_download
@@ -36,7 +37,7 @@ hf_hub_download("stabilityai/stable-diffusion-3.5-controlnets", "sd3.5_large_con
 hf_hub_download("stabilityai/stable-diffusion-3.5-controlnets", "sd3.5_large_controlnet_depth.safetensors", local_dir="models")
 ```

-### Install
+## Install

 ```sh
 # Note: on windows use "python" not "python3"
@@ -46,7 +47,7 @@ source .sd3.5/bin/activate
 python3 -s -m pip install -r requirements.txt
 ```

-### Run
+## Run

 ```sh
 # Generate a cat using SD3.5 Large model (at models/sd3.5_large.safetensors) with its default settings
@@ -74,25 +75,25 @@ Optionally, use [Skip Layer Guidance](https://github.com/comfyanonymous/ComfyUI/
 python3 sd3_infer.py --prompt path/to/my_prompts.txt --model models/sd3.5_medium.safetensors --skip_layer_cfg True
 ```

-#### ControlNets
+### ControlNets

 To use SD3.5 Large ControlNets, additionally download your chosen ControlNet model from the [model repository](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets), then run inference, like so:
-(a) Blur:
+- Blur:
 ```sh
 python sd3_infer.py --model models/sd3.5_large.safetensors --controlnet_ckpt models/sd3.5_large_controlnet_blur.safetensors --controlnet_cond_image inputs/blur.png --prompt "generated ai art, a tiny, lost rubber ducky in an action shot close-up, surfing the humongous waves, inside the tube, in the style of Kelly Slater"
 ```
-(b) Canny:
+- Canny:
 ```sh
 python sd3_infer.py --model models/sd3.5_large.safetensors --controlnet_ckpt models/sd3.5_large_controlnet_canny.safetensors --controlnet_cond_image inputs/canny.png --prompt "A Night time photo taken by Leica M11, portrait of a Japanese woman in a kimono, looking at the camera, Cherry blossoms"
 ```
-(c) Depth:
+- Depth:
 ```sh
 python sd3_infer.py --model models/sd3.5_large.safetensors --controlnet_ckpt models/sd3.5_large_controlnet_depth.safetensors --controlnet_cond_image inputs/depth.png --prompt "photo of woman, presumably in her mid-thirties, striking a balanced yoga pose on a rocky outcrop during dusk or dawn. She wears a light gray t-shirt and dark leggings. Her pose is dynamic, with one leg extended backward and the other bent at the knee, holding the moon close to her hand."
 ```

 For details on preprocessing for each of the ControlNets, and examples, please review the [model card](https://huggingface.co/stabilityai/stable-diffusion-3.5-controlnets).

-### File Guide
+## File Guide

 - `sd3_infer.py` - entry point, review this for basic usage of diffusion model
 - `sd3_impls.py` - contains the wrapper around the MMDiTX and the VAE
@@ -104,7 +105,7 @@ For details on preprocessing for each of the ControlNets, and examples, please r
 - `t5xxl.safetensors` (google T5-v1.1-XXL, can grab a public copy)
 - `sd3.5_large.safetensors` or `sd3.5_large_turbo.safetensors` or `sd3.5_medium.safetensors` (or `sd3_medium.safetensors`)

-### Code Origin
+## Code Origin

 The code included here originates from:
 - Stability AI internal research code repository (MM-DiT)
@@ -113,10 +114,10 @@ The code included here originates from:
 - Some code from ComfyUI internal Stability implementation of SD3 (for some code corrections and handlers)
 - HuggingFace and upstream providers (for sections of CLIP/T5 code)

-### Legal
+## Legal

 Check the LICENSE-CODE file.

-#### Note
+### Note

 Some code in `other_impls` originates from HuggingFace and is subject to [the HuggingFace Transformers Apache2 License](https://github.com/huggingface/transformers/blob/main/LICENSE)
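
The optional ControlNet download step in the README diff above can be scripted. What follows is a minimal sketch, not the repository's own tooling. It assumes the checkpoint filenames follow the `sd3.5_large_controlnet_<type>.safetensors` pattern seen in the README's `hf_hub_download` call and run commands; the link targets in the same README use different names (for example `blur_8b.safetensors`), so check the model repository before relying on it. The helper names `controlnet_filename` and `download_controlnet` are hypothetical, and the repository may require authentication.

```python
"""Sketch: fetch one SD3.5 Large ControlNet checkpoint into `models/`."""

REPO_ID = "stabilityai/stable-diffusion-3.5-controlnets"
CONTROLNET_TYPES = ("blur", "canny", "depth")


def controlnet_filename(kind: str) -> str:
    """Return the checkpoint filename for a ControlNet type (assumed naming pattern)."""
    if kind not in CONTROLNET_TYPES:
        raise ValueError(
            f"unknown ControlNet type {kind!r}; expected one of {CONTROLNET_TYPES}"
        )
    return f"sd3.5_large_controlnet_{kind}.safetensors"


def download_controlnet(kind: str, local_dir: str = "models") -> str:
    """Download the checkpoint and return its local path (needs network access)."""
    # Imported lazily so the filename helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(REPO_ID, controlnet_filename(kind), local_dir=local_dir)
```

Calling `download_controlnet("depth")` would request the same file as the `hf_hub_download` line shown in the README.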

Diff for: inputs/blur.png

538 KB

Diff for: inputs/depth.png

157 KB
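
The two images added in this commit, `inputs/blur.png` and `inputs/depth.png`, are the conditioning inputs referenced by the README's ControlNet commands. Those commands differ only in the checkpoint, the conditioning image, and the prompt, so they can be assembled programmatically. A small sketch, using only the flags that appear in the README (`--model`, `--controlnet_ckpt`, `--controlnet_cond_image`, `--prompt`); the function name `build_controlnet_command` and its defaults are illustrative assumptions:

```python
import shlex


def build_controlnet_command(
    kind: str,
    prompt: str,
    model: str = "models/sd3.5_large.safetensors",
    python: str = "python",
) -> list[str]:
    """Build the sd3_infer.py argument list for one ControlNet type.

    Paths mirror the README examples; adjust them if your files live elsewhere.
    """
    return [
        python, "sd3_infer.py",
        "--model", model,
        "--controlnet_ckpt", f"models/sd3.5_large_controlnet_{kind}.safetensors",
        "--controlnet_cond_image", f"inputs/{kind}.png",
        "--prompt", prompt,
    ]


if __name__ == "__main__":
    cmd = build_controlnet_command("depth", "photo of a woman in a yoga pose at dusk")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To execute it, pass `cmd` to subprocess.run(cmd, check=True) from the repo root.
```

Returning a list rather than a single string keeps the prompt as one argument, so quotes and spaces inside it need no manual escaping.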
