
Commit 98730c5

* Fix typos
* Trim trailing whitespaces
* Remove a trailing whitespace
* chore: Update MarigoldDepthPipeline checkpoint to prs-eth/marigold-lcm-v1-0
* Revert "chore: Update MarigoldDepthPipeline checkpoint to prs-eth/marigold-lcm-v1-0" (this reverts commit fd742b3)
* pokemon -> naruto
* `DPMSolverMultistep` -> `DPMSolverMultistepScheduler`
* Improve Markdown stylization
* Improve style
* Improve style
* Refactor pipeline variable names for consistency
* up style
1 parent 7ebd359 commit 98730c5

35 files changed (+186, -191 lines)


docs/source/en/api/pipelines/amused.md (+1, -1)
@@ -16,7 +16,7 @@ aMUSEd was introduced in [aMUSEd: An Open MUSE Reproduction](https://huggingface
 
 Amused is a lightweight text to image model based off of the [MUSE](https://arxiv.org/abs/2301.00704) architecture. Amused is particularly useful in applications that require a lightweight and fast model such as generating many images quickly at once.
 
-Amused is a vqvae token based transformer that can generate an image in fewer forward passes than many diffusion models. In contrast with muse, it uses the smaller text encoder CLIP-L/14 instead of t5-xxl. Due to its small parameter count and few forward pass generation process, amused can generate many images quickly. This benefit is seen particularly at larger batch sizes. 
+Amused is a vqvae token based transformer that can generate an image in fewer forward passes than many diffusion models. In contrast with muse, it uses the smaller text encoder CLIP-L/14 instead of t5-xxl. Due to its small parameter count and few forward pass generation process, amused can generate many images quickly. This benefit is seen particularly at larger batch sizes.
 
 The abstract from the paper is:
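
Since the paragraph above turns on aMUSEd needing only a few forward passes, a minimal usage sketch may help; it assumes the `amused/amused-256` Hub checkpoint and is illustrative, not part of this commit.

```python
# Minimal sketch: text-to-image with aMUSEd.
# Assumption: the amused/amused-256 checkpoint (use amused/amused-512 for 512px).
import torch
from diffusers import AmusedPipeline

pipe = AmusedPipeline.from_pretrained(
    "amused/amused-256", variant="fp16", torch_dtype=torch.float16
).to("cuda")

# Token-based generation completes in few steps compared to most diffusion models.
image = pipe("a cowboy riding a horse", num_inference_steps=12).images[0]
image.save("amused_256.png")
```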

docs/source/en/api/pipelines/kandinsky3.md (+2, -2)
@@ -11,12 +11,12 @@ specific language governing permissions and limitations under the License.
 
 Kandinsky 3 is created by [Vladimir Arkhipkin](https://github.com/oriBetelgeuse),[Anastasia Maltseva](https://github.com/NastyaMittseva),[Igor Pavlov](https://github.com/boomb0om),[Andrei Filatov](https://github.com/anvilarth),[Arseniy Shakhmatov](https://github.com/cene555),[Andrey Kuznetsov](https://github.com/kuznetsoffandrey),[Denis Dimitrov](https://github.com/denndimitrov), [Zein Shaheen](https://github.com/zeinsh)
 
-The description from it's Github page: 
+The description from it's Github page:
 
 *Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.*
 
 Its architecture includes 3 main components:
-1. [FLAN-UL2](https://huggingface.co/google/flan-ul2), which is an encoder decoder model based on the T5 architecture. 
+1. [FLAN-UL2](https://huggingface.co/google/flan-ul2), which is an encoder decoder model based on the T5 architecture.
 2. New U-Net architecture featuring BigGAN-deep blocks doubles depth while maintaining the same number of parameters.
 3. Sber-MoVQGAN is a decoder proven to have superior results in image restoration.
 
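
As a companion to the architecture summary above, a short text-to-image sketch, assuming the `kandinsky-community/kandinsky-3` checkpoint and the `AutoPipelineForText2Image` entry point; it is illustrative, not part of the diff.

```python
# Sketch: Kandinsky 3 text-to-image via the auto pipeline.
# Assumption: the kandinsky-community/kandinsky-3 checkpoint.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-3", variant="fp16", torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()  # the FLAN-UL2 text encoder is large

prompt = "a photograph of the inside of a subway train filled with raccoons"
image = pipe(prompt, num_inference_steps=25).images[0]
```
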
docs/source/en/api/pipelines/ledits_pp.md (+3, -3)
@@ -25,11 +25,11 @@ You can find additional information about LEDITS++ on the [project page](https:/
 </Tip>
 
 <Tip warning={true}>
-Due to some backward compatability issues with the current diffusers implementation of [`~schedulers.DPMSolverMultistepScheduler`] this implementation of LEdits++ can no longer guarantee perfect inversion. 
-This issue is unlikely to have any noticeable effects on applied use-cases. However, we provide an alternative implementation that guarantees perfect inversion in a dedicated [GitHub repo](https://github.com/ml-research/ledits_pp). 
+Due to some backward compatability issues with the current diffusers implementation of [`~schedulers.DPMSolverMultistepScheduler`] this implementation of LEdits++ can no longer guarantee perfect inversion.
+This issue is unlikely to have any noticeable effects on applied use-cases. However, we provide an alternative implementation that guarantees perfect inversion in a dedicated [GitHub repo](https://github.com/ml-research/ledits_pp).
 </Tip>
 
-We provide two distinct pipelines based on different pre-trained models. 
+We provide two distinct pipelines based on different pre-trained models.
 
 ## LEditsPPPipelineStableDiffusion
 [[autodoc]] pipelines.ledits_pp.LEditsPPPipelineStableDiffusion
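
The inversion caveat above is easier to see against the pipeline's two-stage workflow; a sketch, assuming a Stable Diffusion v1.5 checkpoint and a placeholder image URL.

```python
# Sketch of the LEdits++ workflow: invert the input image, then edit it.
import torch
from diffusers import LEditsPPPipelineStableDiffusion
from diffusers.utils import load_image

pipe = LEditsPPPipelineStableDiffusion.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = load_image("https://example.com/input.jpg")  # placeholder URL

# Inversion step; per the warning above, reconstruction is close but
# not guaranteed to be perfect with DPMSolverMultistepScheduler.
_ = pipe.invert(image=image, num_inversion_steps=50, skip=0.1)

edited = pipe(
    editing_prompt=["cherry blossom"],
    edit_guidance_scale=10.0,
    edit_threshold=0.75,
).images[0]
```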

docs/source/en/api/pipelines/marigold.md (+10, -10)
@@ -14,10 +14,10 @@ specific language governing permissions and limitations under the License.
 
 ![marigold](https://marigoldmonodepth.github.io/images/teaser_collage_compressed.jpg)
 
-Marigold was proposed in [Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation](https://huggingface.co/papers/2312.02145), a CVPR 2024 Oral paper by [Bingxin Ke](http://www.kebingxin.com/), [Anton Obukhov](https://www.obukhov.ai/), [Shengyu Huang](https://shengyuh.github.io/), [Nando Metzger](https://nandometzger.github.io/), [Rodrigo Caye Daudt](https://rcdaudt.github.io/), and [Konrad Schindler](https://scholar.google.com/citations?user=FZuNgqIAAAAJ&hl=en). 
-The idea is to repurpose the rich generative prior of Text-to-Image Latent Diffusion Models (LDMs) for traditional computer vision tasks. 
-Initially, this idea was explored to fine-tune Stable Diffusion for Monocular Depth Estimation, as shown in the teaser above. 
-Later, 
+Marigold was proposed in [Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation](https://huggingface.co/papers/2312.02145), a CVPR 2024 Oral paper by [Bingxin Ke](http://www.kebingxin.com/), [Anton Obukhov](https://www.obukhov.ai/), [Shengyu Huang](https://shengyuh.github.io/), [Nando Metzger](https://nandometzger.github.io/), [Rodrigo Caye Daudt](https://rcdaudt.github.io/), and [Konrad Schindler](https://scholar.google.com/citations?user=FZuNgqIAAAAJ&hl=en).
+The idea is to repurpose the rich generative prior of Text-to-Image Latent Diffusion Models (LDMs) for traditional computer vision tasks.
+Initially, this idea was explored to fine-tune Stable Diffusion for Monocular Depth Estimation, as shown in the teaser above.
+Later,
 - [Tianfu Wang](https://tianfwang.github.io/) trained the first Latent Consistency Model (LCM) of Marigold, which unlocked fast single-step inference;
 - [Kevin Qu](https://www.linkedin.com/in/kevin-qu-b3417621b/?locale=en_US) extended the approach to Surface Normals Estimation;
 - [Anton Obukhov](https://www.obukhov.ai/) contributed the pipelines and documentation into diffusers (enabled and supported by [YiYi Xu](https://yiyixuxu.github.io/) and [Sayak Paul](https://sayak.dev/)).
@@ -28,7 +28,7 @@ The abstract from the paper is:
 
 ## Available Pipelines
 
-Each pipeline supports one Computer Vision task, which takes an input RGB image as input and produces a *prediction* of the modality of interest, such as a depth map of the input image. 
+Each pipeline supports one Computer Vision task, which takes an input RGB image as input and produces a *prediction* of the modality of interest, such as a depth map of the input image.
 Currently, the following tasks are implemented:
 
 | Pipeline | Predicted Modalities | Demos |
@@ -39,7 +39,7 @@ Currently, the following tasks are implemented:
 
 ## Available Checkpoints
 
-The original checkpoints can be found under the [PRS-ETH](https://huggingface.co/prs-eth/) Hugging Face organization. 
+The original checkpoints can be found under the [PRS-ETH](https://huggingface.co/prs-eth/) Hugging Face organization.
 
 <Tip>
 
@@ -49,11 +49,11 @@ Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers)
 
 <Tip warning={true}>
 
-Marigold pipelines were designed and tested only with `DDIMScheduler` and `LCMScheduler`. 
+Marigold pipelines were designed and tested only with `DDIMScheduler` and `LCMScheduler`.
 Depending on the scheduler, the number of inference steps required to get reliable predictions varies, and there is no universal value that works best across schedulers.
-Because of that, the default value of `num_inference_steps` in the `__call__` method of the pipeline is set to `None` (see the API reference). 
-Unless set explicitly, its value will be taken from the checkpoint configuration `model_index.json`. 
-This is done to ensure high-quality predictions when calling the pipeline with just the `image` argument. 
+Because of that, the default value of `num_inference_steps` in the `__call__` method of the pipeline is set to `None` (see the API reference).
+Unless set explicitly, its value will be taken from the checkpoint configuration `model_index.json`.
+This is done to ensure high-quality predictions when calling the pipeline with just the `image` argument.
 
 </Tip>
 
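
To make the `num_inference_steps=None` behavior concrete, a minimal call that leaves the step count to the checkpoint's `model_index.json` (a sketch assembled from the usage guide below, not part of the diff).

```python
# Sketch: depth estimation with only the `image` argument; leaving
# num_inference_steps unset defers to the checkpoint's model_index.json.
import torch
import diffusers

pipe = diffusers.MarigoldDepthPipeline.from_pretrained(
    "prs-eth/marigold-depth-lcm-v1-0", variant="fp16", torch_dtype=torch.float16
).to("cuda")

image = diffusers.utils.load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")
depth = pipe(image)  # num_inference_steps=None -> checkpoint default

vis = pipe.image_processor.visualize_depth(depth.prediction)
vis[0].save("einstein_depth.png")
```
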
docs/source/en/api/pipelines/pixart.md (+4, -5)
@@ -37,7 +37,7 @@ Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers.m
 
 ## Inference with under 8GB GPU VRAM
 
-Run the [`PixArtAlphaPipeline`] with under 8GB GPU VRAM by loading the text encoder in 8-bit precision. Let's walk through a full-fledged example. 
+Run the [`PixArtAlphaPipeline`] with under 8GB GPU VRAM by loading the text encoder in 8-bit precision. Let's walk through a full-fledged example.
 
 First, install the [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) library:
 
@@ -75,10 +75,10 @@ with torch.no_grad():
     prompt_embeds, prompt_attention_mask, negative_embeds, negative_prompt_attention_mask = pipe.encode_prompt(prompt)
 ```
 
-Since text embeddings have been computed, remove the `text_encoder` and `pipe` from the memory, and free up som GPU VRAM:
+Since text embeddings have been computed, remove the `text_encoder` and `pipe` from the memory, and free up some GPU VRAM:
 
 ```python
-import gc 
+import gc
 
 def flush():
     gc.collect()
@@ -99,7 +99,7 @@ pipe = PixArtAlphaPipeline.from_pretrained(
 ).to("cuda")
 
 latents = pipe(
-    negative_prompt=None, 
+    negative_prompt=None,
     prompt_embeds=prompt_embeds,
     negative_prompt_embeds=negative_embeds,
     prompt_attention_mask=prompt_attention_mask,
@@ -146,4 +146,3 @@ While loading the `text_encoder`, you set `load_in_8bit` to `True`. You could al
 [[autodoc]] PixArtAlphaPipeline
 - all
 - __call__
-
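
The `flush` helper in the hunk above is only partially visible; a hedged completion of the memory-freeing step, following the pattern the surrounding text describes (delete the 8-bit `text_encoder` and the temporary `pipe`, then flush).

```python
# Hedged completion: only `import gc` through `gc.collect()` appears in the hunk.
import gc

import torch

def flush():
    gc.collect()
    torch.cuda.empty_cache()

# `text_encoder` and `pipe` come from the preceding snippet in the doc.
del text_encoder
del pipe
flush()
```
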
docs/source/en/api/pipelines/pixart_sigma.md (+4, -6)
@@ -39,7 +39,7 @@ Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers)
 
 ## Inference with under 8GB GPU VRAM
 
-Run the [`PixArtSigmaPipeline`] with under 8GB GPU VRAM by loading the text encoder in 8-bit precision. Let's walk through a full-fledged example. 
+Run the [`PixArtSigmaPipeline`] with under 8GB GPU VRAM by loading the text encoder in 8-bit precision. Let's walk through a full-fledged example.
 
 First, install the [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) library:
 
@@ -59,7 +59,6 @@ text_encoder = T5EncoderModel.from_pretrained(
     subfolder="text_encoder",
     load_in_8bit=True,
     device_map="auto",
-
 )
 pipe = PixArtSigmaPipeline.from_pretrained(
     "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
@@ -77,10 +76,10 @@ with torch.no_grad():
     prompt_embeds, prompt_attention_mask, negative_embeds, negative_prompt_attention_mask = pipe.encode_prompt(prompt)
 ```
 
-Since text embeddings have been computed, remove the `text_encoder` and `pipe` from the memory, and free up som GPU VRAM:
+Since text embeddings have been computed, remove the `text_encoder` and `pipe` from the memory, and free up some GPU VRAM:
 
 ```python
-import gc 
+import gc
 
 def flush():
     gc.collect()
@@ -101,7 +100,7 @@ pipe = PixArtSigmaPipeline.from_pretrained(
 ).to("cuda")
 
 latents = pipe(
-    negative_prompt=None, 
+    negative_prompt=None,
     prompt_embeds=prompt_embeds,
     negative_prompt_embeds=negative_embeds,
     prompt_attention_mask=prompt_attention_mask,
@@ -148,4 +147,3 @@ While loading the `text_encoder`, you set `load_in_8bit` to `True`. You could al
 [[autodoc]] PixArtSigmaPipeline
 - all
 - __call__
-
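
For context, the truncated `PixArtSigmaPipeline.from_pretrained(` call in the hunk above plausibly continues as in the PixArt docs, reusing the 8-bit text encoder and deferring the transformer load; that continuation is an assumption, not visible in the diff.

```python
# Hedged sketch of the 8-bit text-encoder setup around this hunk.
# Assumption: the `text_encoder=...`, `transformer=None` continuation.
import torch
from transformers import T5EncoderModel
from diffusers import PixArtSigmaPipeline

text_encoder = T5EncoderModel.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
    subfolder="text_encoder",
    load_in_8bit=True,
    device_map="auto",
)
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
    text_encoder=text_encoder,
    transformer=None,
)

with torch.no_grad():
    prompt = "cute cat"
    prompt_embeds, prompt_attention_mask, negative_embeds, negative_prompt_attention_mask = pipe.encode_prompt(prompt)
```
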
docs/source/en/api/pipelines/stable_diffusion/overview.md (+2, -2)
@@ -177,7 +177,7 @@ inpaint = StableDiffusionInpaintPipeline(**text2img.components)
 
 The Stable Diffusion pipelines are automatically supported in [Gradio](https://github.com/gradio-app/gradio/), a library that makes creating beautiful and user-friendly machine learning apps on the web a breeze. First, make sure you have Gradio installed:
 
-```
+```sh
 pip install -U gradio
 ```
 
@@ -209,4 +209,4 @@ gr.Interface.from_pipeline(pipe).launch()
 ```
 
 By default, the web demo runs on a local server. If you'd like to share it with others, you can generate a temporary public
-link by setting `share=True` in `launch()`. Or, you can host your demo on [Hugging Face Spaces](https://huggingface.co/spaces)https://huggingface.co/spaces for a permanent link. 
+link by setting `share=True` in `launch()`. Or, you can host your demo on [Hugging Face Spaces](https://huggingface.co/spaces)https://huggingface.co/spaces for a permanent link.
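
The `gr.Interface.from_pipeline(pipe).launch()` context in the hunk header comes from a complete demo; a minimal sketch of it, assuming a Stable Diffusion v1.5 checkpoint.

```python
# Sketch: wrapping a Stable Diffusion pipeline in a Gradio web demo.
import gradio as gr
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Passing share=True to launch() would create the temporary public link
# mentioned in the doc text above.
gr.Interface.from_pipeline(pipe).launch()
```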

docs/source/en/api/schedulers/edm_multistep_dpm_solver.md (+1, -1)
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
 
 # EDMDPMSolverMultistepScheduler
 
-`EDMDPMSolverMultistepScheduler` is a [Karras formulation](https://huggingface.co/papers/2206.00364) of `DPMSolverMultistep`, a multistep scheduler from [DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps](https://huggingface.co/papers/2206.00927) and [DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models](https://huggingface.co/papers/2211.01095) by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu.
+`EDMDPMSolverMultistepScheduler` is a [Karras formulation](https://huggingface.co/papers/2206.00364) of `DPMSolverMultistepScheduler`, a multistep scheduler from [DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps](https://huggingface.co/papers/2206.00927) and [DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models](https://huggingface.co/papers/2211.01095) by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu.
 
 DPMSolver (and the improved version DPMSolver++) is a fast dedicated high-order solver for diffusion ODEs with convergence order guarantee. Empirically, DPMSolver sampling with only 20 steps can generate high-quality
 samples, and it can generate quite good samples even in 10 steps.

docs/source/en/api/schedulers/multistep_dpm_solver.md (+1, -1)
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
 
 # DPMSolverMultistepScheduler
 
-`DPMSolverMultistep` is a multistep scheduler from [DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps](https://huggingface.co/papers/2206.00927) and [DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models](https://huggingface.co/papers/2211.01095) by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu.
+`DPMSolverMultistepScheduler` is a multistep scheduler from [DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps](https://huggingface.co/papers/2206.00927) and [DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models](https://huggingface.co/papers/2211.01095) by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu.
 
 DPMSolver (and the improved version DPMSolver++) is a fast dedicated high-order solver for diffusion ODEs with convergence order guarantee. Empirically, DPMSolver sampling with only 20 steps can generate high-quality
 samples, and it can generate quite good samples even in 10 steps.
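
Both scheduler pages describe sampling in roughly 10 to 20 steps; a standard swap-in sketch (the common diffusers pattern, not part of the diff).

```python
# Sketch: using DPMSolverMultistepScheduler for ~20-step sampling.
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Reuse the existing scheduler config so the noise schedule stays consistent.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

image = pipe("an astronaut riding a horse", num_inference_steps=20).images[0]
```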

docs/source/en/optimization/deepcache.md (+1, -1)
@@ -36,7 +36,7 @@ Then load and enable the [`DeepCacheSDHelper`](https://github.com/horseee/DeepCa
 image = pipe("a photo of an astronaut on a moon").images[0]
 ```
 
-The `set_params` method accepts two arguments: `cache_interval` and `cache_branch_id`. `cache_interval` means the frequency of feature caching, specified as the number of steps between each cache operation. `cache_branch_id` identifies which branch of the network (ordered from the shallowest to the deepest layer) is responsible for executing the caching processes. 
+The `set_params` method accepts two arguments: `cache_interval` and `cache_branch_id`. `cache_interval` means the frequency of feature caching, specified as the number of steps between each cache operation. `cache_branch_id` identifies which branch of the network (ordered from the shallowest to the deepest layer) is responsible for executing the caching processes.
 Opting for a lower `cache_branch_id` or a larger `cache_interval` can lead to faster inference speed at the expense of reduced image quality (ablation experiments of these two hyperparameters can be found in the [paper](https://arxiv.org/abs/2312.00858)). Once those arguments are set, use the `enable` or `disable` methods to activate or deactivate the `DeepCacheSDHelper`.
 
 <div class="flex justify-center">
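
A compact sketch of the `set_params`/`enable` flow described above, assuming the `DeepCache` package and a Stable Diffusion v1.5 checkpoint.

```python
# Sketch: enabling DeepCache feature caching on a Stable Diffusion pipeline.
import torch
from diffusers import StableDiffusionPipeline
from DeepCache import DeepCacheSDHelper

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

helper = DeepCacheSDHelper(pipe=pipe)
# Cache every 3rd step on the shallowest branch (id 0); a larger interval
# or lower branch id speeds inference at some cost to image quality.
helper.set_params(cache_interval=3, cache_branch_id=0)
helper.enable()

image = pipe("a photo of an astronaut on a moon").images[0]
helper.disable()
```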

docs/source/en/using-diffusers/callback.md (+1, -1)
@@ -188,7 +188,7 @@ def latents_to_rgb(latents):
 ```py
 def decode_tensors(pipe, step, timestep, callback_kwargs):
     latents = callback_kwargs["latents"]
-    
+
     image = latents_to_rgb(latents)
     image.save(f"{step}.png")
 
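
To show how `decode_tensors` gets wired in, a usage sketch with the pipeline-level callback API; `pipe` and `latents_to_rgb` are defined earlier on that page, and the function's `return callback_kwargs` line falls outside the hunk.

```python
# Sketch: run decode_tensors after every denoising step so intermediate
# latents are converted to RGB and saved as {step}.png.
image = pipe(
    prompt="a photo of an astronaut on a moon",
    callback_on_step_end=decode_tensors,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]
```
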
docs/source/en/using-diffusers/marigold_usage.md (+11, -11)
@@ -138,15 +138,15 @@ Because Marigold's latent space is compatible with the base Stable Diffusion, it
 ```diff
 import diffusers
 import torch
-
+
 pipe = diffusers.MarigoldDepthPipeline.from_pretrained(
     "prs-eth/marigold-depth-lcm-v1-0", variant="fp16", torch_dtype=torch.float16
 ).to("cuda")
-
+
 + pipe.vae = diffusers.AutoencoderTiny.from_pretrained(
 +     "madebyollin/taesd", torch_dtype=torch.float16
 + ).cuda()
-
+
 image = diffusers.utils.load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")
 depth = pipe(image)
 ```
@@ -156,13 +156,13 @@ As suggested in [Optimizations](../optimization/torch2.0#torch.compile), adding
 ```diff
 import diffusers
 import torch
-
+
 pipe = diffusers.MarigoldDepthPipeline.from_pretrained(
     "prs-eth/marigold-depth-lcm-v1-0", variant="fp16", torch_dtype=torch.float16
 ).to("cuda")
-
+
 + pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
-
+
 image = diffusers.utils.load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")
 depth = pipe(image)
 ```
@@ -208,7 +208,7 @@ model_paper_kwargs = {
     diffusers.schedulers.LCMScheduler: {
         "num_inference_steps": 4,
         "ensemble_size": 5,
-    }, 
+    },
 }
 
 image = diffusers.utils.load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")
@@ -261,7 +261,7 @@ model_paper_kwargs = {
     diffusers.schedulers.LCMScheduler: {
         "num_inference_steps": 4,
         "ensemble_size": 10,
-    }, 
+    },
 }
 
 image = diffusers.utils.load_image("https://marigoldmonodepth.github.io/images/einstein.jpg")
@@ -415,18 +415,18 @@ image = diffusers.utils.load_image(
 
 pipe = diffusers.MarigoldDepthPipeline.from_pretrained(
     "prs-eth/marigold-depth-lcm-v1-0", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+).to(device)
 
 depth_image = pipe(image, generator=generator).prediction
 depth_image = pipe.image_processor.visualize_depth(depth_image, color_map="binary")
 depth_image[0].save("motorcycle_controlnet_depth.png")
 
 controlnet = diffusers.ControlNetModel.from_pretrained(
     "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+).to(device)
 pipe = diffusers.StableDiffusionXLControlNetPipeline.from_pretrained(
     "SG161222/RealVisXL_V4.0", torch_dtype=torch.float16, variant="fp16", controlnet=controlnet
-).to("cuda")
+).to(device)
 pipe.scheduler = diffusers.DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)
 
 controlnet_out = pipe(

docs/source/en/using-diffusers/scheduler_features.md (+1, -1)
@@ -134,7 +134,7 @@ sigmas = [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113,
 prompt = "anthropomorphic capybara wearing a suit and working with a computer"
 generator = torch.Generator(device='cuda').manual_seed(123)
 image = pipeline(
-    prompt=prompt, 
+    prompt=prompt,
     num_inference_steps=10,
     sigmas=sigmas,
     generator=generator
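
For context, a hedged reconstruction of the custom-sigmas snippet this hunk trims; the sigma list is truncated in the hunk header and assumed to end at 0.0, and the SDXL checkpoint is taken from the surrounding page.

```python
# Hedged reconstruction: passing a custom sigma schedule to a pipeline.
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# 10-step schedule; the trailing 0.0 is assumed (truncated in the hunk header).
sigmas = [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113, 0.0]
prompt = "anthropomorphic capybara wearing a suit and working with a computer"
generator = torch.Generator(device="cuda").manual_seed(123)
image = pipeline(
    prompt=prompt,
    num_inference_steps=10,
    sigmas=sigmas,
    generator=generator,
).images[0]
```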
