Commit a34dcf6

Author: Andrei Kochin
Merge branch 'latest' into dark
2 parents f2c4571 + 3ca6562 commit a34dcf6

39 files changed: +10088 -7566 lines changed

.ci/ignore_treon_docker.txt

Lines changed: 5 additions & 3 deletions
@@ -26,8 +26,7 @@ notebooks/decidiffusion-image-generation/decidiffusion-image-generation.ipynb
 notebooks/pix2struct-docvqa/pix2struct-docvqa.ipynb
 notebooks/fast-segment-anything/fast-segment-anything.ipynb
 notebooks/latent-consistency-models-image-generation/latent-consistency-models-image-generation.ipynb
-notebooks/latent-consistency-models-image-generation/lcm-lora-controlnet.ipynb
-notebooks/latent-consistency-models-image-generation/latent-consistency-models-optimum-demo.ipynb
+notebooks/lcm-lora-controlnet/lcm-lora-controlnet.ipynb
 notebooks/qrcode-monster/qrcode-monster.ipynb
 notebooks/speculative-sampling/speculative-sampling.ipynb
 notebooks/distil-whisper-asr/distil-whisper-asr.ipynb
@@ -40,6 +39,7 @@ notebooks/stable-diffusion-ip-adapter/stable-diffusion-ip-adapter.ipynb
 notebooks/kosmos2-multimodal-large-language-model/kosmos2-multimodal-large-language-model.ipynb
 notebooks/photo-maker/photo-maker.ipynb
 notebooks/openvoice/openvoice.ipynb
+notebooks/openvoice2-and-melotts/openvoice2-and-melotts.ipynb
 notebooks/surya-line-level-text-detection/surya-line-level-text-detection.ipynb
 notebooks/instant-id/instant-id.ipynb
 notebooks/stable-diffusion-keras-cv/stable-diffusion-keras-cv.ipynb
@@ -87,4 +87,6 @@ notebooks/omniparser/omniparser.ipynb
 notebooks/olmocr-pdf-vlm/olmocr-pdf-vlm.ipynb
 notebooks/minicpm-o-omnimodal-chatbot/minicpm-o-omnimodal-chatbot.ipynb
 notebooks/kokoro/kokoro.ipynb
-notebooks/qwen2.5-omni-chatbot/qwen2.5-omni-chatbot.ipynb
+notebooks/qwen2.5-omni-chatbot/qwen2.5-omni-chatbot.ipynb
+notebooks/intern-video2-classiciation/intern-video2-classification.ipynb
+notebooks/flex.2-image-generation/flex.2-image-generation.ipynb

.ci/skipped_notebooks.yml

Lines changed: 17 additions & 9 deletions
@@ -166,13 +166,7 @@
         - macos-13
         - ubuntu-22.04
         - windows-2019
-- notebook: notebooks/latent-consistency-models-image-generation/lcm-lora-controlnet.ipynb
-  skips:
-    - os:
-        - macos-13
-        - ubuntu-22.04
-        - windows-2019
-- notebook: notebooks/latent-consistency-models-image-generation/latent-consistency-models-optimum-demo.ipynb
+- notebook: notebooks/lcm-lora-controlnet/lcm-lora-controlnet.ipynb
   skips:
     - os:
         - macos-13
@@ -536,9 +530,23 @@
         - macos-13
         - ubuntu-22.04
         - windows-2019
-- notebook: "notebooks/deepseek-vl2/deepseek-vl2.ipynb"
+- notebook: notebooks/deepseek-vl2/deepseek-vl2.ipynb
   skips:
     - os:
         - macos-13
         - ubuntu-22.04
-        - windows-2019
+        - windows-2019
+- notebook: notebooks/intern-video2-classiciation/intern-video2-classification.ipynb
+  skips:
+    - os:
+        - macos-13
+        - ubuntu-22.04
+        - windows-2019
+- notebook: notebooks/flex.2-image-generation/flex.2-image-generation.ipynb
+  skips:
+    - python:
+        - "3.9"
+- notebook: notebooks/openvoice2-and-melotts/openvoice2-and-melotts.ipynb
+  skips:
+    - os:
+        - macos-13
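
Each entry in this hunk pairs a notebook path with a list of `skips` conditions keyed by `os` or `python`. The following is a minimal sketch of how such rules could be evaluated. It assumes that a skip entry applies when every key it lists matches the current runner; the repository's actual CI logic is not part of this commit, and the helper name `is_skipped` is hypothetical. The rules are written as already-parsed Python dictionaries so the sketch needs no YAML library.

```python
# Hypothetical evaluation of skip rules shaped like .ci/skipped_notebooks.yml.
# Assumption: a skip entry matches when ALL of its keys ("os", "python") match.

RULES = [
    {
        "notebook": "notebooks/intern-video2-classiciation/intern-video2-classification.ipynb",
        "skips": [{"os": ["macos-13", "ubuntu-22.04", "windows-2019"]}],
    },
    {
        "notebook": "notebooks/flex.2-image-generation/flex.2-image-generation.ipynb",
        "skips": [{"python": ["3.9"]}],
    },
    {
        "notebook": "notebooks/openvoice2-and-melotts/openvoice2-and-melotts.ipynb",
        "skips": [{"os": ["macos-13"]}],
    },
]


def is_skipped(notebook, rules, os_name, python_version):
    """Return True if any skip entry for `notebook` matches the given runner."""
    for rule in rules:
        if rule["notebook"] != notebook:
            continue
        for skip in rule.get("skips", []):
            # A key that is absent from the entry places no constraint.
            os_matches = "os" not in skip or os_name in skip["os"]
            python_matches = "python" not in skip or python_version in skip["python"]
            if os_matches and python_matches:
                return True
    return False


FLEX = "notebooks/flex.2-image-generation/flex.2-image-generation.ipynb"
print(is_skipped(FLEX, RULES, "ubuntu-22.04", "3.9"))
print(is_skipped(FLEX, RULES, "ubuntu-22.04", "3.11"))
```

Under this reading, the Flex.2 notebook is skipped only on Python 3.9 runners regardless of operating system, while the OpenVoice2 notebook is skipped only on `macos-13`.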

.ci/spellcheck/.pyspelling.wordlist.txt

Lines changed: 4 additions & 0 deletions
@@ -85,6 +85,7 @@ BLACKBOX
 boolean
 CatVTON
 CentOS
+centric
 CFG
 charlist
 charlists
@@ -405,6 +406,7 @@ intel
 interactable
 InternLM
 internlm
+InternVideo
 Interpolative
 interpretable
 invertible
@@ -548,6 +550,7 @@ md
 MediaPipe
 medprob
 mel
+MeloTTS
 Mels
 MERCHANTABILITY
 MF
@@ -1082,6 +1085,7 @@ vec
 VegaRT
 verovio
 videpth
+ViFM
 VIO
 virtualenv
 VisCPM

.docker/Pipfile.lock

Lines changed: 5 additions & 5 deletions
Some generated files are not rendered by default.

notebooks/README.md

Lines changed: 8 additions & 2 deletions
@@ -49,6 +49,7 @@
 - [Text-to-image generation using PhotoMaker and OpenVINO](./photo-maker/photo-maker.ipynb)
 - [Multimodal assistant with Phi-4-multimodal and OpenVINO](./phi-4-multimodal/phi-4-multimodal.ipynb)
 - [Visual-language assistant with Phi3-Vision and OpenVINO](./phi-3-vision/phi-3-vision.ipynb)
+- [Voice tone cloning with OpenVoice2 and MeloTTS for Text-to-Speech by OpenVINO](./openvoice2-and-melotts/openvoice2-and-melotts.ipynb)
 - [Voice tone cloning with OpenVoice and OpenVINO](./openvoice/openvoice.ipynb)
 - [Running OpenCLIP models using OpenVINO™](./open-clip/open-clip.ipynb)
 - [Screen Parsing with OmniParser-v2.0 and OpenVINO](./omniparser/omniparser.ipynb)
@@ -78,7 +79,7 @@
 - [Visual-language assistant with LLaVA Next and OpenVINO](./llava-next-multimodal-chatbot/llava-next-multimodal-chatbot.ipynb)
 - [Visual-language assistant with LLaVA and Optimum Intel OpenVINO integration](./llava-multimodal-chatbot/llava-multimodal-chatbot-optimum.ipynb)
 - [Visual-language assistant with LLaVA and OpenVINO Generative API](./llava-multimodal-chatbot/llava-multimodal-chatbot-genai.ipynb)
-- [Text-to-Image Generation with LCM LoRA and ControlNet Conditioning](./latent-consistency-models-image-generation/lcm-lora-controlnet.ipynb)
+- [Text-to-Image Generation with LCM LoRA and ControlNet Conditioning](./lcm-lora-controlnet/lcm-lora-controlnet.ipynb)
 - [Image generation with Latent Consistency Model and OpenVINO](./latent-consistency-models-image-generation/latent-consistency-models-image-generation.ipynb)
 - [Kosmos-2: Multimodal Large Language Model and OpenVINO](./kosmos2-multimodal-large-language-model/kosmos2-multimodal-large-language-model.ipynb)
 - [Multimodal understanding and generation with Janus-Pro and OpenVINO](./janus-multimodal-generation/janus-multimodal-generation.ipynb)
@@ -147,6 +148,7 @@
 - [Line-level text detection with Surya](./surya-line-level-text-detection/surya-line-level-text-detection.ipynb)
 - [Convert a PyTorch Model to OpenVINO™ IR](./pytorch-to-openvino/pytorch-to-openvino.ipynb)
 - [Convert a PaddlePaddle Model to OpenVINO™ IR](./paddle-to-openvino/paddle-to-openvino-classification.ipynb)
+- [Voice tone cloning with OpenVoice2 and MeloTTS for Text-to-Speech by OpenVINO](./openvoice2-and-melotts/openvoice2-and-melotts.ipynb)
 - [Voice tone cloning with OpenVoice and OpenVINO](./openvoice/openvoice.ipynb)
 - [OpenVINO Tokenizers: Incorporate Text Processing Into OpenVINO Pipelines](./openvino-tokenizers/openvino-tokenizers.ipynb)
 - [Object detection and masking from prompts with GroundedSAM (GroundingDINO + SAM) and OpenVINO](./grounded-segment-anything/grounded-segment-anything.ipynb)
@@ -178,6 +180,7 @@
 - [Person Tracking with OpenVINO™](./person-tracking-webcam/person-tracking.ipynb)
 - [Person Counting System using YOLOV8 and OpenVINO™](./person-counting-webcam/person-counting.ipynb)
 - [PaddleOCR with OpenVINO™](./paddle-ocr-webcam/paddle-ocr-webcam.ipynb)
+- [Voice tone cloning with OpenVoice2 and MeloTTS for Text-to-Speech by OpenVINO](./openvoice2-and-melotts/openvoice2-and-melotts.ipynb)
 - [Voice tone cloning with OpenVoice and OpenVINO](./openvoice/openvoice.ipynb)
 - [Live Object Detection with OpenVINO™](./object-detection-webcam/object-detection.ipynb)
 - [CLIP model with Jina CLIP and OpenVINO](./jina-clip/jina-clip.ipynb)
@@ -250,6 +253,7 @@
 - [Text-to-speech (TTS) with Parler-TTS and OpenVINO](./parler-tts-text-to-speech/parler-tts-text-to-speech.ipynb)
 - [Text-to-Speech synthesis using OuteTTS and OpenVINO](./outetts-text-to-speech/outetts-text-to-speech.ipynb)
 - [Optical Character Recognition (OCR) with OpenVINO™](./optical-character-recognition/optical-character-recognition.ipynb)
+- [Voice tone cloning with OpenVoice2 and MeloTTS for Text-to-Speech by OpenVINO](./openvoice2-and-melotts/openvoice2-and-melotts.ipynb)
 - [Voice tone cloning with OpenVoice and OpenVINO](./openvoice/openvoice.ipynb)
 - [Running OpenCLIP models using OpenVINO™](./open-clip/open-clip.ipynb)
 - [Universal Segmentation with OneFormer and OpenVINO](./oneformer-segmentation/oneformer-segmentation.ipynb)
@@ -284,13 +288,14 @@
 - [Visual-language assistant with LLaVA and Optimum Intel OpenVINO integration](./llava-multimodal-chatbot/llava-multimodal-chatbot-optimum.ipynb)
 - [Visual-language assistant with LLaVA and OpenVINO Generative API](./llava-multimodal-chatbot/llava-multimodal-chatbot-genai.ipynb)
 - [Text-to-Speech synthesis using Llasa and OpenVINO](./llasa-speech-synthesis/llasa-speech-synthesis.ipynb)
-- [Text-to-Image Generation with LCM LoRA and ControlNet Conditioning](./latent-consistency-models-image-generation/lcm-lora-controlnet.ipynb)
+- [Text-to-Image Generation with LCM LoRA and ControlNet Conditioning](./lcm-lora-controlnet/lcm-lora-controlnet.ipynb)
 - [Image generation with Latent Consistency Model and OpenVINO](./latent-consistency-models-image-generation/latent-consistency-models-image-generation.ipynb)
 - [Kosmos-2: Multimodal Large Language Model and OpenVINO](./kosmos2-multimodal-large-language-model/kosmos2-multimodal-large-language-model.ipynb)
 - [Text-to-Speech synthesis using Kokoro and OpenVINO](./kokoro/kokoro.ipynb)
 - [OpenVINO optimizations for Knowledge graphs](./knowledge-graphs-conve/knowledge-graphs-conve.ipynb)
 - [Multimodal understanding and generation with Janus-Pro and OpenVINO](./janus-multimodal-generation/janus-multimodal-generation.ipynb)
 - [Visual-language assistant with InternVL2 and OpenVINO](./internvl2/internvl2.ipynb)
+- [Video Classification with InternVideo2 and OpenVINO](./intern-video2-classiciation/intern-video2-classification.ipynb)
 - [Image Editing with InstructPix2Pix and OpenVINO](./instruct-pix2pix-image-editing/instruct-pix2pix-image-editing.ipynb)
 - [InstantID: Zero-shot Identity-Preserving Generation using OpenVINO](./instant-id/instant-id.ipynb)
 - [Inpainting with OpenVINO GenAI](./inpainting-genai/inpainting-genai.ipynb)
@@ -343,6 +348,7 @@
 - [Quantization Aware Training with NNCF, using PyTorch framework](./pytorch-quantization-aware-training/pytorch-quantization-aware-training.ipynb)
 - [Post-Training Quantization of PyTorch models with NNCF](./pytorch-post-training-quantization-nncf/pytorch-post-training-quantization-nncf.ipynb)
 - [Optimize Preprocessing](./optimize-preprocessing/optimize-preprocessing.ipynb)
+- [Voice tone cloning with OpenVoice2 and MeloTTS for Text-to-Speech by OpenVINO](./openvoice2-and-melotts/openvoice2-and-melotts.ipynb)
 - [Voice tone cloning with OpenVoice and OpenVINO](./openvoice/openvoice.ipynb)
 - [OpenVINO Tokenizers: Incorporate Text Processing Into OpenVINO Pipelines](./openvino-tokenizers/openvino-tokenizers.ipynb)
 - [Quantize NLP models with Post-Training Quantization in NNCF](./language-quantize-bert/language-quantize-bert.ipynb)

notebooks/flex.2-image-generation/README.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+# Image generation with universal control using Flex.2 and OpenVINO
+
+<div class="alert alert-block alert-danger"> <b>Important note:</b> This notebook requires Python >= 3.11. Please make sure that your environment meets this requirement before running it. </div>
+
+Flex.2 is a flexible text-to-image diffusion model based on the Flux model architecture, with built-in support for inpainting and universal control: the model accepts pose, line, and depth inputs.
+
+<img src="https://github.com/user-attachments/assets/6a9ab66a-387a-4538-8625-2bb3a16072b5" width="1024">
+
+More details about the model can be found in the [model card](https://huggingface.co/ostris/Flex.2-preview).
+
+In this tutorial we consider how to convert and optimize the Flex.2 model using OpenVINO.
+
+>**Note**: Some demonstrated models can require at least 32GB RAM for conversion and running.
+
+### Notebook Contents
+
+In this demonstration, you will learn how to perform text-to-image generation using Flex.2 and OpenVINO.
+
+Example of the model's output:
+
+![](https://github.com/user-attachments/assets/140685b7-2c5d-4cef-86fb-33df0849ec1a)
+
+The tutorial consists of the following steps:
+
+- Install prerequisites
+- Collect PyTorch model pipeline
+- Convert model to OpenVINO intermediate representation (IR) format
+- Compress weights using NNCF
+- Prepare OpenVINO Inference pipeline
+- Run Image generation
+- Launch interactive demo
+
+## Installation Instructions
+
+This is a self-contained example that relies solely on its own code.<br>
+We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
+For further details, please refer to the [Installation Guide](../../README.md).
+
+<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/flex.2-image-generation/README.md" />
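
The README added in this hunk states that the notebook requires Python >= 3.11. The following is a minimal sketch of a guard a reader might run before starting the notebook, using only the standard library. The names `MIN_PYTHON` and `python_is_supported` are hypothetical and do not come from the notebook itself.

```python
import sys

# Minimum version taken from the README note; the constant name is illustrative.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) part of `version_info` is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if not python_is_supported():
    current = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {current} is older than the required {MIN_PYTHON[0]}.{MIN_PYTHON[1]}")
```

Comparing tuples rather than version strings avoids the pitfall where `"3.9" > "3.11"` holds lexicographically.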
