7 | 7 | "source": [
8 | 8 | "# Generating images with AI\n",
9 | 9 | "\n",
10 |    | - "This notebook demonstrates how to use OpenAI DALL-E 2 to generate images, in combination with other LLM features like text and embedding generation.\n",
   | 10 | + "This notebook demonstrates how to use OpenAI DALL-E 3 to generate images, in combination with other LLM features like text and embedding generation.\n",
11 | 11 | "\n",
12 |    | - "Here, we use Chat Completion to generate a random image description and DALL-E 2 to create an image from that description, showing the image inline.\n",
   | 12 | + "Here, we use Chat Completion to generate a random image description and DALL-E 3 to create an image from that description, showing the image inline.\n",
13 | 13 | "\n",
14 | 14 | "Lastly, the notebook asks the user to describe the image. The embedding of the user's description is compared to the original description, using Cosine Similarity, and returning a score from 0 to 1, where 1 means exact match."
15 | 15 | ]
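
Note on the cosine-similarity step described in the cell above: the comparison code itself is not part of this diff, so the following is only a minimal sketch of what such a comparison could look like. The helper name `CosineSimilarity` and the toy vectors are illustrative assumptions; real embeddings from text-embedding-ada-002 are much longer. The `System.Numerics.Tensors` 8.0.0 package added in this change also provides `TensorPrimitives.CosineSimilarity`, which may be what the notebook relies on, though that is an assumption as well.

```csharp
using System;

// Hypothetical helper (not from the notebook): cosine similarity of two
// equal-length vectors, computed as dot(a, b) / (|a| * |b|).
float CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
    {
        throw new ArgumentException("Vectors must have the same length.");
    }

    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    // A zero-length vector makes the denominator 0 and the result NaN.
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// Toy vectors standing in for embeddings (assumed values, for illustration only).
float[] original = { 1f, 0f, 0f };
float[] sameDirection = { 1f, 0f, 0f };
float[] orthogonal = { 0f, 1f, 0f };
float[] partial = { 1f, 1f, 0f };

Console.WriteLine(CosineSimilarity(original, sameDirection)); // 1
Console.WriteLine(CosineSimilarity(original, orthogonal));    // 0
Console.WriteLine(CosineSimilarity(original, partial));       // roughly 0.707
```

In general cosine similarity ranges from -1 to 1; the 0-to-1 range stated in the notebook text holds only when the compared vectors never point in opposing directions.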

33 | 33 | "source": [
34 | 34 | "// Usual setup: importing Semantic Kernel SDK and SkiaSharp, used to display images inline.\n",
35 | 35 | "\n",
36 |    | - "#r \"nuget: Microsoft.SemanticKernel, 1.0.0-rc4\"\n",
   | 36 | + "#r \"nuget: Microsoft.SemanticKernel, 1.0.1\"\n",
   | 37 | + "#r \"nuget: System.Numerics.Tensors, 8.0.0\"\n",
37 | 38 | "#r \"nuget: SkiaSharp, 2.88.3\"\n",
38 | 39 | "\n",
39 | 40 | "#!import config/Settings.cs\n",

56 | 57 | "\n",
57 | 58 | "The notebook uses:\n",
58 | 59 | "\n",
59 |    | - "* **OpenAI Dall-E 2** to transform the image description into an image\n",
   | 60 | + "* **OpenAI Dall-E 3** to transform the image description into an image\n",
60 | 61 | "* **text-embedding-ada-002** to compare your guess against the real image description\n",
61 | 62 | "\n",
62 | 63 | "**Note:**: For Azure OpenAI, your endpoint should have DALL-E API enabled."

85 | 86 | "// Load OpenAI credentials from config/settings.json\n",
86 | 87 | "var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();\n",
87 | 88 | "\n",
88 |    | - "// Configure the three AI features: text embedding (using Ada), text completion (using DaVinci 3), image generation (DALL-E 2)\n",
   | 89 | + "// Configure the three AI features: text embedding (using Ada), chat completion, image generation (DALL-E 3)\n",
89 | 90 | "var builder = Kernel.CreateBuilder();\n",
90 | 91 | "\n",
91 | 92 | "if(useAzureOpenAI)\n",
92 | 93 | "{\n",
93 | 94 | " builder.AddAzureOpenAITextEmbeddingGeneration(\"text-embedding-ada-002\", azureEndpoint, apiKey);\n",
94 | 95 | " builder.AddAzureOpenAIChatCompletion(model, azureEndpoint, apiKey);\n",
95 |    | - " builder.AddAzureOpenAITextToImage(azureEndpoint, apiKey);\n",
   | 96 | + " builder.AddAzureOpenAITextToImage(\"dall-e-3\", azureEndpoint, apiKey);\n",
96 | 97 | "}\n",
97 | 98 | "else\n",
98 | 99 | "{\n",

115 | 116 | "cell_type": "markdown",
116 | 117 | "metadata": {},
117 | 118 | "source": [
118 |     | - "# Generate a (random) image with DALL-E 2\n",
    | 119 | + "# Generate a (random) image with DALL-E 3\n",
119 | 120 | "\n",
120 | 121 | "**genImgDescription** is a Semantic Function used to generate a random image description. \n",
121 | 122 | "The function takes in input a random number to increase the diversity of its output.\n",
122 | 123 | "\n",
123 |     | - "The random image description is then given to **Dall-E 2** asking to create an image."
    | 124 | + "The random image description is then given to **Dall-E 3** asking to create an image."
124 | 125 | ]
125 | 126 | },
126 | 127 | {

159 | 160 | "var imageDescriptionResult = await kernel.InvokeAsync(genImgDescription, new() { [\"input\"] = random });\n",
160 | 161 | "var imageDescription = imageDescriptionResult.ToString();\n",
161 | 162 | "\n",
162 |     | - "// Use DALL-E 2 to generate an image. OpenAI in this case returns a URL (though you can ask to return a base64 image)\n",
163 |     | - "var imageUrl = await dallE.GenerateImageAsync(imageDescription.Trim(), 512, 512);\n",
    | 163 | + "// Use DALL-E 3 to generate an image. OpenAI in this case returns a URL (though you can ask to return a base64 image)\n",
    | 164 | + "var imageUrl = await dallE.GenerateImageAsync(imageDescription.Trim(), 1024, 1024);\n",
164 | 165 | "\n",
165 |     | - "await SkiaUtils.ShowImage(imageUrl, 512, 512);"
    | 166 | + "await SkiaUtils.ShowImage(imageUrl, 1024, 1024);"
166 | 167 | ]
167 | 168 | },
168 | 169 | {