This repository was archived by the owner on May 28, 2025. It is now read-only.

README.md: 15 additions & 6 deletions
@@ -4,7 +4,7 @@ This sample project demonstrates how to build a scalable, document data extraction...

 This approach takes advantage of the following techniques for document data extraction:

-- [Using Azure OpenAI GPT-4 Omni vision capabilities to extract data from PDF files by converting them to images](https://github.com/Azure-Samples/azure-openai-gpt-4-vision-pdf-extraction-sample)
+- [Using Azure OpenAI GPT-4o to extract structured JSON data from PDF documents by converting them to images](https://github.com/Azure-Samples/azure-openai-gpt-4-vision-pdf-extraction-sample)

 ## Pre-requisites - Understanding
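The technique linked in the bullet above works by rendering each PDF page to an image and prompting a GPT-4o deployment to return structured JSON. The sketch below is a rough illustration of that flow, not code from this repository; it assumes Python with the PyMuPDF and openai packages, an existing Azure OpenAI GPT-4o deployment, and illustrative environment-variable and file names.

```python
# Rough sketch only, not code from this repository.
# Assumptions: PyMuPDF (pip install pymupdf) and openai (pip install openai) are
# installed, a GPT-4o deployment exists, and the env var / file names are illustrative.
import base64
import os

import fitz  # PyMuPDF
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # illustrative env vars
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# Render the first page of the PDF to a PNG image and base64-encode it.
document = fitz.open("invoice.pdf")  # hypothetical input document
pixmap = document[0].get_pixmap(dpi=150)
image_b64 = base64.b64encode(pixmap.tobytes("png")).decode()

# Ask the GPT-4o deployment to return the extracted fields as JSON.
response = client.chat.completions.create(
    model="gpt-4o",  # name of your Azure OpenAI deployment
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Extract the invoice number, date and total from this "
                            "document. Respond with a JSON object only.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)  # expected to be a JSON string
```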
@@ -18,7 +18,7 @@ Before continuing with this sample, please ensure that you have a level of understanding...
@@ -50,8 +50,8 @@ Below is an illustration of how the pipeline may integrate into an intelligent...

 - [**Azure Container Apps**](https://learn.microsoft.com/en-us/azure/azure-functions/functions-deploy-container-apps?tabs=acr%2Cbash&pivots=programming-language-csharp), used to host the containerized functions used in the document processing pipeline.
 - **Note**: By containerizing the functions app, you can integrate this specific orchestration pipeline into an existing microservices architecture or deploy it as a standalone service.
-- [**Azure OpenAI Service**](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview), a managed service for OpenAI GPT models, deploying the latest GPT-4 Omni model to support Vision extraction techniques.
-- **Note**: The GPT-4 Omni model is not available in all Azure OpenAI regions. For more information, see the [Azure OpenAI Service documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#standard-deployment-model-availability).
+- [**Azure AI Services**](https://learn.microsoft.com/en-us/azure/ai-services/what-are-ai-services), a managed service for all Azure AI Services, including Azure OpenAI, deploying the latest GPT-4o model to support vision-based extraction techniques.
+- **Note**: The GPT-4o model is not available in all Azure OpenAI regions. For more information, see the [Azure OpenAI Service documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#standard-deployment-model-availability).
 - [**Azure Storage Account**](https://learn.microsoft.com/en-us/azure/storage/common/storage-introduction), used to store the batch of documents to be processed and the extracted data from the documents. The storage account is also used to store the queue messages for the document processing pipeline.
 - [**Azure Monitor**](https://learn.microsoft.com/en-us/azure/azure-monitor/overview), used to store logs and traces from the document processing pipeline for monitoring and troubleshooting purposes.
 - [**Azure Container Registry**](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-intro), used to store the container images for the document processing pipeline service that will be consumed by Azure Container Apps.
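As the storage account bullet in the hunk above notes, the pipeline is driven by queue messages held in that account. The sketch below is only an illustration of enqueuing such a work item with the azure-storage-queue package; the queue name and message schema are hypothetical, since the sample's real schema is not shown in this diff.

```python
# Rough sketch only; the sample's actual queue name and message schema are not shown
# in this diff, so both are hypothetical here.
import json
import os

from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string(
    conn_str=os.environ["AZURE_STORAGE_CONNECTION_STRING"],  # illustrative env var
    queue_name="documents-to-process",                       # hypothetical queue name
)

# A work item pointing the pipeline at one document in the batch container.
message = {
    "container": "incoming-documents",          # hypothetical blob container
    "blob_name": "contracts/contract-001.pdf",  # hypothetical blob path
}

# Note: Azure Functions queue triggers expect Base64-encoded messages by default,
# so a real producer may need to encode the payload accordingly.
queue.send_message(json.dumps(message))
```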
@@ -108,15 +108,24 @@ To setup an environment locally, simply run the [Setup-Environment.ps1](./Setup-...
 > [!NOTE]
 > The `-IsLocal` parameter is used to determine whether the complete containerized deployment is made in Azure, or whether to deploy the necessary components to Azure that will support a local development environment. The `-SkipInfrastructure` parameter is used to skip the deployment of the core infrastructure components if they are already deployed.

-When configured for local development, you will need to grant the following role-based access to your identity scoped to the specific Azure resources:
+When configured for local development, you will be granted the following role-based access to your identity scoped to the specific Azure resources:

 - **Azure Container Registry**:
 - **Role**: AcrPull
+- **Role**: AcrPush
 - **Azure Storage Account**:
+- **Role**: Storage Account Contributor
 - **Role**: Storage Blob Data Contributor
+- **Role**: Storage File Data Privileged Contributor
+- **Role**: Storage Table Data Contributor
 - **Role**: Storage Queue Data Contributor
+- **Azure Key Vault**:
+- **Role**: Key Vault Administrator
 - **Azure OpenAI Service**:
-- **Role**: Cognitive Services OpenAI User
+- **Role**: Cognitive Services Contributor
+- **Role**: Cognitive Services OpenAI Contributor
+- **Azure AI Hub/Project**:
+- **Role**: Azure ML Data Scientist

 With the local development environment setup, you can open the solution in Visual Studio Code using the Dev Container. The Dev Container contains all the necessary tools and dependencies to run the sample project with F5 debugging support.
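The role assignments added in the hunk above exist so that code running locally can reach the Azure resources with your own signed-in identity rather than with keys. The sketch below illustrates that pattern under stated assumptions; it is not the sample's configuration code, and the account URL, queue name, and environment variable are placeholders.

```python
# Rough sketch only, not the sample's configuration code. Assumes the azure-identity,
# azure-storage-queue and openai packages; the account URL, queue name and env var
# are placeholders.
import os

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.storage.queue import QueueClient
from openai import AzureOpenAI

# Resolves to your az login / Visual Studio Code / environment credentials locally,
# and to a managed identity when running in Azure.
credential = DefaultAzureCredential()

# Storage Queue Data Contributor lets this identity read and write pipeline messages.
queue = QueueClient(
    account_url="https://<storage-account>.queue.core.windows.net",  # placeholder
    queue_name="documents-to-process",                               # hypothetical name
    credential=credential,
)

# The Cognitive Services OpenAI roles let the same identity call the GPT-4o
# deployment with an Entra ID token instead of an API key.
token_provider = get_bearer_token_provider(
    credential, "https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # illustrative env var
    azure_ad_token_provider=token_provider,
    api_version="2024-06-01",
)
```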