diff --git a/README.md b/README.md index aac7526..ab8ad0a 100644 --- a/README.md +++ b/README.md @@ -1,21 +1,11 @@ # Azure Cosmos DB + Azure OpenAI Python Developer Guide -1. [Introduction](/00_Introduction/README.md) -1. [Azure Overview](/01_Azure_Overview/README.md) -1. [Overview of Azure Cosmos DB](/02_Overview_Cosmos_DB/README.md) -1. [Overview of Azure OpenAI](/03_Overview_Azure_OpenAI/README.md) -1. [Overview of AI Concepts](/04_Overview_AI_Concepts/README.md) -1. [Explore the Azure OpenAI models and endpoints (console app)](/05_Explore_OpenAI_models/README.md) -1. [Provision Azure resources](/06_Provision_Azure_Resources/README.md) -1. [Create your first Cosmos DB project](/07_Create_First_Cosmos_DB_Project/README.md) -1. [Load data into Azure Cosmos DB API for MongoDB](/08_Load_Data/README.md) -1. [Use vector search on embeddings in vCore-based Azure Cosmos DB for MongoDB](/09_Vector_Search_Cosmos_DB/README.md) -1. [LangChain](/10_LangChain/README.md) -1. [Backend API](/11_Backend_API/README.md) -1. [Connect the chat user interface with the chatbot API](/12_User_Interface/README.md) -1. [Conclusion](/13_Conclusion/README.md) - -![Azure Cosmos DB + Azure OpenAI Python Developer Guide Architecture Diagram](/06_Provision_Azure_Resources/media/architecture.jpg) +Choose one of the following developer guides: + +- [Azure Cosmos DB for MongoDB (vCore) + Azure OpenAI Python Developer Guide +](/vcore/README.md) +- [Azure Cosmos DB for NoSQL + DiskANN + Azure OpenAI Python Developer Guide +](/diskann/README.md) ## Contributing diff --git a/diskann/00_Introduction/README.md b/diskann/00_Introduction/README.md new file mode 100644 index 0000000..fbdba1b --- /dev/null +++ b/diskann/00_Introduction/README.md @@ -0,0 +1,21 @@ +# Welcome to the Azure Cosmos DB for NoSQL + DiskANN + Azure OpenAI Python Developer Guide + +## Pre-requisites + +- [Azure account and subscription](https://azure.microsoft.com/free/) with Owner permissions +- [Python 3.11 or higher](https://www.python.org/downloads/) +- [Visual Studio Code](https://code.visualstudio.com/download) +- [Python extension for VS Code](https://marketplace.visualstudio.com/items?itemName=ms-python.python) +- [Jupyter Notebook extension for VS Code](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter) +- [Docker Desktop](https://www.docker.com/products/docker-desktop/) with [WSL 2 backend (if on Windows)](https://learn.docker.com/desktop/wsl/) +- [Azure CLI](https://learn.microsoft.com/cli/azure/install-azure-cli) +- [Bicep CLI](https://learn.microsoft.com/azure/azure-resource-manager/bicep/install#install-manually) +- [PowerShell](https://learn.microsoft.com/powershell/scripting/install/installing-powershell?view=powershell-7.3) + +## Why use this guide? + +The future of software involves combining AI and data services, also known as intelligent applications. This guide is for developers looking to implement intelligent applications quickly while leveraging existing skills. The content focuses on the developer journey of implementing an Azure-based, GPT-powered chat application that is augmented with data stored in Azure Cosmos DB for NoSQL and that leverages Azure OpenAI services. + +## Introduction + +This guide walks through creating intelligent solutions that combine the vector search capabilities of Azure Cosmos DB for NoSQL, powered by DiskANN, and document retrieval with Azure OpenAI services to build a chat bot experience.
The guide includes labs that build and deploy a sample chat app using these technologies, with a focus on Azure Cosmos DB for NoSQL, vector search powered by DiskANN, and Azure OpenAI using the Python programming language. For those new to using Azure OpenAI and Vector Search technologies, the guide includes explanations of the core concepts and techniques used when implementing these technologies. diff --git a/diskann/01_Azure_Overview/README.md b/diskann/01_Azure_Overview/README.md new file mode 100644 index 0000000..2eee723 --- /dev/null +++ b/diskann/01_Azure_Overview/README.md @@ -0,0 +1,150 @@ +# Azure Overview + +Millions of customers worldwide trust the Azure platform, and there are over 90,000 Cloud Solution Providers (CSPs) partnered with Microsoft to add extra benefits and services to the Azure platform. By leveraging Azure, organizations can easily modernize their applications, expedite application development, and adapt application requirements to meet the demands of their users. This section provides an overview of Azure, its services, and recommendations on how to get started. + +## Advantages of choosing Azure + +By offering solutions on Azure, ISVs can access one of the largest B2B markets in the world. Through the [Azure Partner Builder's Program](https://partner.microsoft.com/marketing/azure-isv-technology-partners), Microsoft assists ISVs with the tools and platform to offer their solutions for customers to evaluate, purchase, and deploy with just a few clicks of the mouse. + +One of the advantages of choosing Microsoft Azure is access to the Azure AI Services and [Azure OpenAI Service](https://azure.microsoft.com/products/ai-services/openai-service), which empowers developers to integrate advanced artificial intelligence and natural language processing capabilities into their solutions. Developers are able to build, deploy, and manage applications with the language or platform of their choice. With Azure OpenAI Service and other Azure AI services, AI is now available to developers of all skill levels to build scale without constraint. + +Microsoft's development suite includes such tools as the various [Visual Studio](https://visualstudio.microsoft.com/) products, [Azure DevOps](https://dev.azure.com/), [GitHub](https://github.com/), and low-code [Power Apps](https://powerapps.microsoft.com/). All of these contribute to Azure's success and growth through their tight integrations with the Azure platform. Organizations that adopt modern tools are 65% more innovative, according to a [2020 McKinsey & Company report.](https://azure.microsoft.com/mediahandler/files/resourcefiles/developer-velocity-how-software-excellence-fuels-business-performance/Developer-Velocity-How-software-excellence-fuels-business-performance-v4.pdf) + +![This image demonstrates common development tools on the Microsoft cloud platform to expedite application development.](media/ISV-Tech-Builders-tools-white.png "Microsoft cloud tooling") + +To facilitate developers' adoption of Azure, Microsoft offers a [free subscription](https://azure.microsoft.com/free/search/) with $200 credit, applicable for thirty days; year-long access to free quotas for popular services and access to always free Azure service tiers. + +## Introduction to Azure resource management + +The [Azure Fundamentals Microsoft Learn Module](https://learn.microsoft.com/learn/modules/intro-to-azure-fundamentals/) demonstrates the different classifications of Azure Services. 
Moreover, Azure supports a variety of common tools, such as Visual Studio, PowerShell, and the Azure CLI, to manage Azure environments. + +### The Azure resource management hierarchy + +Azure provides a flexible resource hierarchy to simplify cost management and security. This hierarchy consists of four levels: + +- **[Management groups](https://learn.microsoft.com/azure/governance/management-groups/overview)**: Management groups consolidate multiple Azure subscriptions for compliance and security purposes. +- **Subscriptions**: Subscriptions govern cost control and access management. Azure users cannot provision Azure resources without a subscription. +- **[Resource groups](https://learn.microsoft.com/azure/azure-resource-manager/management/manage-resource-groups-portal)**: Resource groups consolidate the individual Azure resources for a given deployment. All provisioned Azure resources belong to one resource group. This guide requires provisioning a *resource group* in a *subscription* to hold the required resources. + - Resource groups are placed in a geographic location that determines where metadata about that resource group is stored. +- **Resources**: An Azure resource is an instance of a service. An Azure resource belongs to one resource group located in one subscription. + - Most Azure resources are provisioned in a particular region. + + ![This image shows Azure resource scopes.](media/scope-levels.png "Azure resource scopes") + +### Create landing zone + +An [Azure landing zone](https://learn.microsoft.com/azure/cloud-adoption-framework/ready/landing-zone/) is the target environment defined as the final resting place of a cloud migration project. In most projects, the landing zone should be scripted via ARM templates for its initial setup. Finally, it should be customized with PowerShell or the Azure Portal to fit the workload's needs. First-time Azure users will find creating and deploying to DEV and TEST environments easy. + +To help organizations quickly move to Azure, Microsoft provides the Azure landing zone accelerator, which generates a landing zone ARM template according to an organization's core needs, governance requirements, and automation setup. The landing zone accelerator is available in the Azure portal. + +![This image demonstrates the Azure landing zone accelerator in the Azure portal, and how organizations can optimize Azure for their needs and innovate.](media/landing-zone-accelerator.png "Azure landing zone accelerator screenshot") + +### Automating and managing Azure services + +When it comes to managing Azure resources, there are many potential options. [Azure Resource Manager](https://learn.microsoft.com/azure/azure-resource-manager/management/overview) is the deployment and management service for Azure. It provides a management layer that enables users to create, update, and delete resources in Azure subscriptions. Use management features, like access control, locks, and tags, to secure and organize resources after deployment. + +All Azure management tools, including the [Azure CLI](https://learn.microsoft.com/cli/azure/what-is-azure-cli), [Azure PowerShell](https://learn.microsoft.com/powershell/azure/what-is-azure-powershell?view=azps-7.1.0) module, [Azure REST API](https://learn.microsoft.com/rest/api/azure/), and browser-based Portal, interact with the Azure Resource Manager layer and [Identity and access management (IAM)](https://learn.microsoft.com/azure/role-based-access-control/overview) security controls.
+ + ![This image demonstrates how the Azure Resource Manager provides a robust, secure interface to Azure resources.](media/consistent-management-layer.png "Azure Resource Manager explained") + +Access control to all Azure services is offered via the [Azure role-based access control (Azure RBAC)](https://learn.microsoft.com/azure/role-based-access-control/overview) natively built into the management platform. Azure RBAC is a system that provides fine-grained access management of Azure resources. Using Azure RBAC, it is possible to segregate duties within teams and grant only the amount of access to users that they need to perform their jobs. + +### Azure management tools + +The flexibility and variety of Azure's management tools make it intuitive for any user, irrespective of their skill level with specific technologies. As an individual's skill level and administration needs mature, Azure has the right tools to match those needs. + +![Azure service management tool maturity progression.](media/azure-management-tool-maturity.png "Azure service management tool") + +#### Azure portal + +As a new Azure user, the first resource a person will be exposed to is the Azure Portal. The **Azure Portal** gives developers and architects a view of the state of their Azure resources. It supports extensive user configuration and simplifies reporting. The **[Azure mobile app](https://azure.microsoft.com/get-started/azure-portal/mobile-app/)** provides similar features for users that are away from their main desktop or laptop. + + ![The picture shows the initial Azure service list.](media/azure-portal-services.png "Azure portal Services") + +Azure runs on a common framework of backend resource services, and every action taken in the Azure portal translates into a call to a set of backend APIs developed by the respective engineering team to read, create, modify, or delete resources. + +##### Azure Marketplace + +[Azure Marketplace](https://learn.microsoft.com/marketplace/azure-marketplace-overview) is an online store that contains thousands of IT software applications and services built by industry-leading technology companies. In Azure Marketplace, it is possible to find, try, buy, and deploy the software and services needed to build new solutions and manage the cloud infrastructure. The catalog includes solutions for different industries and technical areas, free trials, and consulting services from Microsoft partners. + +![The picture shows an example of Azure Marketplace search results.](media/azure-marketplace-search-results.png "Azure Marketplace Results") + +##### Evolving + +Moving workloads to Azure alleviates some administrative burdens, but not all. Even though there is no need to worry about the data center, there is still a responsibility for service configuration and user access. Applications will need resource authorization. + +Using the existing command-line tools and REST APIs, it is possible to build custom tools to automate and report resource configurations that do not meet organizational requirements. + +#### Azure PowerShell and CLI + +**Azure PowerShell** and the **Azure CLI** (for Bash shell users) are useful for automating tasks that cannot be performed in the Azure portal. Both tools follow an *imperative* approach, meaning that users must explicitly script the creation of resources in the correct order. 
+ + ![Shows an example of the Azure CLI.](media/azure-cli-example.png "Azure CLI Example") + +There are subtle differences between how each of these tools operates and the actions that can be accomplished. Use the [Azure command-line tool guide](https://learn.microsoft.com/azure/developer/azure-cli/choose-the-right-azure-command-line-tool) to determine the right tool to meet the target goal. + +#### Azure CLI + +It is possible to run the Azure CLI and Azure PowerShell from the [Azure Cloud Shell](https://shell.azure.com), but it does have some limitations. It is also possible to run these tools locally. + +To use the Azure CLI, [download the CLI tools from Microsoft.](https://learn.microsoft.com/cli/azure/install-azure-cli) + +To use the Azure PowerShell cmdlets, install the `Az` module from the PowerShell Gallery, as described in the [installation document.](https://learn.microsoft.com/powershell/azure/install-az-ps?view=azps-6.6.0) + +##### Azure Cloud Shell + +The Azure Cloud Shell provides Bash and PowerShell environments for managing Azure resources imperatively. It also includes standard development tools, like Visual Studio Code, and files are persisted in an Azure Files share. + +Launch the Cloud Shell in a browser at [https://shell.azure.com](https://shell.azure.com). + +#### PowerShell Module + +The Azure portal and Windows PowerShell can be used for managing Azure Cosmos DB for NoSQL. To get started with Azure PowerShell, install the [Azure PowerShell cmdlets](https://learn.microsoft.com/powershell/module/az.cosmosdb/) for Azure Cosmos DB with the following PowerShell command in an administrator-level PowerShell window: + +```PowerShell +Install-Module -Name Az.CosmosDB +``` + +#### Infrastructure as Code + +[Infrastructure as Code (IaC)](https://learn.microsoft.com/devops/deliver/what-is-infrastructure-as-code) provides a way to describe or declare what infrastructure looks like using descriptive code. The infrastructure code is the desired state. The environment will be built when the code runs and completes. One of the main benefits of IaC is that it is human readable. Once the environment code is proven and tested, it can be versioned and saved into source code control. Developers can review the environment changes over time. + +There are a few options of IaC tooling to choose from when provisioning and managing Azure resources. These include Azure-native tools from Microsoft, like ARM templates and Azure Bicep, as well as third-party tools popular within the industry like HashiCorp Terraform. + +##### ARM templates + +[ARM templates](https://learn.microsoft.com/azure/azure-resource-manager/templates/) can deploy Azure resources in a *declarative* manner. Azure Resource Manager can potentially create the resources in an ARM template in parallel. ARM templates can be used to create multiple identical environments, such as development, staging, and production environments. + + ![The picture shows an example of an ARM template JSON export.](media/azure-template-json-example.png "Azure Template JSON") + +##### Bicep + +Reading, updating, and managing the ARM template JSON code can be difficult for a reasonably sized environment. What if there was a tool that translates simple declarative statements into ARM templates? Better yet, what if there was a tool that took existing ARM templates and translated them into a simple configuration? 
[Bicep](https://learn.microsoft.com/azure/azure-resource-manager/bicep/overview) is a domain-specific language (DSL) that uses a declarative syntax to deploy Azure resources. Bicep files define the infrastructure to deploy to Azure and then use that file throughout the development lifecycle to repeatedly deploy infrastructure changes. Resources are deployed consistently. + +By using the Azure CLI it is possible to decompile ARM templates to Bicep using the following: + +```powershell +az bicep decompile --file template.json +``` + +Additionally, the [Bicep playground](https://aka.ms/bicepdemo) tool can perform similar decompilation of ARM templates. + +![Sample Bicep code that deploys Azure Cosmos DB for NoSQL](media/bicep_code.png) + +##### Terraform + +[Hashicorp Terraform](https://www.terraform.io/) is an open-source tool for provisioning and managing cloud infrastructure resources. [Terraform](https://learn.microsoft.com/azure/developer/terraform/overview) simplifies the deployment of Azure services, including Azure Kubernetes Service, Azure Cosmos DB, and Azure AI, through infrastructure-as-code to automate provisioning and management of Azure services. Terraform is also adept at deploying infrastructure across multiple cloud providers. It enables developers to use consistent tooling to manage each infrastructure definition. + +![Sample Terraform code that deploys Azure Cosmos DB for NoSQL](media/terraform_code.png) + +#### Other tips + +Azure administrators should consult with cloud architects and financial and security personnel to develop an effective organizational hierarchy of resources. + +Here are some best practices to follow for Azure deployments. + +- **Utilize Management Groups** Create at least three levels of management groups. + +- **Adopt a naming convention:** Azure resource names should include business details, such as the organization department, and operational details for IT personnel, like the workload. Defining an [Azure resource naming convention](https://learn.microsoft.com/azure/cloud-adoption-framework/ready/azure-best-practices/resource-naming) will help the organization standardize on a common naming convention that will help better identify resources once created. + +- **Adopt other Azure governance tools:** Azure provides mechanisms such as [resource tags](https://learn.microsoft.com/azure/azure-resource-manager/management/tag-resources?tabs=json) and [resource locks](https://learn.microsoft.com/azure/azure-resource-manager/management/lock-resources?tabs=json) to facilitate compliance, cost management, and security. 
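As a complement to the portal, CLI, and IaC options described above, the same governance practices (a naming convention plus resource tags) can also be applied from Python through the Azure SDK. The following is a minimal sketch, assuming the `azure-identity` and `azure-mgmt-resource` packages are installed and an `AZURE_SUBSCRIPTION_ID` environment variable is set; the resource group name, region, and tag values are illustrative assumptions, not part of the guide's provisioning scripts.

```python
# pip install azure-identity azure-mgmt-resource
import os

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

# Assumed environment variable; set it to the target subscription.
subscription_id = os.environ["AZURE_SUBSCRIPTION_ID"]

credential = DefaultAzureCredential()
client = ResourceManagementClient(credential, subscription_id)

# Create (or update) a resource group whose name follows a convention and
# whose tags support cost management, ownership, and compliance reporting.
resource_group = client.resource_groups.create_or_update(
    "rg-cosmosdb-openai-guide-dev",
    {
        "location": "eastus",
        "tags": {"environment": "dev", "workload": "ai-chat", "owner": "dev-team"},
    },
)

print(f"Provisioned {resource_group.name} in {resource_group.location}")
```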
diff --git a/01_Azure_Overview/media/2023-12-31-14-19-28.png b/diskann/01_Azure_Overview/media/2023-12-31-14-19-28.png similarity index 100% rename from 01_Azure_Overview/media/2023-12-31-14-19-28.png rename to diskann/01_Azure_Overview/media/2023-12-31-14-19-28.png diff --git a/01_Azure_Overview/media/ISV-Tech-Builders-tools-white.png b/diskann/01_Azure_Overview/media/ISV-Tech-Builders-tools-white.png similarity index 100% rename from 01_Azure_Overview/media/ISV-Tech-Builders-tools-white.png rename to diskann/01_Azure_Overview/media/ISV-Tech-Builders-tools-white.png diff --git a/01_Azure_Overview/media/azure-cli-example.png b/diskann/01_Azure_Overview/media/azure-cli-example.png similarity index 100% rename from 01_Azure_Overview/media/azure-cli-example.png rename to diskann/01_Azure_Overview/media/azure-cli-example.png diff --git a/01_Azure_Overview/media/azure-management-tool-maturity.png b/diskann/01_Azure_Overview/media/azure-management-tool-maturity.png similarity index 100% rename from 01_Azure_Overview/media/azure-management-tool-maturity.png rename to diskann/01_Azure_Overview/media/azure-management-tool-maturity.png diff --git a/diskann/01_Azure_Overview/media/azure-marketplace-search-results.png b/diskann/01_Azure_Overview/media/azure-marketplace-search-results.png new file mode 100644 index 0000000..c52b767 Binary files /dev/null and b/diskann/01_Azure_Overview/media/azure-marketplace-search-results.png differ diff --git a/01_Azure_Overview/media/azure-portal-services.png b/diskann/01_Azure_Overview/media/azure-portal-services.png similarity index 100% rename from 01_Azure_Overview/media/azure-portal-services.png rename to diskann/01_Azure_Overview/media/azure-portal-services.png diff --git a/01_Azure_Overview/media/azure-services.png b/diskann/01_Azure_Overview/media/azure-services.png similarity index 100% rename from 01_Azure_Overview/media/azure-services.png rename to diskann/01_Azure_Overview/media/azure-services.png diff --git a/diskann/01_Azure_Overview/media/azure-template-json-example.png b/diskann/01_Azure_Overview/media/azure-template-json-example.png new file mode 100644 index 0000000..e976258 Binary files /dev/null and b/diskann/01_Azure_Overview/media/azure-template-json-example.png differ diff --git a/diskann/01_Azure_Overview/media/bicep_code.png b/diskann/01_Azure_Overview/media/bicep_code.png new file mode 100644 index 0000000..c9dba24 Binary files /dev/null and b/diskann/01_Azure_Overview/media/bicep_code.png differ diff --git a/01_Azure_Overview/media/consistent-management-layer.png b/diskann/01_Azure_Overview/media/consistent-management-layer.png similarity index 100% rename from 01_Azure_Overview/media/consistent-management-layer.png rename to diskann/01_Azure_Overview/media/consistent-management-layer.png diff --git a/01_Azure_Overview/media/landing-zone-accelerator.png b/diskann/01_Azure_Overview/media/landing-zone-accelerator.png similarity index 100% rename from 01_Azure_Overview/media/landing-zone-accelerator.png rename to diskann/01_Azure_Overview/media/landing-zone-accelerator.png diff --git a/01_Azure_Overview/media/scope-levels.png b/diskann/01_Azure_Overview/media/scope-levels.png similarity index 100% rename from 01_Azure_Overview/media/scope-levels.png rename to diskann/01_Azure_Overview/media/scope-levels.png diff --git a/diskann/01_Azure_Overview/media/terraform_code.png b/diskann/01_Azure_Overview/media/terraform_code.png new file mode 100644 index 0000000..7de4b86 Binary files /dev/null and 
b/diskann/01_Azure_Overview/media/terraform_code.png differ diff --git a/diskann/02_Overview_Cosmos_DB/README.md b/diskann/02_Overview_Cosmos_DB/README.md new file mode 100644 index 0000000..62d85f8 --- /dev/null +++ b/diskann/02_Overview_Cosmos_DB/README.md @@ -0,0 +1,25 @@ +# Overview of Azure Cosmos DB + +[Azure Cosmos DB](https://learn.microsoft.com/azure/cosmos-db/introduction) is a globally distributed database for storing and querying both NoSQL and vector data, with a serverless option. It has multiple APIs, the most notable being the native NoSQL document API and MongoDB API. It provides turnkey global distribution, elastic and dynamic scaling of throughput and storage, and a comprehensive SLA (service level agreement) for single-digit millisecond latency and 99.999% high availability. + +## Azure Cosmos DB and AI + +The surge of AI-powered applications has led to the need to integrate operational data from multiple data stores, introducing another layer of complexity as each data store tends to have its own workflow and operational performance. Azure Cosmos DB simplifies this process by providing a unified platform for all data types, including AI data. In particular, its support for vector storage and retrieval is a game-changer for generative AI applications. By representing complex data elements like text, images, or sound as high-dimensional vectors, Azure Cosmos DB allows for efficient storage, indexing, and querying of these vectors, which is crucial for many generative AI tasks. + +Unlike traditional databases requiring separate workarounds for different data types, Azure Cosmos DB supports multiple data models within a single, integrated environment. This simplification means you can leverage the same robust platform for all your AI data needs. Many AI applications rely on external stand-alone vector stores, which can be cumbersome to manage and maintain. Azure Cosmos DB's native support for vector storage and retrieval eliminates the need for these external stores, as all the application's data is located in a single place, thus streamlining the development and deployment of AI applications. These features enable the building, deploying, and scaling of AI applications to be more efficient and reliable, making Azure Cosmos DB an ideal choice for handling the complex data requirements of modern generative AI solutions. + +## Azure Cosmos DB for NoSQL + +The focus for this developer guide is [Azure Cosmos DB for NoSQL](https://learn.microsoft.com/azure/cosmos-db/nosql/) and [Vector Search](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search). + +### Azure Cosmos DB for NoSQL capacity modes + +Azure Cosmos DB offers three capacity modes: provisioned throughput, serverless, and autoscale. When creating an Azure Cosmos DB account, it's essential to evaluate the workload's characteristics in order to choose the appropriate mode to optimize both performance and cost efficiency. + +[**Serverless mode**](https://learn.microsoft.com/azure/cosmos-db/serverless) offers a more flexible and pay-as-you-go approach, where only the Request Units consumed are billed. This is particularly advantageous for applications with sporadic or unpredictable usage patterns, as it eliminates the need to provision resources upfront.
+ +[**Provisioned throughput mode**](https://learn.microsoft.com/azure/cosmos-db/set-throughput) allocates a fixed amount of resources, measured in [Request Units per second (RUs/s)](https://learn.microsoft.com/azure/cosmos-db/request-units), which is ideal for applications with predictable and steady workloads. This ensures consistent performance and can be more cost-effective when there is a constant or high demand for database operations. RU/s can be set at both the database and container levels, allowing for fine-grained control over resource allocation. + +[**Autoscale mode**](https://learn.microsoft.com/azure/cosmos-db/provision-throughput-autoscale) builds upon the provisioned throughput mode but allows the database or container to automatically and instantly scale resources up or down based on demand, ensuring that the application can handle varying workloads efficiently. When configuring autoscale, a maximum (Tmax) value threshold is set for a predictable maximum cost. This mode is suitable for applications with fluctuating usage patterns or infrequently used applications. + +[**Dynamic scaling**](https://learn.microsoft.com/azure/cosmos-db/autoscale-per-partition-region) allows for the automatic and independent scaling of non-uniform workloads across regions and partitions according to usage patterns. For instance, in a disaster recovery configuration with two regions, the primary region may experience high traffic while the secondary region can scale down to idle, thereby saving costs. This approach is also highly effective for multi-regional applications, where traffic patterns fluctuate based on the time of day in each region. diff --git a/diskann/03_Overview_Azure_OpenAI/README.md b/diskann/03_Overview_Azure_OpenAI/README.md new file mode 100644 index 0000000..45ae467 --- /dev/null +++ b/diskann/03_Overview_Azure_OpenAI/README.md @@ -0,0 +1,133 @@ +# Overview of Azure OpenAI and AI Services + +Azure OpenAI is a collaboration between Microsoft Azure and OpenAI, a leading research organization in artificial intelligence. It is a cloud-based platform that enables developers and data scientists to build and deploy AI models quickly and easily. With Azure OpenAI, users can access a wide range of AI tools and technologies to create intelligent applications, including natural language processing, computer vision, and deep learning. + +Azure OpenAI is designed to accelerate the development of AI applications, allowing users to focus on creating innovative solutions that deliver value to their organizations and customers. + +Here are ways that Azure OpenAI can help developers: + +- **Simplified integration** - Simple and easy-to-use APIs for tasks such as text generation, summarization, sentiment analysis, language translation, and more. +- **Pre-trained models** - AI models that are already fine-tuned on vast amounts of data, making it easier for developers to leverage the power of AI without having to train their own models from scratch. +- **Customization** - Developers can also fine-tune the included pre-trained models with their own data with minimal coding, providing an opportunity to create more personalized and specialized AI applications. +- **Documentation and resources** - Azure OpenAI provides comprehensive documentation and resources to help developers get started quickly. +- **Scalability and reliability** - Hosted on Microsoft Azure, the OpenAI service provides robust scalability and reliability that developers can leverage to deploy their applications.
+- **Responsible AI** - Azure OpenAI promotes responsible AI by adhering to ethical principles, providing explainability tools, governance features, diversity and inclusion support, and collaboration opportunities. These measures help ensure that AI models are unbiased, explainable, trustworthy, and used in a responsible and compliant manner. +- **Community support** - With an active developer community, developers can seek help via forums and other community support channels. + +## Comparison of Azure OpenAI and OpenAI + +Azure OpenAI Service gives customers advanced language AI with OpenAI GPT-4, GPT-3, Codex, DALL-E, and Whisper models with the security and enterprise promise of Azure. Azure OpenAI co-develops the APIs with OpenAI, ensuring compatibility and a smooth transition from one to the other. + +With Azure OpenAI, customers get the security capabilities of Microsoft Azure while running the same models as OpenAI. Azure OpenAI offers private networking, regional availability, and responsible AI content filtering. + +## Azure OpenAI Data Privacy and Security + +Azure OpenAI stores and processes data to provide the service and to monitor for uses that violate the applicable product terms. Azure OpenAI is fully controlled by Microsoft. Microsoft hosts the OpenAI models in Microsoft Azure for the usage of Azure OpenAI, and does not interact with any services operated by OpenAI. + +When using Azure OpenAI, it is important to know that prompts (inputs) and completions (outputs), embeddings, and training data: + +- are NOT available to other customers. +- are NOT available to OpenAI. +- are NOT used to improve OpenAI models. +- are NOT used to improve any Microsoft or 3rd party products or services. +- are NOT used for automatically improving Azure OpenAI models for use in the deployed resource (the models are stateless unless explicitly fine-tuned with provided training data). +- Fine-tuned Azure OpenAI models are available exclusively for the account in which they were created. + +## Azure AI Platform + +Developers can use the power of AI, cloud-scale data, and cloud-native app development to create highly differentiated digital experiences and establish leadership among competitors. Build or modernize intelligent applications that take advantage of industry-leading AI technology and leverage real-time data and analytics to deliver adaptive, responsive, and personalized experiences. + +The Azure platform of managed AI, containers, and database services, along with offerings developed by or in partnership with key software vendors, enables developers to build, deploy, and scale applications with speed, flexibility, and enterprise-grade security. This platform has been used by market leaders like The NBA, H&R Block, Real Madrid Football Club, Bosch, and Nuance to develop their own intelligent apps. + +Developers can use Azure AI Services, along with other Azure services, to build and modernize intelligent apps on Azure. Examples of this could be: + +- Build new with Azure Kubernetes Service or Azure Container Apps, Azure Cosmos DB, and Azure AI Services +- Modernize with Azure Kubernetes Service, Azure SQL or Azure Database for PostgreSQL, and Azure AI Services + +### Azure AI Services + +While this guide focuses on building intelligent apps using Azure OpenAI combined with Azure Cosmos DB for NoSQL, the Azure AI Platform consists of many additional AI services.
Each AI service is built to fit a specific AI and/or Machine Learning (ML) need. + +Here's a list of the AI services within the [Azure AI platform](https://learn.microsoft.com/azure/ai-services/what-are-ai-services): + +| Service | Description | +| --- | --- | +| Azure AI Search | Bring AI-powered cloud search to mobile and web apps. | +| Azure OpenAI | Perform a wide variety of natural language tasks. | +| Bot Service | Create bots and connect them across channels. | +| Content Safety | An AI service that detects unwanted content. | +| Custom Vision | Customize image recognition to fit the business. | +| Document Intelligence | Turn documents into intelligent data-driven solutions. | +| Face | Detect and identify people and emotions in images. | +| Immersive Reader | Help users read and comprehend text. | +| Language | Build apps with industry-leading natural language understanding capabilities. | +| Speech | Speech to text, text to speech, translation and speaker recognition. | +| Translator | Use AI-powered translation technology to translate more than 100 in-use, at-risk, and endangered languages and dialects. | +| Video Indexer | Extract actionable insights from videos. | +| Vision | Analyze content in images and videos. | + +> **Note:** Follow this link for additional tips to help in determining which Azure AI service is most appropriate for a specific project requirement: + +The tools used to customize and configure models are different from those used to call the Azure AI services. Out of the box, most Azure AI services allow for sending data and receiving insights without any customization. + +For example: + +- Sending an image to the Azure AI Vision service to detect words and phrases or count the number of people in the frame +- Sending an audio file to the Speech service to get a transcription and translate the speech at the same time + +Azure offers a wide range of tools that are designed for different types of users, many of which can be used with Azure AI services. Designer-driven tools are the easiest to use, and are quick to set up and automate, but might have limitations when it comes to customization. The REST APIs and client libraries provide users with more control and flexibility, but require more effort, time, and expertise to build a solution. When using REST APIs and client libraries, there is an expectation that the developer is comfortable working with modern programming languages like C#, Java, Python, JavaScript, or another popular programming language. + +### Azure Machine Learning + +[Azure Machine Learning](https://learn.microsoft.com/azure/machine-learning/overview-what-is-azure-machine-learning?view=azureml-api-2) is a cloud service for accelerating and managing the machine learning (ML) project lifecycle. ML professionals, data scientists, and engineers can use it in their day-to-day workflows to train and deploy models and manage machine learning operations (MLOps). + +Azure Machine Learning can be used to create a model or use a model built from an open-source platform, such as PyTorch, TensorFlow, or scikit-learn. Additionally, MLOps tools help monitor, retrain, and redeploy models. + +ML projects often require a team with a varied skill set to build and maintain.
Azure Machine Learning has tools that help enable: + +- Collaboration within a team via shared notebooks, compute resources, serverless compute, data, and environments + +- Developing models for fairness and explainability, tracking and auditability to fulfill lineage and audit compliance requirements + +- Deploying ML models quickly and easily at scale, and managing and governing them efficiently with MLOps + +- Running machine learning workloads anywhere with built-in governance, security, and compliance + +Enterprises working in the Microsoft Azure cloud can use familiar security and role-based access control for infrastructure. A project can be set up to deny access to protected data and select operations. + +#### Azure Machine Learning vs Azure OpenAI + +Many of the Azure AI services are suited to a very specific AI / ML need. The Azure Machine Learning and Azure OpenAI services offer more flexible usage based on the solution requirements. + +Here are a couple of differentiators to help determine which of these two services to use: + +- Azure Machine Learning service is appropriate for solutions where a custom model needs to be trained specifically on private data. + +- Azure OpenAI service is appropriate for solutions that require pre-trained models that provide natural language processing or vision services, such as the GPT-3.5, GPT-4o or DALL-E models from OpenAI. + +If the solution requires other, more task-specific AI features, then one of the other Azure AI services should be considered. + +### Azure AI Studio + +Azure AI Studio is a web portal that brings together multiple Azure AI-related services into a single, unified development environment. + +Specifically, Azure AI Studio combines: + +- The model catalog and prompt flow development capabilities of Azure Machine Learning service. + +- The generative AI model deployment, testing, and custom data integration capabilities of Azure OpenAI service. + +- Integration with Azure AI Services for speech, vision, language, document intelligence, and content safety. + +Azure AI Studio enables teams to collaborate efficiently and effectively on AI projects, such as developing custom copilot applications that use large language models (LLMs). + +![Azure AI Studio screenshot](images/2024-08-11_19-15-33.jpg) + +Tasks accomplished using Azure AI Studio include: + +- Deploying models from the model catalog to real-time inferencing endpoints for client applications to consume. +- Deploying and testing generative AI models in an Azure OpenAI service. +- Integrating data from custom data sources to support a retrieval augmented generation (RAG) approach to prompt engineering for generative AI models. +- Using prompt flow to define workflows that integrate models, prompts, and custom processing. +- Integrating content safety filters into a generative AI solution to mitigate potential harms. +- Extending a generative AI solution with multiple AI capabilities using Azure AI services.
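To ground this chapter's description of the Azure OpenAI APIs in code, the following is a minimal sketch of calling a chat model deployment from Python with the `openai` package's `AzureOpenAI` client. The endpoint and key environment variables, the API version, and the `gpt-4o` deployment name are assumptions for illustration; provisioning the actual resources and exploring the models are covered in later chapters of this guide.

```python
# pip install openai
import os

from openai import AzureOpenAI

# Assumed environment variables pointing at an Azure OpenAI resource.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="gpt-4o",  # the name of an Azure OpenAI *deployment*, not the base model
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "In one sentence, what is Azure OpenAI?"},
    ],
)

print(response.choices[0].message.content)
```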
diff --git a/diskann/03_Overview_Azure_OpenAI/images/2024-08-11_19-15-33.jpg b/diskann/03_Overview_Azure_OpenAI/images/2024-08-11_19-15-33.jpg new file mode 100644 index 0000000..15190ae Binary files /dev/null and b/diskann/03_Overview_Azure_OpenAI/images/2024-08-11_19-15-33.jpg differ diff --git a/diskann/04_Overview_AI_Concepts/README.md b/diskann/04_Overview_AI_Concepts/README.md new file mode 100644 index 0000000..6a793ad --- /dev/null +++ b/diskann/04_Overview_AI_Concepts/README.md @@ -0,0 +1,178 @@ +# Overview of AI Concepts + +## Large Language Models (LLM) + +A Large Language Model (LLM) is a type of AI that can process and produce natural language text. LLMs are "general purpose" AI models trained using massive amounts of data gathered from various sources, such as books, articles, webpages, and images, to discover patterns and rules of language. + +LLMs are complex and built using a neural network architecture. They are trained using large amounts of information, and calculate millions of parameters. From a developer perspective, the APIs exposed by Azure OpenAI Service enable LLMs to be easily integrated into enterprise solutions without requiring knowledge of how to build or train the models. + +Understanding what an LLM can and can't do is important when deciding whether to use it for a solution. In particular, an LLM does not: + +- **Understand language** - An LLM is a predictive engine that pulls patterns together based on pre-existing text to produce more text. It doesn't understand language or math. +- **Understand facts** - An LLM doesn't have separate modes for information retrieval and creative writing; it simply predicts the next most probable token. +- **Understand manners, emotion, or ethics** - An LLM can't exhibit anthropomorphism or understand ethics. The output of a foundational model is a combination of training data and prompts. + +### Foundational Models + +Foundational Models are specific instances or versions of an LLM. Examples of these would be GPT-3, GPT-4, or Codex. Foundational models are trained and fine-tuned on a large corpus of text, or code in the case of a Codex model instance. + +A foundational model takes in training data in all different formats and uses a transformer architecture to build a general model. Adaptations and specializations can be created to achieve certain tasks via prompts or fine-tuning. + +### Difference between LLM and traditional Natural Language Processing (NLP) + +LLMs and traditional Natural Language Processing (NLP) differ in their approach to understanding and processing language. + +Here are a few things that separate traditional NLP from LLMs: + +| Traditional NLP | Large Language Models | +| --- | --- | +| One model per capability is needed. | A single model is used for many natural language use cases. | +| Provides a set of labeled data to train the ML model. | Uses many terabytes of unlabeled data in the foundation model. | +| Highly optimized for specific use cases. | The desired model responses are described in natural language. | + +## Prompting and Prompt Engineering + +### What is a prompt? + +A prompt is an input or instruction provided to an Artificial Intelligence (AI) model to direct its behavior and produce the desired results. The quality and specificity of the prompt are crucial in obtaining precise and relevant outputs. A well-designed prompt can ensure that the AI model generates the desired information or completes the intended task effectively. Some typical prompts include summarization, question answering, text classification, and code generation. + +While there are techniques and patterns used when building an Azure OpenAI solution and writing prompts, the following are a couple of simple prompt examples: + +- `List the most popular products in the last quarter.` +- `How many customers are located in the state of California?` + +#### Guidelines for creating robust prompts + +While it can be quick to write basic prompts, it can also be difficult to write more complex prompts that get the AI to generate the responses necessary. When writing prompts, there are three basic guidelines to follow for creating useful prompts: + +- **Show and tell** - Make it clear what response is desired either through instructions, examples, or a combination of the two. Whether ranking a list of items in alphabetical order or classifying a paragraph by sentiment, include these details in the prompt provided to the model. +- **Provide quality data** - When building a classifier or getting the model to follow a pattern, make sure there are enough examples. Be sure to proofread the examples. The model is smart enough to resolve basic spelling mistakes and still provide a meaningful response. Conversely, the model might assume the mistakes are intentional, which can affect the response. +- **Check the settings** - Probability settings, such as `Temperature` and `Top P`, control how deterministic the model is in generating a response. When asking for a response where there's only one right answer, a lower value should be specified for these settings. When looking for responses that are less obvious and need additional creativity, use higher values. The most common mistake users make with these settings is assuming they control "cleverness" or "creativity" in the model response. + +### What is prompt engineering + +[Prompt engineering](https://learn.microsoft.com/azure/ai-services/openai/concepts/prompt-engineering) is the iterative process of designing, evaluating, and optimizing prompts to produce consistently accurate responses from language models for a particular problem domain. It involves designing and refining the prompts given to an AI model to achieve the desired outputs. Prompt engineers experiment with various prompts, test their effectiveness, and refine them to improve performance. Performance is measured using predefined metrics such as accuracy, relevance, and user satisfaction to assess the impact of prompt engineering. + +### General anatomy of a prompt + +There are several components that are used when constructing AI prompts: + +- Input + - Instructions + - Primary content + - Supporting content + - System message +- Output + +#### Input + +The Input in prompt engineering is the text or query that is provided to the LLM. The model will use this prompt as the basis for the completion, or response, that is generated. + +##### Instructions + +Instructions are likely the most commonly used prompt component. Instructions are simply that: instructions to the model on what to do. While simple in concept, they can be complex in practice. + +The following are examples of both simple and more complex instructions: + +| Instruction Type | Prompt | Completion | +| :--- | :--- | :--- | +| Simple | Write an introduction for a weekly newsletter. | Welcome to our weekly newsletter. This week we have a special focus on productivity. We'll be sharing tips and tricks on how to be productive in all areas of your life. We hope you enjoy and please share with your friends! |
| +| Complex | Write an introduction for a weekly newsletter to my company named Contoso. It should mention the great all hands meeting we had last week and thank the team for all their hard work over the tough past few months. | Welcome to Contoso's weekly newsletter! We had a great all hands meeting last week and I want to thank the team for all their hard work over the tough past few months. We're looking forward to a bright future and I'm confident that with our team's dedication and commitment, we will achieve great things. Thank you for being a part of our team and I look forward to working with you all in the coming weeks and months. Sincerely, [Your name]. | +| Complex | Write an introduction for a weekly newsletter, including the following:
- My company is named contoso
- An exuberant greeting
- Thanking the team for their hard work over a couple tough months
- Positive outlook for the coming quarter
- Signed by the SLT | Hello Contoso team! I wanted to take a moment to thank you for all your hard work over the past few months. It's been a tough couple of months, but you've all pulled together and done an amazing job. I'm really proud of you all. Looking ahead to the next quarter, I'm feeling really positive. I think we're going to achieve some great things. Thanks again, The Senior Leadership Team. | + +##### Primary Content + +Primary content refers to some sort of text that is being processed or transformed by the model. Primary content is typically used in conjunction with instructions. A simple example would be language translation. + +| Prompt | Completion | +| :--- | :--- | +| Can you please tell me how to get to the museum?
Translate to French: | `Pouvez-vous s'il vous plaît me dire comment aller au musée?` | + +Primary content can also be much longer. For example, the primary content could be the introduction section of a document that is hundreds of words long. Additionally, the primary content could be in the form of structured data as well, such as in JSON or TSV format. + +##### Supporting Content + +Supporting content is information that the model can utilize to influence the output in some way. It differs from primary content in that it is not the main target of the task, but it is typically used along with primary content. Common examples include contextual information such as the current date, the name of the user, user preferences, and so on. + +The following are example prompts and possible completions to show what a prompt with and without supporting content might look like: + +| Supporting Content | Prompt | Completion | +| :--- | :--- | :--- | +| Without | Workshop list: +Title Description
Planning Workshop Kickoff Please join us to kick off our workshop! Hear from our leaders about cross group priorities and learnings for the year ahead. The invitation to this event has already been sent to everyone in […]
Summarize the workshops above, grouped by topic: | Workshop list:
Planning Workshop Kickoff
- Planning Workshop Wrap-up
- Core MeetUp
- Customer
Driven Perspectives
- Impact and Key Focus Areas
- Best practices
- Accelerating app adoption […] | +| With | Workshop list:
Title Description
Planning Workshop Kickoff Please join us to kick off our workshop! Hear from our leaders about cross group priorities and learnings for the year ahead. The invitation to this event has already been sent to everyone in […]
My Important Topics: prompt engineering, search, GPT models
Summarize the workshops above, grouped by my important topics: | 1. Prompt Engineering:
- New prompt engineering tips
- The art of prompt engineering
2. Search:
- Intro to vector search with embedding
3. GPT Models:
- Intro to GPT-4
- GPT-35-Turbo in-depth. | + +The above prompt examples are truncated with `[…]` for brevity. In practice, similar prompts would contain the full text. + +##### System message + +A system message, also called a system prompt, is used by developers of a system to restrict the LLM to a specific set of constraints. This is useful when building enterprise solutions that integrate Azure OpenAI, so the AI completions stay focused on the enterprise data the solution is integrated with. + +The following is an example system message that could be used to constrain the LLM in an enterprise solution: + +```text +Your name is "Willie". You are an AI assistant for the Cosmic Works bike store. You help people find product information for bikes and accessories. Your demeanor is friendly, playful with lots of energy. + +Do not include citations or citation numbers in your responses. Do not include emojis. + +You are designed to answer questions about the products that Cosmic Works sells, the customers that buy them, and the sales orders that are placed by customers. + +If you don't know the answer to a question, respond with "I don't know." + +Only answer questions related to Cosmic Works products, customers, and sales orders. + +If a question is not related to Cosmic Works products, customers, or sales orders, respond with "I only answer questions about Cosmic Works." +``` + +#### Output + +The Output is the completion, or response, from the LLM, returned as a result of the given input prompt. When an input prompt is given, the language model will process the information and generate an output in the form of text. The text response is the output. + +## Standard Patterns + +### Retrieval Augmented Generation (RAG) + +[Retrieval Augmented Generation (RAG)](https://learn.microsoft.com/azure/search/retrieval-augmented-generation-overview) is an architecture that augments the capabilities of a Large Language Model (LLM) like ChatGPT by adding an information retrieval system that provides grounding data. Adding an information retrieval system provides control over grounding data used by an LLM when it formulates a response. + +GPT language models can be fine-tuned to achieve several common tasks such as sentiment analysis and named entity recognition. These tasks generally don't require additional background knowledge. + +The RAG pattern facilitates bringing private proprietary knowledge to the model so that it can perform Question Answering over this content. Remember that Large Language Models are trained only on public information. For an enterprise solution, RAG architecture means that the generative AI is constrained to enterprise content sourced from vectorized documents, images, audio, and video. + +Because the RAG technique accesses external knowledge sources to complete tasks, it enables more factual consistency, improves the reliability of the generated responses, and helps to mitigate the problem of "_hallucination_". + +In some cases, the RAG process involves a technique called vectorization on the proprietary data. The user prompt is compared to the vector store and only the most relevant/matching pieces of information are returned and stuffed into the prompt for the LLM to reason over and provide an answer. The next set of demos will go into this further.
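To make the RAG flow concrete before moving on, the following is a minimal Python sketch of the vectorize-retrieve-generate loop described above, using an Azure OpenAI embeddings deployment, Azure Cosmos DB for NoSQL vector search, and a chat deployment. The environment variables, deployment names, database/container names, and the `embedding` field (which assumes a vector policy and a DiskANN vector index have already been configured on the container) are all illustrative assumptions; later labs in this guide build the full, working version.

```python
# pip install openai azure-cosmos
import os

from azure.cosmos import CosmosClient
from openai import AzureOpenAI

# Assumed endpoints, keys, and deployment/container names.
openai_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)
cosmos = CosmosClient(os.environ["COSMOS_ENDPOINT"], credential=os.environ["COSMOS_KEY"])
container = cosmos.get_database_client("cosmic_works").get_container_client("products")

question = "What bikes do you sell for trail riding?"

# 1. Vectorize the user question with an embeddings deployment.
embedding = openai_client.embeddings.create(
    model="text-embedding-ada-002", input=question
).data[0].embedding

# 2. Retrieve the most relevant documents with a vector search query.
results = container.query_items(
    query=(
        "SELECT TOP 5 c.name, c.description, "
        "VectorDistance(c.embedding, @embedding) AS similarity "
        "FROM c ORDER BY VectorDistance(c.embedding, @embedding)"
    ),
    parameters=[{"name": "@embedding", "value": embedding}],
    enable_cross_partition_query=True,
)
context = "\n".join(f"{item['name']}: {item['description']}" for item in results)

# 3. Generate an answer that is grounded in the retrieved documents.
completion = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Answer using only the provided product context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(completion.choices[0].message.content)
```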
### Chain of Thought (CoT) + +Instead of splitting a task into smaller steps, with [Chain of Thought (CoT)](https://learn.microsoft.com/azure/ai-services/openai/concepts/advanced-prompt-engineering?pivots=programming-language-chat-completions#chain-of-thought-prompting) the model is instructed to proceed step-by-step and present all the steps involved in its response. Doing so reduces the possibility of inaccurate outcomes and makes assessing the model response easier. + +| System message | Prompt | Response | +| :--- | :--- | :--- | +| You are an AI assistant that helps people find information. | Who was the most decorated (maximum medals) individual athlete in the Olympic games that were held at Sydney? Take a step-by-step approach in your response, cite sources and give reasoning before sharing final answer in the below format: ANSWER is: `` | Step 1: Determine the year of the Sydney Olympic games. The Sydney Olympic games were held in the year 2000.

Step 2: Find a list of individual athletes who won multiple medals in the 2000 Sydney Olympics.

According to the official website of the Olympic Games, there were several individual athletes who won multiple medals in the 2000 Sydney Olympics. Some of the notable names include:
- Ian Thorpe (Australia) - 3 gold, 2 silver, 0 bronze
- Pieter van den Hoogenband (Netherlands) - 2 gold, 0 silver, 2 bronze
- Lenny Krayzelburg (USA) - 3 gold, 0 silver, 0 bronze
- Inge de Bruijn (Netherlands) - 3 gold, 1 silver, 0 bronze

Step 3: Determine the athlete with the most medals. After reviewing the list of individual athletes who won multiple medals in the 2000 Sydney Olympics, we can see that the athlete with the most medals is Ian Thorpe from Australia, who won a total of 5 medals (3 gold and 2 silver).

Therefore, the ANSWER is: Ian Thorpe. | + +### ReAct + +Reasoning and Acting (ReAct) leverages LLMs to generate both reasoning traces and task-specific actions. ReAct combines these to allow for greater synergy between the two. Reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external data sources to gather additional information. When using LLMs to answer questions and verify facts, ReAct helps overcome issues of hallucination and error propagation by having the model interact with external data sources. + +### Others + +#### Zero-shot prompting + +Zero-shot prompting is what we would consider the “default”. This is when we provide no examples of inputs/expected outputs to the model to work with. We’re leaving it up to the model to decipher what is needed and how to output it from the instructions. + +Example: + +| Prompt | Completion | +| :--- | :--- | +| Headline: Coach confident injury won't derail Warriors
Topic: | The coach is confident that the injury won't derail the Warriors' season. The team is still focused on their goals and that they will continue to work hard to achieve them. | + +#### Few-shot prompting + +Few-shot prompting is a technique that enables in-context learning for the LLM by providing examples or further information within the prompt to steer the model to generate a better response. Providing additional information in the prompt helps particularly in areas where the data used to train the model isn't enough to generate the desired output from the model. + +Example: + +| Prompt | Completion | +| :--- | :--- | +| Headline: Twins' Correa to use opt-out, test free agency
Topic: Baseball
Headline: Qatar World Cup to have zones for sobering up
Topic: Soccer
Headline: Yates: Fantasy football intel for Week 6
Topic: Football
Headline: Coach confident injury won't derail Warriors
Topic: | Basketball | diff --git a/diskann/05_Explore_OpenAI_models/README.md b/diskann/05_Explore_OpenAI_models/README.md new file mode 100644 index 0000000..a1cca99 --- /dev/null +++ b/diskann/05_Explore_OpenAI_models/README.md @@ -0,0 +1,76 @@ +# Explore the Azure OpenAI models and endpoints (console app) + +## Azure OpenAI Models + +[Azure OpenAI is powered by a diverse set of models](https://learn.microsoft.com/azure/ai-services/openai/concepts/models) with different capabilities. + +| Model | Description | +| -- | --- | +| GPT-4o & GPT-4o mini & GPT-4 Turbo | The latest most capable Azure OpenAI models with multimodal versions, which can accept both text and images as input. | +| GPT-4 | A set of models that improve on GPT-3.5 and can understand and generate natural language and code. | +| GPT-3.5 | A set of models that improve on GPT-3 and can understand and generate natural language and code. | +| Embeddings | A set of models that can convert text into numerical vector form to facilitate text similarity. | +| DALL-E | A series of models that can generate original images from natural language. | +| Whisper | A series of models that can transcribe and translate speech to text. | + +Model availability varies by region. You can look at the [Azure OpenAI "Regional quota limits" documentation for a table that shows model availability](https://learn.microsoft.com/azure/ai-services/openai/quotas-limits#regional-quota-limits). + +### GPT-4o, GPT-4 & GPT-3.5 Models + +GPT-4o and GPT-4 Turbo integrate text and images in a single model, enabling it to handle multiple data types simultaneously. This multimodal approach enhances accuracy and responsiveness in human-computer interactions. GPT-4o matches GPT-4 Turbo in English text and coding tasks while offering superior performance in non-English languages and vision tasks, setting new benchmarks for AI capabilities. + +GPT-4 can solve difficult problems with greater accuracy than any of OpenAI's previous models. Like GPT-3.5 Turbo, GPT-4 is optimized for chat and works well for traditional completions tasks. + +The GPT-4o, GPT-4 and GPT-35-Turbo models are language models that are optimized for conversational interfaces. The models behave differently than the older GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However, the GPT-35-Turbo and GPT-4 models are conversation-in and message-out. The models expect input formatted in a specific chat-like transcript format, and return a completion that represents a model-written message in the chat. While this format was designed specifically for multi-turn conversations, it can also work well for non-chat scenarios too. + +### Embeddings + +Embeddings, such as the `text-embedding-ada-002` model, measure the relatedness of text strings. + +Embeddings are commonly used for the following: + +- **Search** - results are ranked by relevance to a query string +- **Clustering** - text strings are grouped by similarity +- **Recommendations** - items with related text strings are recommended +- **Anomaly detection** - outliers with little relatedness are identified +- **Diversity measurement** - similarity distributions are analyzed +- **Classification** - text strings are classified by their most similar label + +### DALL-E + +DALL-E is a model that can generate an original images from a natural language text description given as input. 

### Whisper

Whisper is a speech recognition model designed for general-purpose applications. It is trained on an extensive dataset encompassing diverse audio inputs and operates as a multi-tasking model capable of executing tasks like multilingual speech recognition, speech translation, and language identification.

## Selecting an LLM

Before a Large Language Model (LLM) can be implemented in a solution, an LLM model must be chosen. For this, the business use case and other aspects of the overall goal of the AI solution need to be defined.

Once the business goals of the solution are known, there are a few key considerations to think about:

- **Business Use Case** - What are the specific tasks the business needs the AI solution to perform? Each LLM is designed for different goals, such as text generation, language translation, image generation, answering questions, code generation, etc.
- **Pricing** - For cases where there may be multiple LLMs to choose from, the [pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) of the LLM could be a factor to consider. For example, when choosing between GPT-3.5 and GPT-4, it may be worth considering that the overall cost of GPT-4 may be higher than GPT-3.5 for the solution, since GPT-4 requires more compute power behind the scenes than GPT-3.5.
- **Accuracy** - For cases where there may be multiple LLMs to choose from, the comparison of accuracy between them may be a factor to consider. For example, GPT-4 offers improvements over GPT-3.5, and depending on the use case, GPT-4 may provide increased accuracy.
- **Quotas and limits** - The Azure OpenAI service has [quotas and limits](https://learn.microsoft.com/azure/ai-services/openai/quotas-limits) on using the service. This may affect the performance and pricing of the AI solution. Additionally, some quotas and limits may vary depending on the Azure Region that is used to host the Azure OpenAI service. The potential impact of these on the pricing and performance of the solution should be considered in the design phase of the solution.

## Do I use an out-of-the-box model or a fine-tuned model?

A base model is a model that hasn't been customized or fine-tuned for a specific use case. Fine-tuned models are customized versions of base models where a model's weights are trained on a unique set of prompts. Fine-tuned models achieve better results on a wider number of tasks without needing to provide detailed examples for in-context learning as part of the completion prompt.

The [fine-tuning guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/fine-tuning) can be referenced for more information.

## Explore and use Azure OpenAI models from code

The `key` and `endpoint` necessary to make API calls to Azure OpenAI can be located on the **Azure OpenAI** blade in the Azure Portal on the **Keys and Endpoint** pane.

![Azure OpenAI Keys and Endpoint pane in the Azure Portal](media/2024-01-09-13-53-51.png)

## Lab: Explore and use Azure OpenAI models from code

This lab demonstrates using an Azure OpenAI model to obtain a completion response using Python.

>**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs.

Visit the lab repository to complete [this lab](../Labs/lab_0_explore_and_use_models.ipynb).
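
For orientation before the lab, the following is a minimal sketch (not the lab's code) of obtaining a chat completion from Python with the `openai` package, assuming the endpoint, key, and a chat model deployment name from your own Azure OpenAI resource; the environment variable names are placeholders:

```python
import os

from openai import AzureOpenAI

# Endpoint and key come from the Keys and Endpoint pane of your Azure OpenAI
# resource; the environment variable names used here are placeholders.
client = AzureOpenAI(
    azure_endpoint=os.environ["AOAI_ENDPOINT"],
    api_key=os.environ["AOAI_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-35-turbo",  # the *deployment* name in your resource, not the base model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "In one sentence, what is Azure Cosmos DB?"},
    ],
)
print(response.choices[0].message.content)
```
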
\ No newline at end of file diff --git a/05_Explore_OpenAI_models/media/2024-01-09-13-53-51.png b/diskann/05_Explore_OpenAI_models/media/2024-01-09-13-53-51.png similarity index 100% rename from 05_Explore_OpenAI_models/media/2024-01-09-13-53-51.png rename to diskann/05_Explore_OpenAI_models/media/2024-01-09-13-53-51.png diff --git a/diskann/06_Provision_Azure_Resources/README.md b/diskann/06_Provision_Azure_Resources/README.md new file mode 100644 index 0000000..4843983 --- /dev/null +++ b/diskann/06_Provision_Azure_Resources/README.md @@ -0,0 +1,28 @@ +# Provision Azure resources (Azure Cosmos DB workspace, Azure OpenAI, etc.) + +As the guide walks you through the concepts of integrating Azure Cosmos DB for NoSQL and Azure OpenAI, the hands-on labs will also guide you through building a sample solution. The focus of this guide and the labs is limited to Azure Cosmos DB for NoSQL, vector search capabilities powered by DiskANN, Azure OpenAI, and the Python programming language. With this focus, the labs include an Azure Bicep template that will deploy the following Azure resources the solution will be deployed to: + +- Azure Resource Group +- Azure Cosmos DB for NoSQL +- Azure OpenAI + - ChatGPT-3.5 `completions` model + - text-embedding-ada-002 model `embeddings` model +- Azure App Service - for hosting the front-end, static SPA web application written in React +- Azure Container App - for hosting the back-end, API application written in Python +- Azure Container Registry - to host Docker images of backend, API application + +## Architecture Diagram + +![Solution architecture diagram showing how the Azure services deployed are connected](media/architecture.jpg) + +Once the Azure resources are provisioned, this guide will walk you through everything that is necessary to build the Back-end API application written in Python. + +The Front-end Web App is a static SPA application written in React. Since React is outside the scope of this guide, the Front-end Web App is pre-built for you and will be configured automatically on deployment. You do not need any experience with React in order to complete the labs in this guide. + +## Lab - Provision Azure Resources + +This lab will walk you through deploying the Azure resources necessary for the solution built in this guide. The deployment will be done using an Azure Bicep template that is configured to provision all the necessary resources. + +> **Note**: You will need an Azure Subscription and have the necessary permissions to provision the Azure resources. + +Please visit the lab repository to complete [this lab](../Labs/deploy/deploy.md). diff --git a/diskann/06_Provision_Azure_Resources/media/architecture.jpg b/diskann/06_Provision_Azure_Resources/media/architecture.jpg new file mode 100644 index 0000000..cb1133f Binary files /dev/null and b/diskann/06_Provision_Azure_Resources/media/architecture.jpg differ diff --git a/diskann/07_Create_First_Cosmos_DB_Project/README.md b/diskann/07_Create_First_Cosmos_DB_Project/README.md new file mode 100644 index 0000000..ec6186a --- /dev/null +++ b/diskann/07_Create_First_Cosmos_DB_Project/README.md @@ -0,0 +1,94 @@ +# Create your first Azure Cosmos DB project + +This section will cover how to create your first Azure Cosmos DB project. We'll use a notebook to demonstrate the basic CRUD operations. We'll also cover using the Azure Cosmos DB Emulator to test code locally. + +## Emulator support + +Azure Cosmos DB has an emulator that can be used to develop code locally. 
The emulator supports the API for NoSQL and the API for MongoDB. The use of the emulator does not require an Azure subscription, nor does it incur any costs, so it is ideal for local development and testing. The Azure Cosmos DB emulator can also be utilized with unit tests in a [GitHub Actions CI workflow](https://learn.microsoft.com/azure/cosmos-db/how-to-develop-emulator?tabs=windows%2Cpython&pivots=api-nosql#use-the-emulator-in-a-github-actions-ci-workflow). + +There is not 100% feature parity between the emulator and the cloud service. Visit the [Azure Cosmos DB emulator](https://learn.microsoft.com/azure/cosmos-db/emulator) documentation for more details. + +For Windows machines, the emulator can be installed via an installer or by using a Docker container. A Docker image is also available for Linux-based machines. + +Learn more about the pre-requisites and installation of the emulator [here](https://learn.microsoft.com/azure/cosmos-db/how-to-develop-emulator?tabs=windows%2Cpython&pivots=api-nosql). + +**The Azure Cosmos DB emulator does not support vector search. To complete the vector search and AI-related labs, a Azure Cosmos DB for NoSQL account in Azure is required.** + +## Authentication + +Authentication to Azure Cosmos DB for NoSQL uses a connection string. The connection string is a URL that contains the authentication information for the Azure Cosmos DB account or local emulator. + +### Retrieving the connection string from the Azure Cosmos DB Emulator + +The splash screen or **Quickstart** section of the Azure Cosmos DB Emulator will display the connection string. Access this screen through the following URL: `https://localhost:8081/_explorer/index.html`. + +![The Azure Cosmos DB emulator screen displays with the local host url, the Quickstart tab, and the connection string highlighted.](media/emulator_connection_string.png) + +### Retrieving the connection string from the Azure portal + +Retrieve the connection string from the Azure portal by navigating to the Azure Cosmos DB account and expanding the **Settings** menu item on the left-hand side of the screen. Locate the **Primary Connection String**, and select the icon to make it visible. Copy the connection string and paste it in notepad for future reference. + +![The Azure Cosmos DB for NoSQL Connection strings screen displays with the copy button next to the connection string highlighted.](media/azure_connection_string.png) + +## Lab - Create your first Azure Cosmos DB for the NoSQL application + +Using a notebook, we'll create an Azure Cosmos DB for the NoSQL application in this lab using the **azure-cosmos** library and the Python language. Both the Azure Cosmos DB Emulator and Azure Cosmos DB account in Azure are supported for completion of this lab. + +>**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. + +Please visit the lab repository to complete [this lab](../Labs/lab_1_first_application.ipynb). + +The following concepts are covered in detail in this lab: + +### Creating a database client + +The `azure-cosmos` library is used to create an Azure Cosmos DB for NoSQL database client. The client enables both DDL (data definition language) and DML (data manipulation language) operations. + +```python +# Initialize the Azure Cosmos DB client +client = CosmosClient.from_connection_string(CONNECTION_STRING) +``` + +### Creating a database + +The `create_database_if_not_exists` method is used to create a database. 
If the database already exists, the method will retrieve the existing database.

```python
db: DatabaseProxy = client.create_database_if_not_exists(database_name)
```

### Creating a container

The `create_container_if_not_exists` method is used to create a container. If the container already exists, the method will retrieve the existing container.

```python
container: ContainerProxy = db.create_container_if_not_exists(
    id="product",
    partition_key={"paths": ["/categoryId"], "kind": "Hash"}
)
```

### Creating or Updating a document (Upsert)

One method of creating a document is using the `create_item` method. This method takes a single document and inserts it into the database; if the item already exists in the container, an exception is thrown. Alternatively, the `upsert_item` method can also be used to insert a document into the database, and in this case, if the document already exists, it will be updated.

```python
# Create or update (upsert) a document
container.upsert_item(product_dict)
```

### Reading documents

The `read_item` method can be used to retrieve a single document if both the `id` value and `partition_key` value are known. Otherwise, the `query_items` method can be used to retrieve a list of documents using a [SQL-like query](https://learn.microsoft.com/azure/cosmos-db/nosql/tutorial-query).

```python
items = container.query_items(query="SELECT * FROM prod", enable_cross_partition_query=True)
```

### Deleting a document

The `delete_item` method is used to delete a document from the container.

```python
container.delete_item(item=product.id, partition_key=product.category_id)
```

diff --git a/diskann/07_Create_First_Cosmos_DB_Project/media/azure_connection_string.png b/diskann/07_Create_First_Cosmos_DB_Project/media/azure_connection_string.png new file mode 100644 index 0000000..3948d61 Binary files /dev/null and b/diskann/07_Create_First_Cosmos_DB_Project/media/azure_connection_string.png differ
diff --git a/diskann/07_Create_First_Cosmos_DB_Project/media/emulator_connection_string.png b/diskann/07_Create_First_Cosmos_DB_Project/media/emulator_connection_string.png new file mode 100644 index 0000000..e2ef454 Binary files /dev/null and b/diskann/07_Create_First_Cosmos_DB_Project/media/emulator_connection_string.png differ
diff --git a/diskann/08_Load_Data/README.md b/diskann/08_Load_Data/README.md new file mode 100644 index 0000000..04a976a --- /dev/null +++ b/diskann/08_Load_Data/README.md @@ -0,0 +1,11 @@

# Load data into Azure Cosmos DB for NoSQL

To support future labs and exercises, the Cosmic Works data must be loaded into Azure Cosmos DB for NoSQL containers. This lab will demonstrate how to load the Cosmic Works Customer, Product, and Sales data into Azure Cosmos DB for NoSQL containers.

## Lab - Load data into Azure Cosmos DB for NoSQL containers

This lab will load the Cosmic Works Customer, Product, and Sales data into Azure Cosmos DB for NoSQL containers using bulk operations.

>**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs.

Please visit the lab repository to complete [this lab](../Labs/lab_2_load_data.ipynb).
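
As a rough sketch of the kind of load the lab performs (the lab notebook is the authoritative version and uses bulk operations), the following assumes a `container` client created as shown in the previous section and a hypothetical local `products.json` file containing an array of product documents:

```python
import json

# `products.json` is a hypothetical local file containing a JSON array of
# product documents with `id` and `categoryId` properties.
with open("products.json", "r", encoding="utf-8") as f:
    products = json.load(f)

# Upsert each document; upsert keeps the load re-runnable because existing
# documents are updated rather than raising a conflict error.
for product in products:
    container.upsert_item(product)

print(f"Loaded {len(products)} products")
```
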
diff --git a/diskann/09_Vector_Search_Cosmos_DB/README.md b/diskann/09_Vector_Search_Cosmos_DB/README.md new file mode 100644 index 0000000..ad7795d --- /dev/null +++ b/diskann/09_Vector_Search_Cosmos_DB/README.md @@ -0,0 +1,277 @@ +# Use vector search on embeddings in Azure Cosmos DB for NoSQL + +## Azure Cosmos DB indexing + +Azure Cosmos DB automatically indexes all properties for all items in a container. However, the creation of additional indexes can improve performance and add functionality such as spatial querying and vector search. + +The following indexes are supported by Azure Cosmos DB: + +The **Range Index** supports efficient execution of queries involving numerical and string data types. It is optimized for inequality comparisons (<, <=, >, >=) and sorting operations. Range indexes are particularly useful for time-series data, financial applications, and any scenario that requires filtering or sorting over a numeric range or alphabetically ordered strings. + +The **Spatial Index** excels with geospatial data types such as points, lines, and polygons. Spatial queries include operations such as finding intersections, conducting proximity searches, and handling bounding-box queries. Spatial indexes are crucial for applications that require geographic information system (GIS) capabilities, location-based services, and asset tracking. + +The **Composite Index** combines multiple properties into a single entry, optimizing complex queries that use multiple properties for filtering and sorting. They significantly improve the performance of multidimensional queries by reducing the number of request units (RUs) consumed during these operations. + +The **Vector Index** is specialized for high-dimensional vector data. Use cases include similarity searches, recommendation systems, and any other application requiring efficient handling of high-dimensional vectors. This index type optimizes the storage and retrieval of vectors typically utilized in AI application patterns such as RAG (Retrieval Augmented Generation). + +Learn more about indexing in the [Azure documentation](https://learn.microsoft.com/azure/cosmos-db/index-overview) + +## Embeddings and vector search + +Embedding is a way of serializing the semantic meaning of data into a vector representation. Because the generated vector embedding represents the semantic meaning, it means that when it is searched, it can find similar data based on the semantic meaning of the data rather than exact text. Data can come from many sources, including text, images, audio, and video. Because the data is represented as a vector, vector search can, therefore, find similar data across all different types of data. + +Embeddings are created by sending data to an embedding model, where it is transformed into a vector, which then can be stored as a vector field within its source document in Azure Cosmos DB for NoSQL. Azure Cosmos DB for NoSQL supports the creation of vector search indexes on top of these vector fields. A vector search index is a container of vectors in [latent space](https://idl.cs.washington.edu/papers/latent-space-cartography/) that enables a semantic similarity search across all data (vectors) contained within. + +![A typical embedding pipeline that demonstrates how source data is transformed into vectors using an embedding model then stored in a document in an Azure Cosmos DB container and exposed via a vector search index.](media/embedding_pipeline.png) + +## Why vector search? 
+ +Vector search is an important RAG (Retrieval Augmented Generation) pattern component. Large Language Model (LLM) data is trained on a snapshot of public data at a point in time. This data does not contain recent public information, nor does it collect private, corporate information. LLMs are also very broad in their knowledge, and including information from a RAG process can help it focus accurately on a specific domain. + +A vector index search allows for a prompt pre-processing step where information can be semantically retrieved from an index and then used to generate a factually accurate prompt for the LLM to reason over. This provides the knowledge augmentation and focus (attention) to the LLM. + +In this example, assume textual data is vectorized and stored within an Azure Cosmos DB for NoSQL database. The text data and embeddings/vector field are stored in the same document. A vector search index has been created on the vector field. When a message is received from a chat application, this message is also vectorized using the same embedding model (ex., Azure OpenAI text-embedding-ada-002), which is then used as input to the vector search index. The vector search index returns a list of documents whose vector field is semantically similar to the incoming message. The unvectorized text stored within the same document is then used to augment the LLM prompt. The LLM receives the prompt and responds to the requestor based on the information it has been given. + +![A typical vector search request in a RAG scenario depicts an incoming message getting vectorized and used as input to a vector store index search. Multiple results of the vector search are used to build a prompt fed to the LLM. The LLM returns a response to the requestor.](media/vector_search_flow.png) + +## Why use Azure Cosmos DB for NoSQL as a vector store? + +It is common practice to store vectorized data in a dedicated vector store as vector search indexing is not a common capability of most databases. However, this introduces additional complexity to the solution as the data must be stored in two different locations. Azure Cosmos DB for NoSQL supports vector search indexing, which means that the vectorized data can be stored in the same document as the operational NoSQL data. This reduces the complexity of the solution and allows for a single database to be used for both the vector store and the operational NoSQL data. + +Azure Cosmos DB offers the ability to run serverless workloads, allowing for cost-effective, on-demand scaling for applications that don't require constant high performance. As your workloads grow, you can seamlessly transition to provisioned throughput, unlocking advanced capabilities such as low latency and high availability. This ability to scale both up and down means you can optimize for both performance and cost-efficiency, ensuring that vector search operations meet the demands of your application without compromising on response times or availability. + +## Vector index options in Azure Cosmos DB for NoSQL + +Vector indexes are built for enabling efficient similarity searches on vector data stored in Azure Cosmos DB for NoSQL. Azure Cosmos DB for NoSQL supports three types of vector indexes: flat, quantizedFlat, and diskANN. Each index type offers a different balance of accuracy, performance, and resource efficiency, allowing developers to choose the best index type based on their application requirements. 

### Flat Index

The **flat index** in Azure Cosmos DB stores vectors alongside other indexed properties and performs searches using a brute-force method. This index guarantees 100% accuracy, meaning it always finds the most similar vectors within the dataset. However, it supports a maximum of 505 dimensions, which may limit its use in scenarios requiring high-dimensional vector searches. The **flat index** is ideal for applications where the highest accuracy is paramount, and the number of vector dimensions is relatively low.

### Quantized Flat Index

The **quantizedFlat** index offers an optimized approach by compressing vectors before storing them, trading a small amount of accuracy for improved performance. Supporting up to 4096 dimensions, this index type provides lower latency and higher throughput compared to the flat index, along with reduced RU consumption. Although the compression may result in slightly less than 100% accuracy, the **quantizedFlat** index is excellent for use cases where query filters narrow down the vector search set, and a balance between performance and high accuracy is desired.

### DiskANN Index

The **diskANN index** leverages advanced vector indexing algorithms developed by Microsoft Research to create an efficient approximate nearest neighbors (ANN) index. This index supports up to 4096 dimensions and offers some of the lowest latency, highest throughput, and lowest RU costs among index types. While the approximate nature of **diskANN** may lead to a slight reduction in accuracy compared to flat and quantizedFlat indexes, it excels in scenarios requiring high-performance, large-scale vector searches with a focus on speed and resource efficiency.

>**Note**: The **quantizedFlat** and **diskANN** indexes require at least 1,000 vectors to be inserted. This ensures the accuracy of the quantization process. If there are fewer than 1,000 vectors, a full scan is executed instead, which leads to higher RU charges for a vector search query. In this lab, DiskANN is used as the index of choice, but keep in mind that it is not taking advantage of the full capabilities of the index.

## Enabling vector search in Azure Cosmos DB for NoSQL

The Azure Cosmos DB account that has been deployed with this guide is already enabled with the capability for vector search. For reference, the following Azure CLI command can be used to enable vector search on an existing Azure Cosmos DB account (substitute your own resource group and account names):

```bash
az cosmosdb update --resource-group <resource-group-name> --name <account-name> --capabilities EnableNoSQLVectorSearch
```

### The vector search profile and vector indexing policy

To perform vector searches with Azure Cosmos DB for NoSQL, the creation and assignment of a vector policy and vector index policy for the container is required. This vector policy provides the necessary information for the database engine to efficiently conduct similarity searches on vectors stored within the container's documents. Additionally, the vector policy informs the vector indexing policy if you choose to specify one. A vector policy includes the following information:

- **Path**: Specifies the property that contains the vector. This field is required.
- **Datatype**: Defines the data type of the vector property. The default data type is `Float32`.
- **Dimensions**: Determines the dimensionality or length of each vector in the specified path. All vectors within the path must have the same number of dimensions, with the default being 1536.
+- **Distance Function**: Describes the metric used to compute distance or similarity between vectors. Supported functions are: **cosine**, **dotproduct**, and **euclidean**. The default distance function is `cosine`. + +The following is an example of a vector policy. + +```json +{ + "vectorEmbeddings": [ + { + "path":"/vector1", + "dataType":"float32", + "distanceFunction":"cosine", + "dimensions":1536 + } + ] +} +``` + +The vector indexing policy identifies the vector index type to be used for the index. + +The following is an example of a vector indexing policy. + +```json +{ + "indexingMode": "consistent", + "automatic": true, + "includedPaths": [ + { + "path": "/*" + } + ], + "excludedPaths": [ + { + "path": "/_etag/?" + }, + { + "path": "/vector1/*" + } + ], + "vectorIndexes": [ + { + "path": "/vector1", + "type": "DiskANN" + } + ] +} +``` + +## Lab - Use vector search on embeddings in Azure Cosmos DB for NoSQL + +In this lab, a notebook demonstrates how to add an embedding field to a document, create a container capable of vector search, and perform a vector search query. The notebook ends with a demonstration of utilizing vector search with an LLM in a RAG scenario using Azure OpenAI. + +This lab requires the Azure OpenAI endpoint and access key to be added to the settings (`.env`) file. Access this information by opening [Azure OpenAI Studio](https://oai.azure.com/portal) and selecting the **Gear**/Settings icon located to the right in the top toolbar. + +![Azure OpenAI Studio displays with the Gear icon highlighted in the top toolbar.](media/azure_openai_studio_settings_icon.png) + +On the **Settings** screen, select the **Resource** tab, then copy and record the **Endpoint** and **Key** values for use in the lab. + +![The Azure OpenAI resource settings screen displays with the endpoint and key values highlighted.](media/azure_openai_settings.png) + +>**NOTE**: This lab can only be completed using a deployed Azure Cosmos DB for NoSQL account due to the use of vector search. The Azure Cosmos DB Emulator does not support vector search. + +This lab also requires the data provided in the previous lab titled [Load data into Azure Cosmos DB for NoSQL containers](../08_Load_Data/README.md#lab---load-data-into-azure-cosmos-db-api-for-nosql-containers). Run all cells in this notebook to prepare the data for use in this lab. + +>**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. + +Please visit the lab repository to complete [this lab](../Labs/lab_3_cosmosdb_vector_search.ipynb). + +Some highlights from the lab include: + +### Instantiating an AzureOpenAI client + +```python +# Instantiate an AzureOpenAI client +ai_client = AzureOpenAI( + azure_endpoint = AOAI_ENDPOINT, + api_version = AOAI_API_VERSION, + api_key = AOAI_KEY + ) +``` + +### Vectorizing text using Azure OpenAI + +```python +# Generate embedding vectors from a text string +def generate_embeddings(text: str): + ''' + Generate embeddings from string of text using the deployed Azure OpenAI API embeddings model. + This will be used to vectorize document data and incoming user messages for a similarity search with the vector index. 
+ ''' + response = ai_client.embeddings.create(input=text, model=EMBEDDINGS_DEPLOYMENT_NAME) + embeddings = response.data[0].embedding + time.sleep(0.5) # rest period to avoid rate limiting on AOAI for free tier + return embeddings +``` + +### Adding a vector embedding profile and vector indexing profile to a container + +The lab creates an embedding field named `contentVector` in each container and populates the value with the vectorized text of the JSON representation of the document. Currently vector search in Azure Cosmos DB for NoSQL is supported on new containers only. To configure both the container vector policy and any vector indexing policy needs to be done at the time of container creation as it can’t be modified later. Both policies will be modifiable in a future improvement to the preview feature. + +```python +# Create the vector embedding policy +vector_embedding_policy = { + "vectorEmbeddings": [ + { + "path": "/contentVector", + "dataType": "float32", + "distanceFunction": "cosine", + "dimensions": 1536 + } + ] +} + +# Create the indexing policy +indexing_policy = { + "indexingMode": "consistent", + "automatic": True, + "includedPaths": [ + { + "path": "/*" + } + ], + "excludedPaths": [ + { + "path": "/\"_etag\"/?" + }, + { + "path": "/contentVector/*" + } + ], + "vectorIndexes": [ + { + "path": "/contentVector", + "type": "diskANN" + } + ] +} + +product_v_container = db.create_container_if_not_exists( + id="product_v", + partition_key=PartitionKey(path="/categoryId"), + indexing_policy=indexing_policy, + vector_embedding_policy=vector_embedding_policy +) +``` + +### Performing a vector search query + +```python +def vector_search( + container: ContainerProxy, + prompt: str, + vector_field_name:str="contentVector", + num_results:int=5): + query_embedding = generate_embeddings(prompt) + items = container.query_items( + query=f"""SELECT TOP @num_results itm.id, VectorDistance(itm.{vector_field_name}, @embedding) AS SimilarityScore + FROM itm + ORDER BY VectorDistance(itm.{vector_field_name}, @embedding) + """, + parameters = [ + { "name": "@num_results", "value": num_results }, + { "name": "@embedding", "value": query_embedding } + ], + enable_cross_partition_query=True + ) + return items +``` + +### Using vector search results with an LLM in a RAG scenario + +```python +def rag_with_vector_search( + container: ContainerProxy, + prompt: str, + vector_field_name:str="contentVector", + num_results:int=5): + """ + Use the RAG model to generate a prompt using vector search results based on the + incoming question. 
+ """ + # perform the vector search and build product list + results = vector_search(container, prompt, vector_field_name, num_results) + product_list = "" + for result in results: + # retrieve the product details + product = query_item_by_id(container, result["id"], Product) + # remove the contentVector field from the product details, this isn't needed for the context + product.content_vector = None + product_list += json.dumps(product, indent=4, default=str) + "\n\n" + + # generate prompt for the LLM with vector results + formatted_prompt = system_prompt + product_list + + # prepare the LLM request + messages = [ + {"role": "system", "content": formatted_prompt}, + {"role": "user", "content": prompt} + ] + + completion = ai_client.chat.completions.create(messages=messages, model=COMPLETIONS_DEPLOYMENT_NAME) + return completion.choices[0].message.content +``` diff --git a/09_Vector_Search_Cosmos_DB/media/azure_openai_settings.png b/diskann/09_Vector_Search_Cosmos_DB/media/azure_openai_settings.png similarity index 100% rename from 09_Vector_Search_Cosmos_DB/media/azure_openai_settings.png rename to diskann/09_Vector_Search_Cosmos_DB/media/azure_openai_settings.png diff --git a/09_Vector_Search_Cosmos_DB/media/azure_openai_studio_settings_icon.png b/diskann/09_Vector_Search_Cosmos_DB/media/azure_openai_studio_settings_icon.png similarity index 100% rename from 09_Vector_Search_Cosmos_DB/media/azure_openai_studio_settings_icon.png rename to diskann/09_Vector_Search_Cosmos_DB/media/azure_openai_studio_settings_icon.png diff --git a/diskann/09_Vector_Search_Cosmos_DB/media/embedding_pipeline.png b/diskann/09_Vector_Search_Cosmos_DB/media/embedding_pipeline.png new file mode 100644 index 0000000..7884209 Binary files /dev/null and b/diskann/09_Vector_Search_Cosmos_DB/media/embedding_pipeline.png differ diff --git a/09_Vector_Search_Cosmos_DB/media/vector_search_flow.png b/diskann/09_Vector_Search_Cosmos_DB/media/vector_search_flow.png similarity index 100% rename from 09_Vector_Search_Cosmos_DB/media/vector_search_flow.png rename to diskann/09_Vector_Search_Cosmos_DB/media/vector_search_flow.png diff --git a/diskann/10_LangChain/README.md b/diskann/10_LangChain/README.md new file mode 100644 index 0000000..5553bf5 --- /dev/null +++ b/diskann/10_LangChain/README.md @@ -0,0 +1,212 @@ +# LangChain + +[LangChain](https://www.langchain.com/) is an open-source framework designed to simplify the creation of applications that use large language models (LLMs). LangChain has a vibrant community of developers and contributors and is used by many companies and organizations. LangChain utilizes proven Prompt Engineering patterns and techniques to optimize LLMs, ensuring successful and accurate results through verified and tested best practices. + +Part of the appeal of LangChain syntax is the capability of breaking down large complex interactions with LLMs into smaller, more manageable steps by composing a reusable chain process. LangChain provides a syntax for chains([LCEL](https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel)), the ability to integrate with external systems through [tools](https://python.langchain.com/docs/concepts/#tools), and end-to-end [agents](https://python.langchain.com/docs/concepts/#agents) for common applications. + +The concept of an agent is quite similar to that of a chain in LangChain but with one fundamental difference. A chain in LangChain is a hard-coded sequence of steps executed in a specific order. 
Conversely, an agent leverages the LLM to assess the incoming request with the current context to decide what steps or actions need to be executed and in what order. + +LangChain agents can leverage tools and toolkits. A tool can be an integration into an external system, custom code, a retriever, or even another chain. A toolkit is a collection of tools that can be used to solve a specific problem. + +## LangChain RAG pattern + +Earlier in this guide, the RAG (Retrieval Augmented Generation) pattern was introduced. In LangChain, the RAG pattern is implemented as part of a chain that combines a retriever and a Large Language Model (generator). The retriever is responsible for finding the most relevant documents for a given query, in this case, doing a vector search on Azure Cosmos DB for NoSQL, and the LLM (generator) is responsible for reasoning over the incoming prompt and context. + +![LangChain RAG diagram shows the flow of an incoming message through a retriever, augmenting the prompt, parsing the output and returning the final message.](media/langchain_rag.png) + +When an incoming message is received, the retriever will vectorize the message and perform a vector search to find the most relevant documents for the given query. The retriever returns a list of documents that are then used to augment the prompt. The augmented prompt is then passed to the LLM (generator) to reason over the prompt and context. The output from the LLM is then parsed and returned as the final message. + +> **Note**: A vector store retriever is only one type of retriever that can be used in the RAG pattern. Learn more about retrievers in the [LangChain documentation](https://python.langchain.com/docs/concepts/#retrievers). + +## Lab - Vector search and RAG using LangChain + +In this lab uses LangChain to re-implement the RAG pattern introduced in the previous lab. Take note of the readability of the code and how easy it is to compose a reusable RAG chain using LangChain that queries the products vector index in Azure Cosmos DB for NoSQL. The lab concludes with the creation of an agent with various tools for the LLM to leverage to fulfill the incoming request. + +This lab also requires the data provided in the previous lab titled [Load data into Azure Cosmos DB for NoSQL containers](../08_Load_Data/README.md#lab---load-data-into-azure-cosmos-db-api-for-nosql-containers) as well as the populated vector index created in the lab titled [Vector Search using Azure Cosmos DB for NoSQL](../09_Vector_Search_Cosmos_DB/README.md#lab---use-vector-search-on-embeddings-in-azure-cosmos-db-for-nosql). Run all cells in both notebooks to prepare the data for use in this lab. + +>**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. + +Please visit the lab repository to complete [this lab](../Labs/lab_4_langchain.ipynb). + +Some highlights of the lab include: + +### Creating a custom LangChain retriever for Azure Cosmos DB for NoSQL + +```python +class AzureCosmosDBNoSQLRetriever(BaseRetriever): + """ + A custom LangChain retriever that uses Azure Cosmos DB NoSQL database for vector search. + """ + embedding_model: AzureOpenAIEmbeddings + container: ContainerProxy + model: Type[T] + vector_field_name: str + num_results: int=5 + + def __get_embeddings(self, text: str) -> List[float]: + """ + Returns embeddings vector for a given text. 
+ """ + embedding = embedding_model.embed_query(text) + time.sleep(0.5) # rest period to avoid rate limiting on AOAI + return embedding + + def __get_item_by_id(self, id) -> T: + """ + Retrieves a single item from the Azure Cosmos DB NoSQL database by its ID. + """ + query = "SELECT * FROM itm WHERE itm.id = @id" + parameters = [ + {"name": "@id", "value": id} + ] + item = list(self.container.query_items( + query=query, + parameters=parameters, + enable_cross_partition_query=True + ))[0] + return self.model(**item) + + def __delete_attribute_by_alias(self, instance: BaseModel, alias): + for model_field in instance.model_fields: + field = instance.model_fields[model_field] + if field.alias == alias: + delattr(instance, model_field) + return + + def _get_relevant_documents( + self, query: str, *, run_manager: CallbackManagerForRetrieverRun + ) -> List[Document]: + """ + Performs a synchronous vector search on the Azure Cosmos DB NoSQL database. + """ + embedding = self.__get_embeddings(query) + items = self.container.query_items( + query=f"""SELECT TOP @num_results itm.id, VectorDistance(itm.{self.vector_field_name}, @embedding) AS SimilarityScore + FROM itm + ORDER BY VectorDistance(itm.{self.vector_field_name}, @embedding) + """, + parameters = [ + { "name": "@num_results", "value": self.num_results }, + { "name": "@embedding", "value": embedding } + ], + enable_cross_partition_query=True + ) + returned_docs = [] + for item in items: + itm = self.__get_item_by_id(item["id"]) + # Remove the vector field from the returned item so it doesn't fill the context window + self.__delete_attribute_by_alias(itm, self.vector_field_name) + returned_docs.append(Document(page_content=json.dumps(itm, indent=4, default=str), metadata={"similarity_score": item["SimilarityScore"]})) + return returned_docs + + async def _aget_relevant_documents( + self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun + ) -> List[Document]: + """ + Performs an asynchronous vector search on the Azure Cosmos DB NoSQL database. + """ + raise Exception(f"Asynchronous search not implemented.") +``` + +### Composing a reusable RAG chain + +```python +# Create an instance of the AzureCosmosDBNoSQLRetriever +products_retriever = AzureCosmosDBNoSQLRetriever( + embedding_model = embedding_model, + container = product_v_container, + model = Product, + vector_field_name = "contentVector", + num_results = 5 +) + +# Create the prompt template from the system_prompt text +llm_prompt = PromptTemplate.from_template(system_prompt) + +rag_chain = ( + # populate the tokens/placeholders in the llm_prompt + # question is a passthrough that takes the incoming question + { "products": products_retriever, "question": RunnablePassthrough()} + | llm_prompt + # pass the populated prompt to the language model + | llm + # return the string ouptut from the language model + | StrOutputParser() +) +``` + +### Creating tools for LangChain agents to use + +Tools are selected by the Large Language model at runtime. In this case, depending on the incoming user request the LLM will decide which container in the database to query. The following code shows how to create a tool for the LLM to use to query the products collection in the database. 
+ +```python +# Create a tool that will use the product vector search in Azure Cosmos DB for NoSQL +products_retriever_tool = create_retriever_tool( + retriever = products_retriever, + name = "vector_search_products", + description = "Searches Cosmic Works product information for similar products based on the question. Returns the product information in JSON format." +) +tools = [products_retriever_tool] +``` + +### Creating tools that call Python functions + +Users may query for information that does not have a semantic meaning, such as an ID GUID value or a SKU number. Providing agents with tools to call Python functions to retrieve documents based on these fields is a common practice. The following is an example of adding tools that call out to Python functions for the products collection. + +```python +def get_product_by_id(product_id: str) -> str: + """ + Retrieves a product by its ID. + """ + item = get_single_item_by_field_name(product_v_container, "id", product_id, Product) + delete_attribute_by_alias(item, "contentVector") + return json.dumps(item, indent=4, default=str) + +def get_product_by_sku(sku: str) -> str: + """ + Retrieves a product by its sku. + """ + item = get_single_item_by_field_name(product_v_container, "sku", sku, Product) + delete_attribute_by_alias(item, "contentVector") + return json.dumps(item, indent=4, default=str) + +def get_sales_by_id(sales_id: str) -> str: + """ + Retrieves a sales order by its ID. + """ + item = get_single_item_by_field_name(sales_order_container, "id", sales_id, SalesOrder) + return json.dumps(item, indent=4, default=str) + +tools.extend([ + StructuredTool.from_function(get_product_by_id), + StructuredTool.from_function(get_product_by_sku), + StructuredTool.from_function(get_sales_by_id) +]) +``` + +### Creating an agent armed with tools for vector search and Python functions calling + +```python +agent_instructions = """ + Your name is "Willie". You are an AI assistant for the Cosmic Works bike store. You help people find production information for bikes and accessories. Your demeanor is friendly, playful with lots of energy. + Do not include citations or citation numbers in your responses. Do not include emojis. + You are designed to answer questions about the products that Cosmic Works sells, the customers that buy them, and the sales orders that are placed by customers. + If you don't know the answer to a question, respond with "I don't know." + Only answer questions related to Cosmic Works products, customers, and sales orders. 
+ If a question is not related to Cosmic Works products, customers, or sales orders, + respond with "I only answer questions about Cosmic Works" + """ + +from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder + +prompt = ChatPromptTemplate.from_messages( + [ + ("system", agent_instructions), + MessagesPlaceholder("chat_history", optional=True), + ("human", "{input}"), + MessagesPlaceholder("agent_scratchpad"), + ] +) +agent = create_openai_functions_agent(llm, tools, prompt) +agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True) +``` diff --git a/10_LangChain/media/langchain_rag.png b/diskann/10_LangChain/media/langchain_rag.png similarity index 100% rename from 10_LangChain/media/langchain_rag.png rename to diskann/10_LangChain/media/langchain_rag.png diff --git a/diskann/11_Backend_API/README.md b/diskann/11_Backend_API/README.md new file mode 100644 index 0000000..2f4fbd5 --- /dev/null +++ b/diskann/11_Backend_API/README.md @@ -0,0 +1,11 @@ +# Lab - Backend API + +In the previous lab, a LangChain agent was created armed with tools to do vector lookups and concrete document id lookups via function calling. In this lab, the agent functionality needs to be extracted into a backend api for the frontend application that will allow users to interact with the agent. + +This lab implements a backend API using FastAPI that exposes the LangChain agent functionality. The provided code leverages Docker containers and includes full step-by-step instructions to run and test the API locally as well as deployed to [Azure Container Apps](https://learn.microsoft.com/azure/container-apps/overview) (leveraging the [Azure Container Registry](https://learn.microsoft.com/azure/container-registry/)). + +This lab also requires the data provided in the previous lab titled [Load data into Azure Cosmos DB API for NoSQL containers](../08_Load_Data/README.md#lab---load-data-into-azure-cosmos-db-for-nosql-containers) as well as the populated vector index created in the lab titled [Vector Search using Azure Cosmos DB for NoSQL](../09_Vector_Search_Cosmos_DB/README.md#lab---use-vector-search-on-embeddings-in-azure-cosmos-db-for-nosql). Run all cells in both notebooks to prepare the data for use in this lab. + +>**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. + +Please visit the lab repository to complete [this lab](../Labs/lab_5_backend_api.md). 
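
For a sense of what such an API looks like, here is an illustrative FastAPI sketch only (the lab provides the complete, working implementation); it assumes an `agent_executor` built as in the previous lab, and the endpoint path and payload shape are placeholders:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    """Incoming chat message from the front-end application."""
    message: str

@app.post("/ai")
def ask_agent(request: ChatRequest) -> dict:
    # `agent_executor` is assumed to be the LangChain AgentExecutor created in
    # the previous lab; the route and response shape here are illustrative.
    result = agent_executor.invoke({"input": request.message})
    return {"message": result["output"]}
```
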
diff --git a/11_Backend_API/media/2024-01-06-20-01-38.png b/diskann/11_Backend_API/media/2024-01-06-20-01-38.png similarity index 100% rename from 11_Backend_API/media/2024-01-06-20-01-38.png rename to diskann/11_Backend_API/media/2024-01-06-20-01-38.png diff --git a/11_Backend_API/media/acr_access_keys.png b/diskann/11_Backend_API/media/acr_access_keys.png similarity index 100% rename from 11_Backend_API/media/acr_access_keys.png rename to diskann/11_Backend_API/media/acr_access_keys.png diff --git a/11_Backend_API/media/container_app_delete_hello_world.png b/diskann/11_Backend_API/media/container_app_delete_hello_world.png similarity index 100% rename from 11_Backend_API/media/container_app_delete_hello_world.png rename to diskann/11_Backend_API/media/container_app_delete_hello_world.png diff --git a/11_Backend_API/media/container_app_edit_and_deploy.png b/diskann/11_Backend_API/media/container_app_edit_and_deploy.png similarity index 100% rename from 11_Backend_API/media/container_app_edit_and_deploy.png rename to diskann/11_Backend_API/media/container_app_edit_and_deploy.png diff --git a/11_Backend_API/media/container_app_failed_revision.png b/diskann/11_Backend_API/media/container_app_failed_revision.png similarity index 100% rename from 11_Backend_API/media/container_app_failed_revision.png rename to diskann/11_Backend_API/media/container_app_failed_revision.png diff --git a/11_Backend_API/media/container_app_log_stream.png b/diskann/11_Backend_API/media/container_app_log_stream.png similarity index 100% rename from 11_Backend_API/media/container_app_log_stream.png rename to diskann/11_Backend_API/media/container_app_log_stream.png diff --git a/11_Backend_API/media/container_app_overview.png b/diskann/11_Backend_API/media/container_app_overview.png similarity index 100% rename from 11_Backend_API/media/container_app_overview.png rename to diskann/11_Backend_API/media/container_app_overview.png diff --git a/11_Backend_API/media/container_app_ready.png b/diskann/11_Backend_API/media/container_app_ready.png similarity index 100% rename from 11_Backend_API/media/container_app_ready.png rename to diskann/11_Backend_API/media/container_app_ready.png diff --git a/11_Backend_API/media/container_deploy.png b/diskann/11_Backend_API/media/container_deploy.png similarity index 100% rename from 11_Backend_API/media/container_deploy.png rename to diskann/11_Backend_API/media/container_deploy.png diff --git a/11_Backend_API/media/local_backend_docker_build.png b/diskann/11_Backend_API/media/local_backend_docker_build.png similarity index 100% rename from 11_Backend_API/media/local_backend_docker_build.png rename to diskann/11_Backend_API/media/local_backend_docker_build.png diff --git a/11_Backend_API/media/local_backend_docker_push.png b/diskann/11_Backend_API/media/local_backend_docker_push.png similarity index 100% rename from 11_Backend_API/media/local_backend_docker_push.png rename to diskann/11_Backend_API/media/local_backend_docker_push.png diff --git a/11_Backend_API/media/local_backend_docker_run.png b/diskann/11_Backend_API/media/local_backend_docker_run.png similarity index 100% rename from 11_Backend_API/media/local_backend_docker_run.png rename to diskann/11_Backend_API/media/local_backend_docker_run.png diff --git a/11_Backend_API/media/local_backend_running_console.png b/diskann/11_Backend_API/media/local_backend_running_console.png similarity index 100% rename from 11_Backend_API/media/local_backend_running_console.png rename to 
diskann/11_Backend_API/media/local_backend_running_console.png
diff --git a/11_Backend_API/media/local_backend_swagger_ui.png b/diskann/11_Backend_API/media/local_backend_swagger_ui.png similarity index 100% rename from 11_Backend_API/media/local_backend_swagger_ui.png rename to diskann/11_Backend_API/media/local_backend_swagger_ui.png
diff --git a/11_Backend_API/media/local_backend_swagger_ui_ai_response.png b/diskann/11_Backend_API/media/local_backend_swagger_ui_ai_response.png similarity index 100% rename from 11_Backend_API/media/local_backend_swagger_ui_ai_response.png rename to diskann/11_Backend_API/media/local_backend_swagger_ui_ai_response.png
diff --git a/11_Backend_API/media/local_backend_swagger_ui_root_response.png b/diskann/11_Backend_API/media/local_backend_swagger_ui_root_response.png similarity index 100% rename from 11_Backend_API/media/local_backend_swagger_ui_root_response.png rename to diskann/11_Backend_API/media/local_backend_swagger_ui_root_response.png
diff --git a/diskann/12_User_Interface/README.md b/diskann/12_User_Interface/README.md new file mode 100644 index 0000000..20fc84e --- /dev/null +++ b/diskann/12_User_Interface/README.md @@ -0,0 +1,87 @@

# Connect the chat user interface with the chatbot API

In the previous lab, the backend API code was configured and deployed. The backend API integrates Azure Cosmos DB for NoSQL with Azure OpenAI. When the Azure resource template for this lab was run to deploy the necessary Azure resources, a front-end web application written as a SPA (single page application) in React was deployed.

The URL to access this front-end application is found within the Azure Portal on the **Web App** resource with the name that ends with **-web**.

The following screenshot shows where to find the front-end application URL:

![Web App resource for front-end application with Default domain highlighted](images/2024-09-03-12-13-34.png)

Navigating to this URL in the browser accesses the front-end application. Through this front-end application user interface, questions can be submitted to the Azure OpenAI model about the CosmicWorks company data, and it will generate responses accordingly.

![Front-end Web Application User Interface](images/2024-10-15-11-48-17.png)

While the code for the SPA web application is outside the scope of this dev guide, it's worth noting that the Web App is configured with the URL for the Backend API using **App settings** with the variable named `API_ENDPOINT` under the **Settings** -> **Environment variables** pane. When the application was deployed as part of the Azure template deployment, it was automatically configured with this URL to connect the front-end SPA web application to the Backend API.

![Web App resource showing the application settings with the API_ENDPOINT setting highlighted](images/2024-09-03-12-15-57.png)

## Ask questions about data and observe the responses

To ask the AI questions about the CosmicWorks company data, type the questions into the front-end application chat user interface. The web application includes tiles with a couple of example questions to get started. To use these, simply click on a question tile and it will generate an answer.

![Front-end Web Application User Interface](images/2024-10-15-11-48-17.png)

These example questions are:

- What was the price of the product with sku `FR-R92B-58`?
- What is the SKU of HL Road Frame - Black?
- What is HL Road Frame?
+ +> **Note**: It's possible the first time you ask a question within the Front end application there may be an error. Occasionally when the Azure Bicep template deploys the front end application there will be an issue configuring the use of the `API_ENDPOINT` app setting. If this happens, simply navigate to **Deployment** -> **Deployment Center**, then click **Sync** to have the Web App refresh the deployment of the front end app from it's GitHub repository source code. This should fix that error. + +The chat user interface presents as a traditional chat application style interface when asking questions. It also includes a "New chat" button to open new chat sessions, and the list of previous chat sessions on the left side of the UI that enables you to toggle between sessions. + +![Chat user interface screenshot with question and generated answer displayed](images/2024-10-15-11-49-18.png) + +Go ahead, ask the service a few questions about CosmicWorks and observe the responses. + +## What do I do if the responses are incorrect? + +It's important to remember the model is pre-trained with data, given a system message to guide it, in addition to the company data it has access to via Azure Cosmos DB for NoSQL. There are times when the Azure OpenAI model may generate an incorrect response to the prompt given that is either incomplete or even a hallucination (aka includes information that is not correct or accurate). + +There are a few options of how this can be handled when the response is incorrect: + +1. Provide a new prompt that includes more specific and structured information that can help the model generate a more accurate response. +2. Include more data in the library of company information the model has access to. The incorrect response may be a result of data or information about the company that is missing currently. +3. Use Prompt Engineering techniques to enhance the System message and/or Supporting information provided to guide the model. + +While it may be simple to ask the model questions, there are times when Prompt Engineering skills may be necessary to get the most value and reliable responses from the AI model. + +## What happens when I start exceeding my token limits? + +A Token in Azure OpenAI is a basic unit of input and output that the service processes. Generally, the models understand and process text by breaking it down into tokens. + +For example, the word `hamburger` gets broken up into the tokens `ham`, `bur` and `ger`, while a short and common word like `pear` is a single token. Many tokens start with a whitespace, for example ` hello` and ` bye`. + +The total number of tokens processed in a given request depends on the length of the input, output and request parameters. The quantity of tokens being processed will also affect the response latency and throughput for the models. + +> **Note**: The [pricing of the Azure OpenAI](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) service is primarily based on token usage. + +### Exceeding Token Quota Limits + +Azure OpenAI has **tokens per minute** quota limits on the service. This quota limit, is based on the OpenAI model being used and the Azure region it's hosted in. + +> **Note**: The [Azure OpenAI Quotas and limits documentation](https://learn.microsoft.com/azure/ai-services/openai/quotas-limits) contains further information on the specific quotas per OpenAI model and Azure region. 
+ + If an application's usage of an Azure OpenAI model exceeds the token quota limits, then the service will respond with a **Rate Limit Error** (Error code 429). + + When this error is encountered, there are a couple of options available for handling it: + + - **Wait a minute** - Because the token quota is a rate limit on the maximum number of tokens allowed per minute, the application will be able to send more prompts to the model after the quota resets each minute. + - **Request a quota increase** - It may be possible to get Microsoft to increase the token quota to a higher limit, but the request is not guaranteed to be approved. This request can be made at [https://aka.ms/oai/quotaincrease](https://aka.ms/oai/quotaincrease) + + ### Tips to Minimize Token Rate Limit Errors + + Here are a few tips that can help minimize an application's token rate limit errors (a short sketch applying a couple of these tips follows at the end of this page): + + - **Retry Logic** - Implement retry logic in the application so it will retry the call to the Azure OpenAI model, rather than throwing an exception the first time. This is generally a best practice when consuming external APIs so the application can gracefully handle unexpected exceptions. + - **Scale Workload Gradually** - Avoid increasing the workload of the application too quickly; instead, increase the scale of the workload gradually. + - **Asynchronous Load Patterns** - While some time-sensitive operations require an immediate response, other operations can be run more asynchronously. Background processes or similar operations could be built to rate limit the application's own usage of the model, or even to delay calls until periods of the day when the application is under less load. + - **Set `max_tokens`** - When a short response is expected, setting a lower `max_tokens` on the call limits the maximum number of tokens allowed for the generated answer. + - **Set `best_of`** - Setting a lower `best_of` when calling the service enables the application to control how many candidate completions are generated and how many are returned from the service. + + ### Exceeding Token Limit for System message + + When configuring a System message to guide the generated responses, there is a limit on how long the System message can be. The token limit for the System message is 400 tokens. + + If the System message provided is more than 400 tokens, the tokens beyond the first 400 will be ignored.
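A couple of the tips above can be combined in a few lines of code. The sketch below is illustrative only and is not part of the guide's backend: it estimates the prompt's token count with `tiktoken`, caps the response length with `max_tokens`, and retries on rate-limit errors using `tenacity` (both libraries are already listed in the labs' `requirements.txt`). The `AOAI_ENDPOINT` and `AOAI_KEY` variables, the `completions` deployment name, and the API version match the labs in this guide; the `cl100k_base` encoding, the `ask` helper, the sample prompt, and the retry settings are assumptions for illustration.

```python
# Sketch: count prompt tokens up front, keep answers short with max_tokens,
# and retry automatically when the service returns a 429 rate-limit error.
import os

import tiktoken
from dotenv import load_dotenv
from openai import AzureOpenAI, RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_random_exponential

load_dotenv()

client = AzureOpenAI(
    azure_endpoint=os.getenv("AOAI_ENDPOINT"),
    api_key=os.getenv("AOAI_KEY"),
    api_version="2024-06-01",
)

# Assumption: cl100k_base is the encoding used by the gpt-35-turbo family of models.
encoding = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    """Rough token count for a piece of text, to help stay under per-minute quotas."""
    return len(encoding.encode(text))

@retry(
    retry=retry_if_exception_type(RateLimitError),  # only retry 429 rate-limit errors
    wait=wait_random_exponential(min=1, max=60),    # exponential backoff with jitter
    stop=stop_after_attempt(5),                     # give up after 5 attempts
)
def ask(prompt: str) -> str:
    print(f"Prompt is roughly {count_tokens(prompt)} tokens")
    response = client.chat.completions.create(
        model="completions",   # the deployment name used throughout this guide
        max_tokens=200,        # cap the generated answer to conserve quota
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask("Do you sell touring bikes?"))
```

In the guide's backend, the same idea would apply around the LangChain agent call; it is shown here against the plain OpenAI client from the earlier lab to keep the sketch self-contained.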
diff --git a/diskann/12_User_Interface/images/2024-09-03-12-13-34.png b/diskann/12_User_Interface/images/2024-09-03-12-13-34.png new file mode 100644 index 0000000..eaf6266 Binary files /dev/null and b/diskann/12_User_Interface/images/2024-09-03-12-13-34.png differ diff --git a/diskann/12_User_Interface/images/2024-09-03-12-15-57.png b/diskann/12_User_Interface/images/2024-09-03-12-15-57.png new file mode 100644 index 0000000..499a727 Binary files /dev/null and b/diskann/12_User_Interface/images/2024-09-03-12-15-57.png differ diff --git a/diskann/12_User_Interface/images/2024-10-15-11-48-17.png b/diskann/12_User_Interface/images/2024-10-15-11-48-17.png new file mode 100644 index 0000000..7c9d5d6 Binary files /dev/null and b/diskann/12_User_Interface/images/2024-10-15-11-48-17.png differ diff --git a/diskann/12_User_Interface/images/2024-10-15-11-49-18.png b/diskann/12_User_Interface/images/2024-10-15-11-49-18.png new file mode 100644 index 0000000..85e01d3 Binary files /dev/null and b/diskann/12_User_Interface/images/2024-10-15-11-49-18.png differ diff --git a/diskann/13_Conclusion/README.md b/diskann/13_Conclusion/README.md new file mode 100644 index 0000000..943dc16 --- /dev/null +++ b/diskann/13_Conclusion/README.md @@ -0,0 +1,26 @@ +# Conclusion + +This guide has provided a comprehensive walkthrough for creating intelligent solutions that combine Azure Cosmos DB for NoSQL with vector search capabilities powered by DiskANN and document retrieval with Azure OpenAI services to build a chat bot experience. By integrating these technologies, you can efficiently manage both operational data and vectors within a single database, while leveraging Azure OpenAI for advanced document retrieval and natural language understanding. + +The benefits of building a chat bot experience using Azure Cosmos DB for NoSQL with vector search capabilities powered by DiskANN and Azure OpenAI services include: + +- **Unified data and vector management**: Storing both operational data and vectors together in a single database reduces complexity, improves performance, and eliminates the need to synchronize data between multiple databases. +- **No need for synchronization**: By keeping data and vectors in one place, you avoid the overhead of synchronizing two different databases. +- **Flexible schema**: Adapt to changing data structures effortlessly, ensuring your system remains flexible and scalable as your application evolves. +- **Support for latency-sensitive applications**: Azure Cosmos DB is optimized for applications requiring low-latency responses, making it suitable for real-time, interactive use cases. +- **High elasticity and throughput**: Azure Cosmos DB can scale seamlessly to handle high-throughput workloads, making it perfect for applications that need to grow dynamically with demand. +- **Store chat history and vector data**: Easily manage chat histories alongside vector and operational data, making it ideal for chat bots and other interactive applications. + +This guide was designed to provide an insightful journey for Python developers to get started with Azure Cosmos DB for NoSQL as it applies to creating exciting AI-enabled applications using existing skills. We hope you found this guide helpful and informative. + +## Clean up + +To clean up the resources created in this guide, delete the `cosmos-devguide-rg` resource group in the Azure Portal. + +Alternatively, you can use the Azure CLI to delete the resource group. The following command deletes the resource group and all resources within it.
The `--no-wait` flag makes the command return immediately, without waiting for the deletion to complete. + +>**Note**: Ensure the Azure CLI session is authenticated using `az login` and the correct subscription is selected using `az account set --subscription `. + +```powershell +az group delete --name cosmos-devguide-rg --yes --no-wait +``` diff --git a/diskann/Backend/.env.EXAMPLE b/diskann/Backend/.env.EXAMPLE new file mode 100644 index 0000000..ebcbe73 --- /dev/null +++ b/diskann/Backend/.env.EXAMPLE @@ -0,0 +1,3 @@ +COSMOS_DB_CONNECTION_STRING="AccountEndpoint=https://.documents.azure.com:443/;AccountKey=;" +AOAI_ENDPOINT = "https://.openai.azure.com/" +AOAI_KEY = "" \ No newline at end of file diff --git a/diskann/Backend/.gitignore b/diskann/Backend/.gitignore new file mode 100644 index 0000000..b0ce424 --- /dev/null +++ b/diskann/Backend/.gitignore @@ -0,0 +1,5 @@ +.venv +__pycache__ +.env + +.DS_Store \ No newline at end of file diff --git a/Backend/DOCKERFILE b/diskann/Backend/DOCKERFILE similarity index 100% rename from Backend/DOCKERFILE rename to diskann/Backend/DOCKERFILE diff --git a/Backend/README.md b/diskann/Backend/README.md similarity index 100% rename from Backend/README.md rename to diskann/Backend/README.md diff --git a/Backend/api_models/ai_request.py b/diskann/Backend/api_models/ai_request.py similarity index 100% rename from Backend/api_models/ai_request.py rename to diskann/Backend/api_models/ai_request.py diff --git a/diskann/Backend/api_models/chat_session.py b/diskann/Backend/api_models/chat_session.py new file mode 100644 index 0000000..bd7ae8c --- /dev/null +++ b/diskann/Backend/api_models/chat_session.py @@ -0,0 +1,7 @@ +from pydantic import BaseModel, Field +from typing import List + +class ChatSession(BaseModel): + id: str # The session ID + title: str # The title of the chat session + history: List[dict] = Field(default_factory=list) # The chat history diff --git a/diskann/Backend/api_models/chat_session_request.py b/diskann/Backend/api_models/chat_session_request.py new file mode 100644 index 0000000..f66eb7a --- /dev/null +++ b/diskann/Backend/api_models/chat_session_request.py @@ -0,0 +1,6 @@ +from pydantic import BaseModel + +# Define the model for a Chat Session response +class ChatSessionResponse(BaseModel): + session_id: str + title: str diff --git a/diskann/Backend/app.py b/diskann/Backend/app.py new file mode 100644 index 0000000..283b6ee --- /dev/null +++ b/diskann/Backend/app.py @@ -0,0 +1,94 @@ +""" +API entrypoint for backend API. +""" +from fastapi import FastAPI, HTTPException +from fastapi.middleware.cors import CORSMiddleware + +from typing import List +from chat_session_state.cosmosdb_chat_session_state_provider import CosmosDBChatSessionStateProvider +from api_models.chat_session_request import ChatSessionResponse + +import uuid + +from api_models.ai_request import AIRequest +from cosmic_works.cosmic_works_ai_agent import CosmicWorksAIAgent + +app = FastAPI() + +origins = [ + "*" +] + +app.add_middleware( + CORSMiddleware, + allow_origins=origins, + allow_credentials=True, + allow_methods=["*"], + allow_headers=["*"], +) + + +# Agent pool keyed by session_id to retain memories/history in-memory. +# Note: the context is lost every time the service is restarted. +agent_pool = {} + +@app.get("/") +def root(): + """ + Health probe endpoint. + """ + return {"status": "ready"} + +@app.post("/ai") +def run_cosmic_works_ai_agent(request: AIRequest): + """ + Run the Cosmic Works AI agent. 
+ """ + prompt = request.prompt + session_id = request.session_id + + # If no session_id is provided or default is provided, generate a new one. + if (session_id is None or session_id == "1234"): + session_id = str(uuid.uuid4()) + + # If the session_id is not in the agent pool, create a new agent. + if session_id not in agent_pool: + agent_pool[session_id] = CosmicWorksAIAgent(session_id) + + # Run the agent with the provided prompt. + return { "message": agent_pool[session_id].run(prompt), "session_id": session_id } + + +# ======================== +# Chat Session State / History Support is below: +# ======================== + +# Create an instance of the CosmosDBChatSessionStateProvider class +# This will be used to load or create Chat Sessions +chat_session_state_provider = CosmosDBChatSessionStateProvider() + +@app.get("/session/list", response_model=List[ChatSessionResponse]) +def list_sessions(): + """ + Endpoint to list all chat sessions. + """ + try: + return chat_session_state_provider.list_sessions() + except RuntimeError as e: + # Return an internal server error if a runtime error occurs + raise HTTPException(status_code=500, detail=str(e)) + + +@app.get("/session/load/{session_id}") +def load_session(session_id: str): + """ + Endpoint to load a chat session by session_id. + """ + try: + return chat_session_state_provider.load_session(session_id) + except ValueError as e: + # Return a 404 error if the session is not found + raise HTTPException(status_code=404, detail=str(e)) + except RuntimeError as e: + # Return an internal server error if a runtime error occurs + raise HTTPException(status_code=500, detail=str(e)) diff --git a/diskann/Backend/chat_session_state/cosmosdb_chat_session_state_provider.py b/diskann/Backend/chat_session_state/cosmosdb_chat_session_state_provider.py new file mode 100644 index 0000000..988f8dd --- /dev/null +++ b/diskann/Backend/chat_session_state/cosmosdb_chat_session_state_provider.py @@ -0,0 +1,126 @@ +import os +from datetime import datetime +from typing import List, Optional +from azure.cosmos import CosmosClient, PartitionKey, exceptions as cosmos_exceptions +from dotenv import load_dotenv + +from api_models.chat_session_request import ChatSessionResponse +from api_models.chat_session import ChatSession + +# Load environment variables +load_dotenv() + +# Initialize Cosmos DB client and container globally within the module +CONNECTION_STRING = os.environ.get("COSMOS_DB_CONNECTION_STRING") +client = CosmosClient.from_connection_string(CONNECTION_STRING) +db = client.get_database_client("cosmic_works_pv") + +# Initialize the chat session container, create if not exists +db.create_container_if_not_exists(id="chat_session", partition_key=PartitionKey(path="/id")) + +chat_session_container = db.get_container_client("chat_session") + + +class CosmosDBChatSessionStateProvider: + """ + A class to encapsulate CRUD operations for interacting with the chat session state in Cosmos DB. + """ + + def __init__(self, container=chat_session_container): + self.container = container + + def list_sessions(self) -> List[ChatSessionResponse]: + """ + Lists all chat sessions from the chat session container. + + Returns: + List[ChatSessionResponse]: A list of chat session responses. 
+ """ + try: + query = "SELECT c.id, c.title FROM c" + sessions = list(self.container.query_items( + query=query, + enable_cross_partition_query=True + )) + + # Convert the sessions into a list of ChatSessionResponse objects + session_responses = [ + ChatSessionResponse(session_id=session['id'], title=session['title']) + for session in sessions + ] + return session_responses + except cosmos_exceptions.CosmosHttpResponseError as e: + raise RuntimeError(f"Failed to retrieve sessions: {str(e)}") + + def load_or_create_chat_session(self, session_id: str) -> ChatSession: + """ + Load an existing session from the Cosmos DB container, or create a new one if not found. + """ + try: + # Try to read the session from Cosmos DB + session_item = chat_session_container.read_item(item=session_id, partition_key=session_id) + return ChatSession(**session_item) + except Exception: + # If the session is not found, create a new one + new_session = ChatSession( + id=session_id, + session_id=session_id, + title=f"{datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S UTC')}", + chat_history=[] + ) + chat_session_container.upsert_item(new_session.model_dump()) + return new_session + + def load_session(self, session_id: str) -> Optional[dict]: + """ + Loads a chat session by session ID. + + Args: + session_id (str): The ID of the session to be loaded. + + Returns: + Optional[dict]: The chat session data if found, else None. + """ + try: + query = f"SELECT * FROM c WHERE c.id = '{session_id}'" + session = list(self.container.query_items( + query=query, + enable_cross_partition_query=True + )) + + if session: + return session[0] + else: + raise ValueError("Session not found") + except cosmos_exceptions.CosmosHttpResponseError as e: + raise RuntimeError(f"Failed to retrieve session: {str(e)}") + + def upsert_session(self, session: ChatSession) -> dict: + """ + Creates or updates a chat session in the chat session container. + + Args: + session: The chat session to create or update. + + Returns: + dict: The upserted session data. + """ + try: + response = self.container.upsert_item(session.model_dump()) + return response + except cosmos_exceptions.CosmosHttpResponseError as e: + raise RuntimeError(f"Failed to create or update session: {str(e)}") + + # def delete_session(self, session_id: str) -> None: + # """ + # Deletes a chat session by session ID. + + # Args: + # session_id (str): The ID of the session to delete. + # """ + # try: + # self.container.delete_item(item=session_id, partition_key=session_id) + # except cosmos_exceptions.CosmosResourceNotFoundError: + # raise ValueError(f"Session with ID '{session_id}' not found") + # except cosmos_exceptions.CosmosHttpResponseError as e: + # raise RuntimeError(f"Failed to delete session: {str(e)}") diff --git a/diskann/Backend/cosmic_works/cosmic_works_ai_agent.py b/diskann/Backend/cosmic_works/cosmic_works_ai_agent.py new file mode 100644 index 0000000..e736a80 --- /dev/null +++ b/diskann/Backend/cosmic_works/cosmic_works_ai_agent.py @@ -0,0 +1,205 @@ +""" +Class: CosmicWorksAIAgent +Description: + The CosmicWorksAIAgent class creates Cosmo, an AI agent + that can be used to answer questions about Cosmic Works + products, customers, and sales. 
+""" +import os +import json +from pydantic import BaseModel +from typing import Type, TypeVar +from dotenv import load_dotenv +from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings +from azure.cosmos import CosmosClient, ContainerProxy +from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder +from langchain_core.tools import StructuredTool +from langchain.agents.agent_toolkits import create_retriever_tool +from langchain.agents import AgentExecutor, create_openai_functions_agent +from models import Product, SalesOrder +from retrievers import AzureCosmosDBNoSQLRetriever + +from chat_session_state.cosmosdb_chat_session_state_provider import CosmosDBChatSessionStateProvider + +T = TypeVar('T', bound=BaseModel) + +# Load settings for the notebook +load_dotenv() +CONNECTION_STRING = os.environ.get("COSMOS_DB_CONNECTION_STRING") +EMBEDDINGS_DEPLOYMENT_NAME = "embeddings" +COMPLETIONS_DEPLOYMENT_NAME = "completions" +AOAI_ENDPOINT = os.environ.get("AOAI_ENDPOINT") +AOAI_KEY = os.environ.get("AOAI_KEY") +AOAI_API_VERSION = "2024-06-01" + +# Initialize the Azure Cosmos DB client, database and product (with vector) container +client = CosmosClient.from_connection_string(CONNECTION_STRING) +db = client.get_database_client("cosmic_works_pv") +product_v_container = db.get_container_client("product_v") +sales_order_container = db.get_container_client("salesOrder") + +# Create an instance of the CosmosDBChatSessionStateProvider class +# This will be used to load or create Chat Sessions +chat_session_state_provider = CosmosDBChatSessionStateProvider() + + +class CosmicWorksAIAgent: + """ + The CosmicWorksAIAgent class creates Cosmo, an AI agent + that can be used to answer questions about Cosmic Works + products, customers, and sales. + """ + def __init__(self, session_id: str): + self.session_id = session_id + + self.chat_session = chat_session_state_provider.load_or_create_chat_session(session_id) + + llm = AzureChatOpenAI( + temperature = 0, + openai_api_version = AOAI_API_VERSION, + azure_endpoint = AOAI_ENDPOINT, + openai_api_key = AOAI_KEY, + azure_deployment = COMPLETIONS_DEPLOYMENT_NAME + ) + embedding_model = AzureOpenAIEmbeddings( + openai_api_version = AOAI_API_VERSION, + azure_endpoint = AOAI_ENDPOINT, + openai_api_key = AOAI_KEY, + azure_deployment = EMBEDDINGS_DEPLOYMENT_NAME, + chunk_size=800 + ) + agent_instructions = """ + Your name is "Willie". You are an AI assistant for the Cosmic Works bike store. You help people find production information for bikes and accessories. Your demeanor is friendly, playful with lots of energy. + Do not include citations or citation numbers in your responses. Do not include emojis. + You are designed to answer questions about the products that Cosmic Works sells, the customers that buy them, and the sales orders that are placed by customers. + If you don't know the answer to a question, respond with "I don't know." + Only answer questions related to Cosmic Works products, customers, and sales orders. 
+ If a question is not related to Cosmic Works products, customers, or sales orders, + respond with "I only answer questions about Cosmic Works" + """ + prompt = ChatPromptTemplate.from_messages( + [ + ("system", agent_instructions), + MessagesPlaceholder("chat_history", optional=True), + ("human", "{input}"), + MessagesPlaceholder("agent_scratchpad"), + ] + ) + products_retriever = AzureCosmosDBNoSQLRetriever( + embedding_model = embedding_model, + container = product_v_container, + model = Product, + vector_field_name = "contentVector", + num_results = 5 + ) + tools = [create_retriever_tool( + retriever = products_retriever, + name = "vector_search_products", + description = "Searches Cosmic Works product information for similar products based on the question. Returns the product information in JSON format." + ), + StructuredTool.from_function(get_product_by_id), + StructuredTool.from_function(get_product_by_sku), + StructuredTool.from_function(get_sales_by_id)] + agent = create_openai_functions_agent(llm, tools, prompt) + self.agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True) + + def run(self, prompt: str) -> str: + """ + Run the AI agent. + """ + + # Add the existing chat history to the prompt + chat_history = [{"role": msg["role"], "content": msg["content"]} for msg in self.chat_session.history] + full_prompt = { + "input": prompt, + "chat_history": chat_history + } + + # Run the AI agent with the chat history context + result = self.agent_executor.invoke(full_prompt) + response = result["output"] + + # Update session chat history with new interaction + self.chat_session.history.append({"role": "user", "content": prompt}) + self.chat_session.history.append({"role": "assistant", "content": response}) + + # Save updated session chat history to Cosmos DB + chat_session_state_provider.upsert_session(self.chat_session) + + return response + +# Tools helper methods +def delete_attribute_by_alias(instance: BaseModel, alias:str): + """ + Removes an attribute from a Pydantic model instance by its alias. + """ + for model_field in instance.model_fields: + field = instance.model_fields[model_field] + if field.alias == alias: + delattr(instance, model_field) + return + +def get_single_item_by_field_name( + container:ContainerProxy, + field_name:str, + field_value:str, + model:Type[T]) -> T: + """ + Retrieves a single item from the Azure Cosmos DB NoSQL database by a specific field and value. + """ + query = f"SELECT TOP 1 * FROM itm WHERE itm.{field_name} = @value" + parameters = [ + { + "name": "@value", + "value": field_value + } + ] + items = list(container.query_items( + query=query, + parameters=parameters, + enable_cross_partition_query=True + )) + + # Check if any item is returned + if not items: + return None # Return None if no item is found + + # Cast the item to the provided model + item_casted = model(**items[0]) + return item_casted + # item = list(container.query_items( + # query=query, + # parameters=parameters, + # enable_cross_partition_query=True + # ))[0] + # item_casted = model(**item) + # return item_casted + +def get_product_by_id(product_id: str) -> str: + """ + Retrieves a product by its ID. 
+ """ + item = get_single_item_by_field_name(product_v_container, "id", product_id, Product) + if item is None: + return json.dumps({"error": "Product with 'id' ({id}) not found."}, indent=4) + delete_attribute_by_alias(item, "contentVector") + return json.dumps(item, indent=4, default=str) + +def get_product_by_sku(sku: str) -> str: + """ + Retrieves a product by its sku. + """ + item = get_single_item_by_field_name(product_v_container, "sku", sku, Product) + if item is None: + return json.dumps({"error": "Product with 'sku' ({sku}) not found."}, indent=4) + delete_attribute_by_alias(item, "contentVector") + return json.dumps(item, indent=4, default=str) + +def get_sales_by_id(sales_id: str) -> str: + """ + Retrieves a sales order by its ID. + """ + item = get_single_item_by_field_name(sales_order_container, "id", sales_id, SalesOrder) + if item is None: + return json.dumps({"error": "SalesOrder with 'id' ({id}) not found."}, indent=4) + return json.dumps(item, indent=4, default=str) diff --git a/Labs/models/__init__.py b/diskann/Backend/models/__init__.py similarity index 100% rename from Labs/models/__init__.py rename to diskann/Backend/models/__init__.py diff --git a/Labs/models/address.py b/diskann/Backend/models/address.py similarity index 100% rename from Labs/models/address.py rename to diskann/Backend/models/address.py diff --git a/diskann/Backend/models/customer.py b/diskann/Backend/models/customer.py new file mode 100644 index 0000000..2942078 --- /dev/null +++ b/diskann/Backend/models/customer.py @@ -0,0 +1,48 @@ +""" +Customer and CustomerList models +""" +from datetime import datetime +from typing import List, Optional +from pydantic import BaseModel, Field +from .address import Address +from .password import Password + +class Customer(BaseModel): + """ + The Customer class represents a customer in the + Cosmic Works dataset. + + The alias feelds are used to map the dataset + field names to the pythonic property names. + """ + id: str = Field(alias="id") + customer_id: str = Field(alias="customerId") + title: Optional[str] + first_name: str = Field(alias="firstName") + last_name: str = Field(alias="lastName") + email_address: str = Field(alias="emailAddress") + phone_number: str = Field(alias="phoneNumber") + creation_date: datetime = Field(alias="creationDate") + addresses: List[Address] + password: Password + sales_order_count: int = Field(alias="salesOrderCount") + + class Config: + """ + The Config inner class is used to configure the + behavior of the Pydantic model. In this case, + the Pydantic model will be able to deserialize + data by both the field name and the field alias. + """ + populate_by_name = True + json_encoders = { + datetime: lambda v: v.isoformat() + } + +class CustomerList(BaseModel): + """ + The CustomerList class represents a list of customers. + This class is used when deserializing a container/array + of customers. + """ + items: List[Customer] diff --git a/Labs/models/password.py b/diskann/Backend/models/password.py similarity index 100% rename from Labs/models/password.py rename to diskann/Backend/models/password.py diff --git a/diskann/Backend/models/product.py b/diskann/Backend/models/product.py new file mode 100644 index 0000000..6613df5 --- /dev/null +++ b/diskann/Backend/models/product.py @@ -0,0 +1,38 @@ +""" +Product model +""" +from typing import List, Optional +from pydantic import BaseModel, Field +from .tag import Tag + +class Product(BaseModel): + """ + The Product class represents a product in the + Cosmic Works dataset. 
+ """ + id: str = Field(default=None, alias="id") + category_id: str = Field(alias="categoryId") + category_name: str = Field(alias="categoryName") + sku: str + name: str + description: str + price: float + tags: Optional[List[Tag]] = [] + content_vector: Optional[List[float]] = Field(default=[], alias="contentVector") + + class Config: + """ + The Config inner class is used to configure the + behavior of the Pydantic model. In this case, + the Pydantic model will be able to deserialize + data by both the field name and the field alias. + """ + populate_by_name = True + +class ProductList(BaseModel): + """ + The ProductList class represents a list of products. + This class is used when deserializing a collection/array + of products. + """ + items: List[Product] diff --git a/diskann/Backend/models/sales_order.py b/diskann/Backend/models/sales_order.py new file mode 100644 index 0000000..7b1bb8b --- /dev/null +++ b/diskann/Backend/models/sales_order.py @@ -0,0 +1,39 @@ +""" +SalesOrder model +""" +from datetime import datetime +from typing import List +from pydantic import BaseModel, Field +from .sales_order_detail import SalesOrderDetail + +class SalesOrder(BaseModel): + """ + The SalesOrder class represents a sales order in the + Cosmic Works dataset. + """ + id: str = Field(alias="id") + customer_id: str = Field(alias="customerId") + order_date: datetime = Field(alias="orderDate") + ship_date: datetime = Field(alias="shipDate") + details: List[SalesOrderDetail] + + class Config: + """ + The Config inner class is used to configure the + behavior of the Pydantic model. In this case, + the Pydantic model will be able to deserialize + data by both the field name and the field alias. + """ + populate_by_name = True + json_encoders = { + datetime: lambda v: v.isoformat() + } + +class SalesOrderList(BaseModel): + """ + The SalesOrderList class represents a list of sales orders. + + This class is used when deserializing a container/array + of sales orders. 
+ """ + items: List[SalesOrder] diff --git a/Labs/models/sales_order_detail.py b/diskann/Backend/models/sales_order_detail.py similarity index 100% rename from Labs/models/sales_order_detail.py rename to diskann/Backend/models/sales_order_detail.py diff --git a/Labs/models/tag.py b/diskann/Backend/models/tag.py similarity index 100% rename from Labs/models/tag.py rename to diskann/Backend/models/tag.py diff --git a/diskann/Backend/requirements.txt b/diskann/Backend/requirements.txt new file mode 100644 index 0000000..085afc5 --- /dev/null +++ b/diskann/Backend/requirements.txt @@ -0,0 +1,11 @@ +azure-cosmos==4.7.0 +python-dotenv==1.0.1 +requests==2.32.3 +pydantic==2.9.1 +openai==1.45.0 +tenacity==8.5.0 +langchain==0.3.0 +langchain-openai==0.2.0 +tiktoken==0.7.0 +fastapi==0.114.2 +uvicorn==0.30.6 diff --git a/diskann/Backend/retrievers/__init__.py b/diskann/Backend/retrievers/__init__.py new file mode 100644 index 0000000..7fb0462 --- /dev/null +++ b/diskann/Backend/retrievers/__init__.py @@ -0,0 +1 @@ +from .azure_cosmos_db_nosql_retriever import AzureCosmosDBNoSQLRetriever \ No newline at end of file diff --git a/diskann/Backend/retrievers/azure_cosmos_db_nosql_retriever.py b/diskann/Backend/retrievers/azure_cosmos_db_nosql_retriever.py new file mode 100644 index 0000000..de75e7a --- /dev/null +++ b/diskann/Backend/retrievers/azure_cosmos_db_nosql_retriever.py @@ -0,0 +1,90 @@ + +import time +import json +from langchain_core.retrievers import BaseRetriever +from langchain_openai import AzureOpenAIEmbeddings +from azure.cosmos import ContainerProxy +from pydantic import BaseModel +from typing import Type, TypeVar, List +from langchain_core.callbacks import ( + AsyncCallbackManagerForRetrieverRun, + CallbackManagerForRetrieverRun, +) +from langchain_core.documents import Document + + +T = TypeVar('T', bound=BaseModel) + +class AzureCosmosDBNoSQLRetriever(BaseRetriever): + """ + A custom LangChain retriever that uses Azure Cosmos DB NoSQL database for vector search. + """ + embedding_model: AzureOpenAIEmbeddings + container: ContainerProxy + model: Type[T] + vector_field_name: str + num_results: int=5 + + def __get_embeddings(self, text: str) -> List[float]: + """ + Returns embeddings vector for a given text. + """ + embedding = self.embedding_model.embed_query(text) + time.sleep(0.5) # rest period to avoid rate limiting on AOAI + return embedding + + def __get_item_by_id(self, id) -> T: + """ + Retrieves a single item from the Azure Cosmos DB NoSQL database by its ID. + """ + query = "SELECT * FROM itm WHERE itm.id = @id" + parameters = [ + {"name": "@id", "value": id} + ] + item = list(self.container.query_items( + query=query, + parameters=parameters, + enable_cross_partition_query=True + ))[0] + return self.model(**item) + + def __delete_attribute_by_alias(self, instance: BaseModel, alias): + for model_field in instance.model_fields: + field = instance.model_fields[model_field] + if field.alias == alias: + delattr(instance, model_field) + return + + def _get_relevant_documents( + self, query: str, *, run_manager: CallbackManagerForRetrieverRun + ) -> List[Document]: + """ + Performs a synchronous vector search on the Azure Cosmos DB NoSQL database. 
+ """ + embedding = self.__get_embeddings(query) + items = self.container.query_items( + query=f"""SELECT TOP @num_results itm.id, VectorDistance(itm.{self.vector_field_name}, @embedding) AS SimilarityScore + FROM itm + ORDER BY VectorDistance(itm.{self.vector_field_name}, @embedding) + """, + parameters = [ + { "name": "@num_results", "value": self.num_results }, + { "name": "@embedding", "value": embedding } + ], + enable_cross_partition_query=True + ) + returned_docs = [] + for item in items: + itm = self.__get_item_by_id(item["id"]) + # Remove the vector field from the returned item so it doesn't fill the context window + self.__delete_attribute_by_alias(itm, self.vector_field_name) + returned_docs.append(Document(page_content=json.dumps(itm, indent=4, default=str), metadata={"similarity_score": item["SimilarityScore"]})) + return returned_docs + + async def _aget_relevant_documents( + self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun + ) -> List[Document]: + """ + Performs an asynchronous vector search on the Azure Cosmos DB NoSQL database. + """ + raise Exception(f"Asynchronous search not implemented.") \ No newline at end of file diff --git a/diskann/Backend/run-local.sh b/diskann/Backend/run-local.sh new file mode 100755 index 0000000..b226057 --- /dev/null +++ b/diskann/Backend/run-local.sh @@ -0,0 +1,4 @@ +pip install -r requirements.txt + +uvicorn app:app --host "0.0.0.0" --port 4242 --forwarded-allow-ips "*" --proxy-headers + diff --git a/diskann/Labs/.env.EXAMPLE b/diskann/Labs/.env.EXAMPLE new file mode 100644 index 0000000..eac80b9 --- /dev/null +++ b/diskann/Labs/.env.EXAMPLE @@ -0,0 +1,3 @@ +COSMOS_DB_CONNECTION_STRING="AccountEndpoint=https://.documents.azure.com:443/;AccountKey=;" +AOAI_ENDPOINT = "https://.openai.azure.com/" +AOAI_KEY = "" diff --git a/Labs/.gitignore b/diskann/Labs/.gitignore similarity index 100% rename from Labs/.gitignore rename to diskann/Labs/.gitignore diff --git a/diskann/Labs/deploy/azuredeploy.bicep b/diskann/Labs/deploy/azuredeploy.bicep new file mode 100644 index 0000000..3854981 --- /dev/null +++ b/diskann/Labs/deploy/azuredeploy.bicep @@ -0,0 +1,378 @@ +/* *************************************************************** +Azure Cosmos DB + Azure OpenAI Python developer guide lab +****************************************************************** +This Azure resource deployment template uses some of the following practices: +- [Abbrevation examples for Azure resources](https://learn.microsoft.com/azure/cloud-adoption-framework/ready/azure-best-practices/resource-abbreviations) +*/ + +/* *************************************************************** */ +/* Parameters */ +/* *************************************************************** */ + +@description('Location where all resources will be deployed. This value defaults to the **East US 2** region.') +@allowed([ + 'eastus2' + 'francecentral' + 'centralus' + 'uksouth' + 'northeurope' +]) +param location string = 'eastus2' + +@description(''' +Unique name for the deployed services below. Max length 17 characters, alphanumeric only: +- Azure Cosmos DB for NoSQL +- Azure OpenAI Service + +The name defaults to a unique string generated from the resource group identifier. Prefixed with +**dg** 'developer guide' as the id may start with a number which is an invalid name for +many resources. +''') +@maxLength(17) +param name string = 'dg${uniqueString(resourceGroup().id)}' + +@description('Specifies the SKU for the Azure App Service plan. 
Defaults to **B1**') +@allowed([ + 'B1' + 'S1' + 'P0v3' +]) +param appServiceSku string = 'P0v3' //'B1' + +@description('Specifies the SKU for the Azure OpenAI resource. Defaults to **S0**') +@allowed([ + 'S0' +]) +param openAiSku string = 'S0' + +@description('Azure Container Registry SKU. Defaults to **Basic**') +param acrSku string = 'Basic' + +/* *************************************************************** */ +/* Variables */ +/* *************************************************************** */ + +var openAiSettings = { + name: '${name}-openai' + sku: openAiSku + maxConversationTokens: '100' + maxCompletionTokens: '500' + completionsModel: { + name: 'gpt-35-turbo' + version: '0613' + deployment: { + name: 'completions' + } + sku: { + name: 'Standard' + capacity: 120 + } + } + embeddingsModel: { + name: 'text-embedding-ada-002' + version: '2' + deployment: { + name: 'embeddings' + } + sku: { + name: 'Standard' + capacity: 120 + } + } +} + +var appServiceSettings = { + plan: { + name: '${name}-web' + sku: appServiceSku + } + web: { + name: '${name}-web' + git: { + repo: 'https://github.com/AzureCosmosDB/Azure-OpenAI-Developer-Guide-Front-End.git' + branch: 'main' + } + } +} + +// Define a variable for the tag values +var tags = { + name: 'AzureCosmosDB-DevGuide' + repo: 'https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide/blob/main/diskann/README.md' +} + +/* *************************************************************** */ +/* Azure Cosmos DB for NoSQL */ +/* *************************************************************** */ + +resource cosmosAccount 'Microsoft.DocumentDB/databaseAccounts@2024-05-15' = { + name: '${name}-cosmos' + location: location + kind: 'GlobalDocumentDB' + tags: tags + properties: { + databaseAccountOfferType: 'Standard' + enableMultipleWriteLocations: false + enableAutomaticFailover: true + locations: [ + { + locationName: location + failoverPriority: 0 + isZoneRedundant: false + } + ] + capabilities: [ + { + name: 'EnableServerless' + } + { + /* https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search#enroll-in-the-vector-search-preview-feature */ + name: 'EnableNoSQLVectorSearch' + } + ] + capacity: { + totalThroughputLimit: 4000 + } + } +} + +/* *************************************************************** */ +/* Azure OpenAI */ +/* *************************************************************** */ + +resource openAiAccount 'Microsoft.CognitiveServices/accounts@2023-05-01' = { + name: openAiSettings.name + location: location + sku: { + name: openAiSettings.sku + } + tags: tags + kind: 'OpenAI' + properties: { + customSubDomainName: openAiSettings.name + publicNetworkAccess: 'Enabled' + } +} + +resource openAiEmbeddingsModelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = { + parent: openAiAccount + name: openAiSettings.embeddingsModel.deployment.name + sku: { + name: openAiSettings.embeddingsModel.sku.name + capacity: openAiSettings.embeddingsModel.sku.capacity + } + properties: { + model: { + format: 'OpenAI' + name: openAiSettings.embeddingsModel.name + version: openAiSettings.embeddingsModel.version + } + } +} + +resource openAiCompletionsModelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = { + parent: openAiAccount + name: openAiSettings.completionsModel.deployment.name + dependsOn: [ + openAiEmbeddingsModelDeployment + ] + sku: { + name: openAiSettings.completionsModel.sku.name + capacity: openAiSettings.completionsModel.sku.capacity + } + properties: { + model: { + format: 
'OpenAI' + name: openAiSettings.completionsModel.name + version: openAiSettings.completionsModel.version + } + } +} + +/* *************************************************************** */ +/* Logging and instrumentation */ +/* *************************************************************** */ + +resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2021-06-01' = { + name: '${name}-loganalytics' + location: location + tags: tags + properties: { + sku: { + name: 'PerGB2018' + } + } +} +resource appServiceWebInsights 'Microsoft.Insights/components@2020-02-02' = { + name: '${appServiceSettings.web.name}-appi' + location: location + tags: tags + kind: 'web' + properties: { + Application_Type: 'web' + WorkspaceResourceId: logAnalytics.id + } +} + +/* *************************************************************** */ +/* App Plan Hosting - Azure App Service Plan */ +/* *************************************************************** */ +resource appServicePlan 'Microsoft.Web/serverfarms@2022-03-01' = { + name: '${appServiceSettings.plan.name}-asp' + location: location + tags: tags + sku: { + name: appServiceSettings.plan.sku + } + kind: 'linux' + properties: { + reserved: true + } +} + + +/* *************************************************************** */ +/* Front-end Web App Hosting - Azure App Service */ +/* *************************************************************** */ + +resource appServiceWeb 'Microsoft.Web/sites@2022-03-01' = { + name: appServiceSettings.web.name + location: location + tags: tags + properties: { + serverFarmId: appServicePlan.id + httpsOnly: true + siteConfig: { + linuxFxVersion: 'NODE|20-lts' + appCommandLine: 'pm2 serve /home/site/wwwroot/dist --no-daemon --spa' + alwaysOn: true + } + } +} + +resource appServiceWebSettings 'Microsoft.Web/sites/config@2022-03-01' = { + parent: appServiceWeb + name: 'appsettings' + kind: 'string' + properties: { + APPINSIGHTS_INSTRUMENTATIONKEY: appServiceWebInsights.properties.InstrumentationKey + API_ENDPOINT: 'https://${backendApiContainerApp.properties.configuration.ingress.fqdn}' + } +} + +resource appServiceWebDeployment 'Microsoft.Web/sites/sourcecontrols@2021-03-01' = { + parent: appServiceWeb + name: 'web' + properties: { + repoUrl: appServiceSettings.web.git.repo + branch: appServiceSettings.web.git.branch + isManualIntegration: true + } +} + + +/* *************************************************************** */ +/* Registry for Back-end API Image - Azure Container Registry */ +/* *************************************************************** */ +resource containerRegistry 'Microsoft.ContainerRegistry/registries@2023-01-01-preview' = { + name: replace('${name}registry','-', '') + location: location + tags: tags + sku: { + name: acrSku + } + properties: { + adminUserEnabled: true + } +} + +/* *************************************************************** */ +/* Container environment - Azure Container App Environment */ +/* *************************************************************** */ +resource containerAppEnvironment 'Microsoft.App/managedEnvironments@2023-05-01' = { + name: '${name}-containerappenv' + location: location + tags: tags + properties: { + appLogsConfiguration: { + destination: 'log-analytics' + logAnalyticsConfiguration: { + customerId: logAnalytics.properties.customerId + sharedKey: logAnalytics.listKeys().primarySharedKey + } + } + workloadProfiles: [ + { + name: 'Warm' + minimumCount: 1 + maximumCount: 10 + workloadProfileType: 'E4' + } + ] + infrastructureResourceGroup: 
'ME_${resourceGroup().name}' + } +} + +/* *************************************************************** */ +/* Back-end API App Application - Azure Container App */ +/* deploys default hello world */ +/* *************************************************************** */ +resource backendApiContainerApp 'Microsoft.App/containerApps@2023-05-01' = { + name: '${name}-api' + location: location + tags: tags + properties: { + environmentId: containerAppEnvironment.id + configuration: { + ingress: { + external: true + targetPort: 80 + allowInsecure: false + traffic: [ + { + latestRevision: true + weight: 100 + } + ] + corsPolicy: { + allowCredentials: false + allowedHeaders: [ + '*' + ] + allowedOrigins: [ + '*' + ] + } + } + registries: [ + { + server: containerRegistry.name + username: containerRegistry.properties.loginServer + passwordSecretRef: 'container-registry-password' + } + ] + secrets: [ + { + name: 'container-registry-password' + value: containerRegistry.listCredentials().passwords[0].value + } + ] + } + template: { + containers: [ + { + name: 'hello-world' + image: 'mcr.microsoft.com/azuredocs/containerapps-helloworld:latest' + resources: { + cpu: 1 + memory: '2Gi' + } + } + ] + scale: { + minReplicas: 1 + maxReplicas: 1 + } + } + } +} diff --git a/diskann/Labs/deploy/azuredeploy.parameters.json b/diskann/Labs/deploy/azuredeploy.parameters.json new file mode 100644 index 0000000..cfad380 --- /dev/null +++ b/diskann/Labs/deploy/azuredeploy.parameters.json @@ -0,0 +1,12 @@ +{ + "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#", + "contentVersion": "1.0.0.0", + "parameters": { + "location": { + "value": "eastus2" + }, + "openAiSku": { + "value": "S0" + } + } +} \ No newline at end of file diff --git a/diskann/Labs/deploy/deploy.md b/diskann/Labs/deploy/deploy.md new file mode 100644 index 0000000..399d3dc --- /dev/null +++ b/diskann/Labs/deploy/deploy.md @@ -0,0 +1,58 @@ +# Solution deployment + +## Prerequisites + +- Owner on Azure subscription +- Account approved for Azure OpenAI service +- Azure CLI installed +- Azure PowerShell installed + +## Clone the repository + +Create a folder to house the repository. Open a terminal and navigate to the folder. Clone the repository, then navigate to the `Labs/deploy` folder within the repository. + +```bash +git clone https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide.git + +cd Cosmos-DB-NoSQL-OpenAI-Python-Dev-Guide +cd diskann +cd Labs +cd deploy +``` + +Open the `azuredeploy.parameters.json` file, and inspect the values, modify as deemed appropriate. 
+ +## Login to Azure + +Open a terminal window and log in to Azure using the following command: + +```Powershell +Connect-AzAccount +``` + +### Set the desired subscription (Optional) + +If you have more than one subscription associated with your account, set the desired subscription using the following command: + +```Powershell +Set-AzContext -SubscriptionId +``` + +## Create resource group + +```Powershell +New-AzResourceGroup -Name cosmos-devguide-rg -Location 'eastus2' +``` + +## Deploy using bicep template + +Deploy the solution resources using the following command (this will take a few minutes to run): + +```Powershell +New-AzResourceGroupDeployment -ResourceGroupName cosmos-devguide-rg -TemplateFile .\azuredeploy.bicep -TemplateParameterFile .\azuredeploy.parameters.json -c +``` + +> **Enable Vector Search Feature**: This Azure Bicep template will automatically [enable the "Vector Search" feature within Azure Cosmos DB for NoSQL](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search#enroll-in-the-vector-search-preview-feature). If it's not enabled, this Azure PowerShell command can be run to enable it on an Azure Cosmos DB for NoSQL Account: +> ````powershell +> Update-AzCosmosDBAccount -ResourceGroupName -Name -Capabilities @{name="EnableNoSQLVectorSearch"} +> ```` diff --git a/diskann/Labs/lab_0_explore_and_use_models.ipynb b/diskann/Labs/lab_0_explore_and_use_models.ipynb new file mode 100644 index 0000000..04b1bb9 --- /dev/null +++ b/diskann/Labs/lab_0_explore_and_use_models.ipynb @@ -0,0 +1,151 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Explore and use Azure OpenAI models from code" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Install Requirements\n", + "\n", + "Before we can start running the Python code below, we need to install the necessary Python libraries required.\n", + "\n", + "Run the following command to install the Python libraries required for this lab, as listed within the `requirements.txt` file:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install -r requirements.txt" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### OpenAI Client Library\n", + "\n", + "When integrating Azure OpenAI service in a solution written in Python, the OpenAI Python client library is used. This library is maintained by OpenAI, and is compatible with the Azure OpenAI service.\n", + "\n", + "When using the OpenAI client library, the Azure OpenAI `key` and `endpoint` for the service are needed. In this case, ensure the Azure OpenAI `key` and `endpoint` is located in a `.env` file in the root of this project, you will need to create this file. The `.env` file should contain the following values (replace the value with your own `key` and `endpoint`):\n", + "\n", + "```\n", + "AOAI_ENDPOINT = \"https://.openai.azure.com/\"\n", + "\n", + "AOAI_KEY = \"\"\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following imports are neded in python so the app can use the OpenAI library, as well as `os` to access the environment variables, and `dotenv` is used here to load environment variables from the `.env` file." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "from openai import AzureOpenAI\n", + "from dotenv import load_dotenv\n", + "\n", + "load_dotenv()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Chat completions\n", + "\n", + "Create the Azure OpenAi client to call the Azure OpenAI **Chat completion** API: " + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "chatClient = AzureOpenAI(\n", + " azure_endpoint=os.getenv(\"AOAI_ENDPOINT\"), \n", + " api_key=os.getenv(\"AOAI_KEY\"), \n", + " api_version=\"2024-06-01\"\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> **Note**: The `api_version` is included to specify the API version for calls to the Azure OpenAI service." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once the Azure OpenAI client to be used for **Chat completion** has been created, the next step is to call the `.chat.completions.create()` method on the client to perform a chat completion." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "chatResponse = chatClient.chat.completions.create(\n", + " model=\"completions\",\n", + " messages=[\n", + " {\"role\": \"system\", \"content\": \"You are a helpful, fun and friendly sales assistant for Cosmic Works, a bicycle and bicycle accessories store.\"},\n", + " {\"role\": \"user\", \"content\": \"Do you sell bicycles?\"},\n", + " {\"role\": \"assistant\", \"content\": \"Yes, we do sell bicycles. What kind of bicycle are you looking for?\"},\n", + " {\"role\": \"user\", \"content\": \"I'm not sure what I'm looking for. Could you help me decide?\"}\n", + " ]\n", + ")\n", + "\n", + "print(chatResponse.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> **Note**: The [`openai` Python library documentation](https://platform.openai.com/docs/guides/text-generation/chat-completions-api) has further information on making Chat Completion calls to the service." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.6" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/diskann/Labs/lab_1_first_application.ipynb b/diskann/Labs/lab_1_first_application.ipynb new file mode 100644 index 0000000..713ea29 --- /dev/null +++ b/diskann/Labs/lab_1_first_application.ipynb @@ -0,0 +1,302 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# First Azure Cosmos DB for NoSQL application" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "from azure.cosmos import CosmosClient, DatabaseProxy, ContainerProxy\n", + "from pydantic import BaseModel\n", + "from typing import Type, TypeVar, List\n", + "from pprint import pprint\n", + "from dotenv import load_dotenv\n", + "from models import Product" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create a database\n", + "\n", + "Ensure the Azure Cosmos DB account connection string is located in a `.env` file in the root of the project, you will need to create this file. The `.env` file should contain the following value (replace the value with your own connection string):\n", + "\n", + "COSMOS_DB_CONNECTION_STRING=\"cosmos__db__connection_string\"\n", + "\n", + ">**Note**: If you are running using the **local emulator**, append the following value to the connection string: `&retrywrites=false&tlsallowinvalidcertificates=true`.\n", + "\n", + "To create a NoSQL database in Azure Cosmos DB, first instantiate a `CosmosClient` object, use the `create_database_if_not_exists` method to create a database if it does not exist to avoid any exceptions should the database already exist. This method will create a database with the specified name if it does not exist, otherwise it will return the existing database." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "load_dotenv()\n", + "CONNECTION_STRING = os.environ.get(\"COSMOS_DB_CONNECTION_STRING\")\n", + "\n", + "# Initialize the Azure Cosmos DB client\n", + "client = CosmosClient.from_connection_string(CONNECTION_STRING)\n", + "\n", + "# Create or load the cosmic_works_pv database\n", + "database_name = \"cosmic_works_pv\"\n", + "db = client.create_database_if_not_exists(id=database_name)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create a container\n", + "\n", + "There is a handy method that can be used to create a container in the database `create_container_if_not_exists` that allows for the creation of a container if it does not already exist, or retrieves it if it does. In this case, the `product` container is created to store product information.\n", + "\n", + "When creating a container, the partition key is required. Partition keys in Azure Cosmos DB are critical for ensuring scalable and efficient performance. They function as logical sharding mechanisms, distributing data across multiple partitions to balance the load and optimize query performance. It is referenced as a JSON path within the item being stored, prefixed with a `/`. 
Choosing an effective partition key affects the throughput, latency, and overall efficiency of database operations. Learn more about [partitioning in Azure Cosmos DB](https://learn.microsoft.com/azure/cosmos-db/partitioning-overview)." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "container: ContainerProxy = db.create_container_if_not_exists(\n", + " id=\"product\",\n", + " partition_key={\"paths\": [\"/categoryId\"], \"kind\": \"Hash\"}\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create or Update a document (Upsert)\n", + "\n", + "Documents in Azure Cosmos DB for NoSQL API are represented as JSON objects. In this lab, the Pydantic library is used to create a model for the document. This model is then used to create a document in the database using built-in serialization methods. Find the models in the `models` folder. Notice the class property definitions include aliases, these aliases can be used to override the serialized property names. This is useful when the property names in the model do not match the property names desired in the database.\n", + "\n", + "One method of creating a document is using the `create_item` method. This method takes a single document and inserts it into the database, if the item already exists in the container, and exception is thrown. Alternatively, the `upsert_item` method can also be used to insert a document into the database and in this case, if the document already exists, it will be updated." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product = Product(\n", + " id=\"2BA4A26C-A8DB-4645-BEB9-F7D42F50262E\", \n", + " category_id=\"56400CF3-446D-4C3F-B9B2-68286DA3BB99\", \n", + " category_name=\"Bikes, Mountain Bikes\", \n", + " sku=\"BK-M18S-42\",\n", + " name=\"Mountain-100 Silver, 42\",\n", + " description='The product called \"Mountain-500 Silver, 42\"',\n", + " price=742.42,\n", + " )\n", + "\n", + "# Upsert the product into the container by converting it to a dictionary using the alias names where present.\n", + "container.upsert_item(product.model_dump(by_alias=True))\n", + "\n", + "print(f\"Upserted product with ID: {product.id}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Read a document\n", + "\n", + "To read a document from the database, use the `read_item` method. This method takes the partition key and the document id as arguments and returns the document. If the document does not exist, an exception is thrown. The `query_items` method can also be used to retrieve documents from the database. This method takes a query string as an argument and returns a list of documents that match the query.\n", + "\n", + "In this case, the `query_items` method is used to retrieve the document from the container as it is desired to retrieve the record without also having to provide the partition key." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create a generic helper function to retrieve a an item from a container by its id value\n", + "T = TypeVar('T', bound=BaseModel)\n", + "def query_item_by_id(container, id, model: Type[T]) -> T:\n", + " query = \"SELECT * FROM itm WHERE itm.id = @id\"\n", + " parameters = [\n", + " {\"name\": \"@id\", \"value\": id}\n", + " ] \n", + " item = list(container.query_items(\n", + " query=query,\n", + " parameters=parameters,\n", + " enable_cross_partition_query=True\n", + " ))[0]\n", + " return model(**item)\n", + " \n", + "# Retrieve the product from the container by its id and cast it to the Product model\n", + "retrieved_product = query_item_by_id(container, product.id, Product)\n", + "\n", + "# Print the retrieved product\n", + "print(\"\\nCast Product from document retrieved from Azure Cosmos DB:\")\n", + "print(retrieved_product)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Delete a document\n", + "\n", + "The `delete_item` method is used to delete a single document from the database. This method takes the `id` and `partition_key` as arguments and deletes the document. If the document does not exist, an exception is thrown." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "container.delete_item(item=retrieved_product.id, partition_key=retrieved_product.category_id)\n", + "print(f\"Deleted the product with ID: {retrieved_product.id}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Query for multiple documents\n", + "\n", + "The `query_items` method is used to query for multiple documents in the database. This method takes a query string to perform a [SQL-like query](https://learn.microsoft.com/azure/cosmos-db/nosql/tutorial-query) on the documents in the container, retrieving all documents that match the query." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Insert multiple documents\n", + "products = [\n", + " Product(\n", + " id=\"2BA4A26C-A8DB-4645-BEB9-F7D42F50262E\", \n", + " category_id=\"56400CF3-446D-4C3F-B9B2-68286DA3BB99\", \n", + " category_name=\"Bikes, Mountain Bikes\", \n", + " sku=\"BK-M18S-42\",\n", + " name=\"Mountain-100 Silver, 42\",\n", + " description='The product called \"Mountain-500 Silver, 42\"',\n", + " price=742.42\n", + " ),\n", + " Product(\n", + " id=\"027D0B9A-F9D9-4C96-8213-C8546C4AAE71\", \n", + " category_id=\"26C74104-40BC-4541-8EF5-9892F7F03D72\", \n", + " category_name=\"Components, Saddles\", \n", + " sku=\"SE-R581\",\n", + " name=\"LL Road Seat/Saddle\",\n", + " description='The product called \"LL Road Seat/Saddle\"',\n", + " price=27.12\n", + " ),\n", + " Product(\n", + " id = \"4E4B38CB-0D82-43E5-89AF-20270CD28A04\",\n", + " category_id = \"75BF1ACB-168D-469C-9AA3-1FD26BB4EA4C\",\n", + " category_name = \"Bikes, Touring Bikes\",\n", + " sku = \"BK-T44U-60\",\n", + " name = \"Touring-2000 Blue, 60\",\n", + " description = 'The product called Touring-2000 Blue, 60\"',\n", + " price = 1214.85\n", + " ),\n", + " Product(\n", + " id = \"5B5E90B8-FEA2-4D6C-B728-EC586656FA6D\",\n", + " category_id = \"75BF1ACB-168D-469C-9AA3-1FD26BB4EA4C\",\n", + " category_name = \"Bikes, Touring Bikes\",\n", + " sku = \"BK-T79Y-60\",\n", + " name = \"Touring-1000 Yellow, 60\",\n", + " description = 'The product called Touring-1000 Yellow, 60\"',\n", + " price = 2384.07\n", + " ),\n", + " Product(\n", + " id = \"7BAA49C9-21B5-4EEF-9F6B-BCD6DA7C2239\",\n", + " category_id = \"26C74104-40BC-4541-8EF5-9892F7F03D72\",\n", + " category_name = \"Components, Saddles\",\n", + " sku = \"SE-R995\",\n", + " name = \"HL Road Seat/Saddle\",\n", + " description = 'The product called \"HL Road Seat/Saddle\"',\n", + " price = 52.64,\n", + " )\n", + "]\n", + "for product in products:\n", + " container.upsert_item(product.model_dump(by_alias=True))\n", + " print(f\"Upserted product with ID: {product.id}\")\n", + "\n", + "# Create generic helper function to query items a container.\n", + "# This function re-uses the TypeVar and BaseModel from the Read a document example.\n", + "def query_items(container, query, model: Type[T]) -> List[T]:\n", + " query = query\n", + " items = container.query_items(query=query, enable_cross_partition_query=True)\n", + " return [model(**item) for item in items]\n", + "\n", + "# retrieve all products via a query\n", + "retrieved_products = query_items(container,\"SELECT * FROM prod\", Product)\n", + "print(f\"Retrieved: {len(retrieved_products)} products\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Print all documents that have a category name of \"Components, Saddles\"\n", + "for result in query_items(container, \"SELECT * FROM prod WHERE prod.categoryName='Components, Saddles'\", Product): \n", + " pprint(result)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Clean up resources\n", + "\n", + "The following cell will delete the database and container created in this lab. This is done by using the `delete_database` method on the database object. This method takes the name of the database to delete as an argument. If it is desired to simply delete the container, the `delete_container` method can be used on the database object. 
This method takes the name of the container to delete as an argument." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [], + "source": [ + "# db.delete_container(\"product\")\n", + "client.delete_database(\"cosmic_works_pv\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.10" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/diskann/Labs/lab_2_load_data.ipynb b/diskann/Labs/lab_2_load_data.ipynb new file mode 100644 index 0000000..14b0dd4 --- /dev/null +++ b/diskann/Labs/lab_2_load_data.ipynb @@ -0,0 +1,192 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Load data into Azure Cosmos DB for NoSQL API\n", + "\n", + "This notebook demonstrates how to load the Cosmic Works JSON files from Azure Storage into Azure Cosmos DB using the NoSQL API." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import json\n", + "import requests\n", + "from models import Product, ProductList, Customer, CustomerList, SalesOrder, SalesOrderList\n", + "from azure.cosmos import CosmosClient, DatabaseProxy, ContainerProxy\n", + "from dotenv import load_dotenv" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Establish a connection to the database" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "load_dotenv()\n", + "CONNECTION_STRING = os.environ.get(\"COSMOS_DB_CONNECTION_STRING\")\n", + "\n", + "# Initialize the Azure Cosmos DB client\n", + "client = CosmosClient.from_connection_string(CONNECTION_STRING)\n", + "\n", + "# Create or load the cosmic_works_pv database\n", + "database_name = \"cosmic_works_pv\"\n", + "db = client.create_database_if_not_exists(id=database_name)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load products" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# Add product data to the database using upsert\n", + "# Get the Cosmic Works product data from Azure Storage\n", + "product_raw_data = \"https://cosmosdbcosmicworks.blob.core.windows.net/cosmic-works-small/product.json\"\n", + "product_data = ProductList(items=[Product(**data) for data in requests.get(product_raw_data).json()])\n", + "\n", + "# Create or retrieve the product container\n", + "product_container: ContainerProxy = db.create_container_if_not_exists(\n", + " id=\"product\",\n", + " partition_key={\"paths\": [\"/categoryId\"], \"kind\": \"Hash\"}\n", + " )\n", + "\n", + "# Upsert the product data to the container\n", + "for product in product_data.items:\n", + " product_container.upsert_item(product.model_dump(by_alias=True))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load customers and sales raw data\n", + "\n", + "In this repository, the customer and sales data are stored in the same file. The `type` field is used to differentiate between the two types of documents."
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "customer_sales_raw_data = \"https://cosmosdbcosmicworks.blob.core.windows.net/cosmic-works-small/customer.json\"\n", + "response = requests.get(customer_sales_raw_data)\n", + "# override decoding\n", + "response.encoding = 'utf-8-sig'\n", + "response_json = response.json()\n", + "# filter where type is customer\n", + "customers = [cust for cust in response_json if cust[\"type\"] == \"customer\"]\n", + "# filter where type is salesOrder\n", + "sales_orders = [sales for sales in response_json if sales[\"type\"] == \"salesOrder\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load customers" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "customer_data = CustomerList(items=[Customer(**data) for data in customers])\n", + "# Create or retrieve the customer container\n", + "customer_container: ContainerProxy = db.create_container_if_not_exists(\n", + " id=\"customer\",\n", + " partition_key={\"paths\": [\"/customerId\"], \"kind\": \"Hash\"}\n", + " )\n", + "\n", + "# Upsert the customer data to the container\n", + "for customer in customer_data.items:\n", + " # Use json encoding to work around issue with datetime serialization\n", + " customer_json = customer.model_dump_json(by_alias=True)\n", + " customer_dict = json.loads(customer_json)\n", + " customer_container.upsert_item(customer_dict)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load sales orders" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "sales_order_data = SalesOrderList(items=[SalesOrder(**data) for data in sales_orders])\n", + "# Create or retrieve the salesOrder container\n", + "sales_order_container: ContainerProxy = db.create_container_if_not_exists(\n", + " id=\"salesOrder\",\n", + " partition_key={\"paths\": [\"/customerId\"], \"kind\": \"Hash\"}\n", + " )\n", + "\n", + "# Upsert the sales order data to the container, this will take approximately 1.5 minutes to run\n", + "for sales_order in sales_order_data.items:\n", + " # Use json encoding to work around issue with datetime serialization\n", + " sales_order_json = sales_order.model_dump_json(by_alias=True)\n", + " sales_order_dict = json.loads(sales_order_json)\n", + " sales_order_container.upsert_item(sales_order_dict)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Clean up\n", + "\n", + "No clean up is necessary as this data is used in subsequent labs." 
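Optionally, before moving on, a quick document count can confirm that the three containers were populated. The following is a minimal sketch that assumes the `db` object and the `product`, `customer`, and `salesOrder` containers created above; the exact counts depend on the source data files.

```python
# Optional sanity check: count the documents loaded into each container.
for container_name in ["product", "customer", "salesOrder"]:
    container = db.get_container_client(container_name)
    count = list(container.query_items(
        query="SELECT VALUE COUNT(1) FROM c",
        enable_cross_partition_query=True
    ))[0]
    print(f"{container_name}: {count} documents")
```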
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.10" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/diskann/Labs/lab_3_cosmosdb_vector_search.ipynb b/diskann/Labs/lab_3_cosmosdb_vector_search.ipynb new file mode 100644 index 0000000..834685f --- /dev/null +++ b/diskann/Labs/lab_3_cosmosdb_vector_search.ipynb @@ -0,0 +1,419 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Vector Search using Azure Cosmos DB for NoSQL\n", + "\n", + "This notebook demonstrates how to use an Azure OpenAI embedding model to vectorize documents already stored in Azure Cosmos DB for NoSQL, store the embedding vectors, and create a vector index. Lastly, the notebook demonstrates how to query the vector index to find similar documents.\n", + "\n", + "This lab expects the data that was loaded in Lab 2. A current limitation is that the vector search feature for Azure Cosmos DB for NoSQL is supported only on new containers: the vector policy must be applied at container creation time and cannot be modified later. As such, a new container `product_v` for products will be created in this notebook for use in this guide." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import json\n", + "from models import Product\n", + "from pydantic import BaseModel\n", + "from typing import Type, TypeVar, List\n", + "from azure.cosmos import CosmosClient, DatabaseProxy, ContainerProxy, PartitionKey\n", + "from dotenv import load_dotenv\n", + "import time\n", + "from openai import AzureOpenAI\n", + "from tenacity import retry, wait_random_exponential, stop_after_attempt" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load settings\n", + "\n", + "This lab expects the `.env` file that was created in Lab 1 to obtain the connection string for the database.\n", + "\n", + "Add the following entries into the `.env` file to support the connection to the Azure OpenAI API, replacing the placeholder values with the endpoint and key from your Azure OpenAI resource.\n", + "\n", + "```text\n", + "AOAI_ENDPOINT=\"\"\n", + "AOAI_KEY=\"\"\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "load_dotenv()\n", + "CONNECTION_STRING = os.environ.get(\"COSMOS_DB_CONNECTION_STRING\")\n", + "EMBEDDINGS_DEPLOYMENT_NAME = \"embeddings\"\n", + "COMPLETIONS_DEPLOYMENT_NAME = \"completions\"\n", + "AOAI_ENDPOINT = os.environ.get(\"AOAI_ENDPOINT\")\n", + "AOAI_KEY = os.environ.get(\"AOAI_KEY\")\n", + "AOAI_API_VERSION = \"2024-06-01\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Establish connectivity to the database" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Initialize the Azure Cosmos DB client\n", + "client = CosmosClient.from_connection_string(CONNECTION_STRING)\n", + "\n", + "# Create or load the cosmic_works_pv database\n", + "database_name = \"cosmic_works_pv\"\n", + "db = client.create_database_if_not_exists(id=database_name)" + ] + }, + { + "cell_type": "markdown",
"metadata": {}, + "source": [ + "## Establish Azure OpenAI connectivity" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "ai_client = AzureOpenAI(\n", + " azure_endpoint = AOAI_ENDPOINT,\n", + " api_version = AOAI_API_VERSION,\n", + " api_key = AOAI_KEY\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Vectorize and store the embeddings in each document\n", + "\n", + "The process of creating a vector embedding field on each document only needs to be done once. However, if a document changes, the vector embedding field will need to be updated with an updated vector." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(3))\n", + "def generate_embeddings(text: str):\n", + " '''\n", + " Generate embeddings from string of text using the deployed Azure OpenAI API embeddings model.\n", + " This will be used to vectorize document data and incoming user messages for a similarity search with\n", + " the vector index.\n", + " '''\n", + " response = ai_client.embeddings.create(input=text, model=EMBEDDINGS_DEPLOYMENT_NAME)\n", + " embeddings = response.data[0].embedding\n", + " time.sleep(0.5) # rest period to avoid rate limiting on AOAI\n", + " return embeddings" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# demonstrate embeddings generation using a test string\n", + "test = \"hello, world\"\n", + "print(generate_embeddings(test))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Vectorize and update all product documents in the Cosmic Works database" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "# Create the vector embedding policy\n", + "vector_embedding_policy = {\n", + " \"vectorEmbeddings\": [\n", + " {\n", + " \"path\": \"/contentVector\",\n", + " \"dataType\": \"float32\",\n", + " \"distanceFunction\": \"cosine\",\n", + " \"dimensions\": 1536\n", + " }\n", + " ]\n", + "}\n", + "\n", + "# Create the indexing policy\n", + "indexing_policy = {\n", + " \"indexingMode\": \"consistent\", \n", + " \"automatic\": True, \n", + " \"includedPaths\": [\n", + " {\n", + " \"path\": \"/*\" \n", + " }\n", + " ],\n", + " \"excludedPaths\": [\n", + " {\n", + " \"path\": \"/\\\"_etag\\\"/?\"\n", + " },\n", + " {\n", + " \"path\": \"/contentVector/*\"\n", + " }\n", + " ],\n", + " \"vectorIndexes\": [\n", + " {\n", + " \"path\": \"/contentVector\",\n", + " \"type\": \"diskANN\"\n", + " }\n", + " ]\n", + "}\n", + "\n", + "product_v_container = db.create_container_if_not_exists(\n", + " id=\"product_v\",\n", + " partition_key=PartitionKey(path=\"/categoryId\"),\n", + " indexing_policy=indexing_policy,\n", + " vector_embedding_policy=vector_embedding_policy\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create vector embeddings for all products in the database\n", + "product_container: ContainerProxy = db.create_container_if_not_exists(\n", + " id=\"product\",\n", + " partition_key={\"paths\": [\"/categoryId\"], \"kind\": \"Hash\"}\n", + " )\n", + "\n", + "T = TypeVar('T', bound=BaseModel)\n", + "# Create generic helper function to query items a container.\n", + "# This function re-uses the TypeVar and BaseModel from the Read a document 
example.\n", + "def query_items(container, query, model: Type[T]) -> List[T]:\n", + " query = query\n", + " items = container.query_items(query=query, enable_cross_partition_query=True)\n", + " return [model(**item) for item in items]\n", + "\n", + "# retrieve all products via a query\n", + "retrieved_products = query_items(product_container,\"SELECT * FROM prod\", Product)\n", + "print(f\"Retrieved {len(retrieved_products)} products from the database.\")\n", + "\n", + "print(\"Starting the embedding of each product, this will take 3-5 minutes...\")\n", + "# Populate contentVector field for each product in the product_v container that has vector indexing enabled\n", + "for product in retrieved_products:\n", + " product.content_vector = generate_embeddings(product.model_dump_json(by_alias=True)) \n", + " product_v_container.upsert_item(product.model_dump(by_alias=True))\n", + "\n", + "print(\"Embedding complete and product_v container items updated.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Use vector search in Azure Cosmos DB for NoSQL\n", + "\n", + "Now that each document has its associated vector embedding and the vector indexes have been created on each container, we can now use the vector search capabilities of Azure Cosmos DB for NoSQL." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "def vector_search(\n", + " container: ContainerProxy, \n", + " prompt: str, \n", + " vector_field_name:str=\"contentVector\", \n", + " num_results:int=5):\n", + " query_embedding = generate_embeddings(prompt) \n", + " items = container.query_items(\n", + " query=f\"\"\"SELECT TOP @num_results itm.id, VectorDistance(itm.{vector_field_name}, @embedding) AS SimilarityScore \n", + " FROM itm\n", + " ORDER BY VectorDistance(itm.{vector_field_name}, @embedding)\n", + " \"\"\",\n", + " parameters = [\n", + " { \"name\": \"@num_results\", \"value\": num_results },\n", + " { \"name\": \"@embedding\", \"value\": query_embedding } \n", + " ],\n", + " enable_cross_partition_query=True\n", + " )\n", + " return items" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "prompt = \"What bikes do you have?\"\n", + "results = vector_search(product_v_container, prompt)\n", + "for result in results:\n", + " print(result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "prompt = \"What do you have that is yellow?\"\n", + "results = vector_search(product_v_container, prompt, num_results=4)\n", + "for result in results:\n", + " print(result) " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Use vector search results in a RAG pattern with Chat GPT-3.5" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [], + "source": [ + "# Define a generic function to query an item by its ID\n", + "def query_item_by_id(container, id, model: Type[T]) -> T:\n", + " query = \"SELECT * FROM itm WHERE itm.id = @id\"\n", + " parameters = [\n", + " {\"name\": \"@id\", \"value\": id}\n", + " ] \n", + " item = list(container.query_items(\n", + " query=query,\n", + " parameters=parameters,\n", + " enable_cross_partition_query=True\n", + " ))[0]\n", + " return model(**item)" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "# A system prompt describes the responsibilities, instructions, and 
persona of the AI.\n", + "system_prompt = \"\"\"\n", + "Your name is \"Willie\". You are an AI assistant for the Cosmic Works bike store. You help people find production information for bikes and accessories. Your demeanor is friendly, playful with lots of energy.\n", + "Do not include citations or citation numbers in your responses. Do not include emojis.\n", + "You are designed to answer questions about the products that Cosmic Works sells.\n", + "\n", + "Only answer questions related to the information provided in the list of products below that are represented\n", + "in JSON format.\n", + "\n", + "If you are asked a question that is not in the list, respond with \"I don't know.\"\n", + "\n", + "List of products:\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "def rag_with_vector_search(\n", + " container: ContainerProxy, \n", + " prompt: str, \n", + " vector_field_name:str=\"contentVector\", \n", + " num_results:int=5):\n", + " \"\"\"\n", + " Use the RAG model to generate a prompt using vector search results based on the\n", + " incoming question. \n", + " \"\"\"\n", + " # perform the vector search and build product list\n", + " results = vector_search(container, prompt, vector_field_name, num_results)\n", + " product_list = \"\"\n", + " for result in results:\n", + " # retrieve the product details\n", + " product = query_item_by_id(container, result[\"id\"], Product) \n", + " # remove the contentVector field from the product details, this isn't needed for the context\n", + " product.content_vector = None \n", + " product_list += json.dumps(product, indent=4, default=str) + \"\\n\\n\"\n", + "\n", + " # generate prompt for the LLM with vector results\n", + " formatted_prompt = system_prompt + product_list\n", + "\n", + " # prepare the LLM request\n", + " messages = [\n", + " {\"role\": \"system\", \"content\": formatted_prompt},\n", + " {\"role\": \"user\", \"content\": prompt}\n", + " ]\n", + "\n", + " completion = ai_client.chat.completions.create(messages=messages, model=COMPLETIONS_DEPLOYMENT_NAME)\n", + " return completion.choices[0].message.content" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(rag_with_vector_search(product_v_container, \"What bikes do you have?\"))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(rag_with_vector_search(product_v_container, \"What are the names and skus of yellow products?\"))" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.10" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/diskann/Labs/lab_4_langchain.ipynb b/diskann/Labs/lab_4_langchain.ipynb new file mode 100644 index 0000000..53b2c5e --- /dev/null +++ b/diskann/Labs/lab_4_langchain.ipynb @@ -0,0 +1,512 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# LangChain" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the previous lab, the `azure-cosmos` library was used to perform a vector search through a db command to find product documents that were most similar to the 
user's input. LangChain has a vector store class named [**AzureCosmosDBNoSqlVectorSearch**](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/vectorstores/azure_cosmos_db_no_sql.ipynb), that supports vector search in Azure Cosmos DB for NoSQL. However, at the time of this writing, due to the pace of LangChain development, the current implementation has a bug that impacts the retrieval of search results. As such, the code in this lab will create a LangChain [retriever class](https://python.langchain.com/docs/integrations/retrievers/) to connect to and search the vector store using the `azure-cosmos` library. More on retrievers in a moment." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import json\n", + "import time\n", + "from pydantic import BaseModel\n", + "from typing import Type, TypeVar, List\n", + "from dotenv import load_dotenv\n", + "from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings\n", + "from azure.cosmos import CosmosClient, ContainerProxy\n", + "from langchain_core.documents import Document\n", + "from langchain_core.retrievers import BaseRetriever\n", + "from langchain_core.callbacks import (\n", + " AsyncCallbackManagerForRetrieverRun,\n", + " CallbackManagerForRetrieverRun,\n", + ")\n", + "from langchain_core.prompts import PromptTemplate, ChatPromptTemplate, MessagesPlaceholder\n", + "from langchain_core.output_parsers import StrOutputParser\n", + "from langchain_core.runnables import RunnablePassthrough\n", + "from langchain_core.tools import StructuredTool\n", + "from langchain.agents.agent_toolkits import create_retriever_tool\n", + "from langchain.agents import AgentExecutor, create_openai_functions_agent\n", + "from models import Product, SalesOrder\n", + "\n", + "T = TypeVar('T', bound=BaseModel)" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "# Load settings for the notebook\n", + "load_dotenv()\n", + "CONNECTION_STRING = os.environ.get(\"COSMOS_DB_CONNECTION_STRING\")\n", + "EMBEDDINGS_DEPLOYMENT_NAME = \"embeddings\"\n", + "COMPLETIONS_DEPLOYMENT_NAME = \"completions\"\n", + "AOAI_ENDPOINT = os.environ.get(\"AOAI_ENDPOINT\")\n", + "AOAI_KEY = os.environ.get(\"AOAI_KEY\")\n", + "AOAI_API_VERSION = \"2024-06-01\"" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# Establish Azure OpenAI connectivity\n", + "llm = AzureChatOpenAI( \n", + " temperature = 0,\n", + " openai_api_version = AOAI_API_VERSION,\n", + " azure_endpoint = AOAI_ENDPOINT,\n", + " openai_api_key = AOAI_KEY, \n", + " azure_deployment = \"completions\"\n", + ")\n", + "embedding_model = AzureOpenAIEmbeddings(\n", + " openai_api_version = AOAI_API_VERSION,\n", + " azure_endpoint = AOAI_ENDPOINT,\n", + " openai_api_key = AOAI_KEY, \n", + " azure_deployment = \"embeddings\",\n", + " chunk_size=800\n", + ")\n", + "\n", + "# Initialize the Azure Cosmos DB client, database and product (with vector) container\n", + "client = CosmosClient.from_connection_string(CONNECTION_STRING)\n", + "db = client.get_database_client(\"cosmic_works_pv\")\n", + "product_v_container = db.get_container_client(\"product_v\")\n", + "sales_order_container = db.get_container_client(\"salesOrder\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## RAG with LangChain\n", + "\n", + "Recall that in previous labs the `products_v` container was created with an 
indexing policy and vector embedding policy to enable vector search. Each item in the container contains a contentVector field that contains the vectorized embeddings of the document itself.\n", + "\n", + "In this section, we'll implement the RAG pattern using LangChain. In LangChain, a **retriever** is used to augment the prompt with contextual data. In this case, a custom LangChain retriever is needed. The return value of the invokation of retriever in LangChain is a list of `Document` objects. The LangChain `Document` class contains two properties: `page_content`, that represents the textual content that is typically used to augment the prompt, and `metadata` that contains all other attributes of the document. In this case, we'll use the document content as the `page_content` and include the similarity score as the metadata.\n", + "\n", + "We'll also define a reusable RAG [chain](https://python.langchain.com/docs/modules/chains/) to control the flow and behavior of the call into the LLM. This chain is defined using the LCEL syntax (LangChain Expression Language)." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "class AzureCosmosDBNoSQLRetriever(BaseRetriever):\n", + " \"\"\"\n", + " A custom LangChain retriever that uses Azure Cosmos DB NoSQL database for vector search.\n", + " \"\"\"\n", + " embedding_model: AzureOpenAIEmbeddings\n", + " container: ContainerProxy\n", + " model: Type[T]\n", + " vector_field_name: str\n", + " num_results: int=5\n", + "\n", + " def __get_embeddings(self, text: str) -> List[float]: \n", + " \"\"\"\n", + " Returns embeddings vector for a given text.\n", + " \"\"\"\n", + " embedding = self.embedding_model.embed_query(text) \n", + " time.sleep(0.5) # rest period to avoid rate limiting on AOAI\n", + " return embedding\n", + " \n", + " def __get_item_by_id(self, id) -> T:\n", + " \"\"\"\n", + " Retrieves a single item from the Azure Cosmos DB NoSQL database by its ID.\n", + " \"\"\"\n", + " query = \"SELECT * FROM itm WHERE itm.id = @id\"\n", + " parameters = [\n", + " {\"name\": \"@id\", \"value\": id}\n", + " ] \n", + " item = list(self.container.query_items(\n", + " query=query,\n", + " parameters=parameters,\n", + " enable_cross_partition_query=True\n", + " ))[0]\n", + " return self.model(**item)\n", + " \n", + " def __delete_attribute_by_alias(self, instance: BaseModel, alias):\n", + " for model_field in instance.model_fields:\n", + " field = instance.model_fields[model_field] \n", + " if field.alias == alias:\n", + " delattr(instance, model_field)\n", + " return\n", + " \n", + " def _get_relevant_documents(\n", + " self, query: str, *, run_manager: CallbackManagerForRetrieverRun\n", + " ) -> List[Document]:\n", + " \"\"\"\n", + " Performs a synchronous vector search on the Azure Cosmos DB NoSQL database.\n", + " \"\"\"\n", + " embedding = self.__get_embeddings(query)\n", + " items = self.container.query_items(\n", + " query=f\"\"\"SELECT TOP @num_results itm.id, VectorDistance(itm.{self.vector_field_name}, @embedding) AS SimilarityScore \n", + " FROM itm\n", + " ORDER BY VectorDistance(itm.{self.vector_field_name}, @embedding)\n", + " \"\"\",\n", + " parameters = [\n", + " { \"name\": \"@num_results\", \"value\": self.num_results },\n", + " { \"name\": \"@embedding\", \"value\": embedding } \n", + " ],\n", + " enable_cross_partition_query=True\n", + " ) \n", + " returned_docs = []\n", + " for item in items:\n", + " itm = self.__get_item_by_id(item[\"id\"]) \n", + " # Remove the vector field 
from the returned item so it doesn't fill the context window\n", + " self.__delete_attribute_by_alias(itm, self.vector_field_name) \n", + " returned_docs.append(Document(page_content=json.dumps(itm, indent=4, default=str), metadata={\"similarity_score\": item[\"SimilarityScore\"]}))\n", + " return returned_docs\n", + " \n", + " async def _aget_relevant_documents(\n", + " self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun\n", + " ) -> List[Document]:\n", + " \"\"\"\n", + " Performs an asynchronous vector search on the Azure Cosmos DB NoSQL database. \n", + " \"\"\"\n", + " raise Exception(f\"Asynchronous search not implemented.\")" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "products_retriever = AzureCosmosDBNoSQLRetriever(\n", + " embedding_model = embedding_model,\n", + " container = product_v_container,\n", + " model = Product,\n", + " vector_field_name = \"contentVector\",\n", + " num_results = 5 \n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "query = \"What yellow products are there?\"\n", + "products_retriever.invoke(query)" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "# A system prompt describes the responsibilities, instructions, and persona of the AI.\n", + "# Note the addition of the templated variable/placeholder for the list of products and the incoming question.\n", + "system_prompt = \"\"\"\n", + "Your name is \"Willie\". You are an AI assistant for the Cosmic Works bike store. You help people find production information for bikes and accessories. Your demeanor is friendly, playful with lots of energy.\n", + "\n", + "Do not include citations or citation numbers in your responses. Do not include emojis.\n", + "\n", + "Only answer questions related to the information provided in the list of products below that are represented\n", + "in JSON format.\n", + "\n", + "If you are asked a question that is not in the list, respond with \"I don't know.\"\n", + "\n", + "Only answer questions related to Cosmic Works products, customers, and sales orders.\n", + "\n", + "If a question is not related to Cosmic Works products, customers, or sales orders,\n", + "respond with \"I only answer questions about Cosmic Works\"\n", + "\n", + "List of products:\n", + "{products}\n", + "\n", + "Question:\n", + "{question}\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "# Create the prompt template from the system_prompt text\n", + "llm_prompt = PromptTemplate.from_template(system_prompt)\n", + "\n", + "rag_chain = (\n", + " # populate the tokens/placeholders in the llm_prompt \n", + " # question is a passthrough that takes the incoming question\n", + " { \"products\": products_retriever, \"question\": RunnablePassthrough()}\n", + " | llm_prompt\n", + " # pass the populated prompt to the language model\n", + " | llm\n", + " # return the string ouptut from the language model\n", + " | StrOutputParser()\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "question = \"What are the names and skus of yellow products? 
Output the answer as a bulleted list.\"\n", + "response = rag_chain.invoke(question)\n", + "print(response)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## LangChain Agent" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Remember, the concept of an agent is quite similar to that of a chain in LangChain, but with one fundamental difference. A chain in LangChain is a hard-coded sequence of steps executed in a specific order. Conversely, an agent leverages the LLM to assess the incoming request with the current context to decide what steps or actions need to be executed and in what order.\n", + "\n", + "LangChain agents can leverage tools and toolkits. A tool can be an integration into an external system, custom code, a retriever, or even another chain. A toolkit is a collection of tools that can be used to solve a specific problem." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create Agent Tools\n", + "\n", + "LangChain has a built-in [`create_retriever_tool`](https://python.langchain.com/docs/use_cases/question_answering/conversational_retrieval_agents#retriever-tool) that wraps a vector store retriever." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [], + "source": [ + "# Create a tool that will use the product vector search in Azure Cosmos DB for NoSQL\n", + "products_retriever_tool = create_retriever_tool(\n", + " retriever = products_retriever,\n", + " name = \"vector_search_products\",\n", + " description = \"Searches Cosmic Works product information for similar products based on the question. Returns the product information in JSON format.\"\n", + ")\n", + "tools = [products_retriever_tool]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Tools part 2\n", + "\n", + "Certain properties do not have semantic meaning (such as the GUID `id` field), and attempting to use vector search on these fields will not yield meaningful results. We need tools that retrieve specific documents based on popular search criteria.\n", + "\n", + "The following tool definitions are not an exhaustive list of what may be needed, but they serve as examples of providing concrete lookups of documents in the Cosmic Works database."
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [], + "source": [ + "# Tools helper methods\n", + "def delete_attribute_by_alias(instance: BaseModel, alias:str):\n", + " for model_field in instance.model_fields:\n", + " field = instance.model_fields[model_field] \n", + " if field.alias == alias:\n", + " delattr(instance, model_field)\n", + " return\n", + "\n", + "def get_single_item_by_field_name(\n", + " container:ContainerProxy, \n", + " field_name:str, \n", + " field_value:str, \n", + " model:Type[T]) -> T:\n", + " \"\"\"\n", + " Retrieves a single item from the Azure Cosmos DB NoSQL database by a specific field and value.\n", + " \"\"\"\n", + " query = f\"SELECT TOP 1 * FROM itm WHERE itm.{field_name} = @value\"\n", + " parameters = [\n", + " {\n", + " \"name\": \"@value\", \n", + " \"value\": field_value\n", + " }\n", + " ] \n", + " item = list(container.query_items(\n", + " query=query,\n", + " parameters=parameters,\n", + " enable_cross_partition_query=True\n", + " ))[0]\n", + " item_casted = model(**item) \n", + " return item_casted" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [], + "source": [ + "def get_product_by_id(product_id: str) -> str:\n", + " \"\"\"\n", + " Retrieves a product by its ID. \n", + " \"\"\"\n", + " item = get_single_item_by_field_name(product_v_container, \"id\", product_id, Product)\n", + " delete_attribute_by_alias(item, \"contentVector\")\n", + " return json.dumps(item, indent=4, default=str) \n", + "\n", + "def get_product_by_sku(sku: str) -> str:\n", + " \"\"\"\n", + " Retrieves a product by its sku.\n", + " \"\"\"\n", + " item = get_single_item_by_field_name(product_v_container, \"sku\", sku, Product)\n", + " delete_attribute_by_alias(item, \"contentVector\")\n", + " return json.dumps(item, indent=4, default=str)\n", + " \n", + "def get_sales_by_id(sales_id: str) -> str:\n", + " \"\"\"\n", + " Retrieves a sales order by its ID.\n", + " \"\"\"\n", + " item = get_single_item_by_field_name(sales_order_container, \"id\", sales_id, SalesOrder)\n", + " return json.dumps(item, indent=4, default=str)\n", + "\n", + "tools.extend([\n", + " StructuredTool.from_function(get_product_by_id),\n", + " StructuredTool.from_function(get_product_by_sku),\n", + " StructuredTool.from_function(get_sales_by_id)\n", + "])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create the agent\n", + "\n", + "The [`create_openai_functions_agent`](https://python.langchain.com/docs/use_cases/question_answering/conversational_retrieval_agents#agent-constructor) is a built-in agent that includes conversational history, tools selection, and agent scratchpad (for keeping track of the state of the progress of the LLM interaction).\n", + "\n", + "Remember that an agent leverages the LLM to assess the incoming request with the current context to decide what steps or actions need to be executed and in what order. LangChain agents can leverage tools. A tool can be an integration into an external system, custom code, or even another chain." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "agent_instructions = \"\"\" \n", + " Your name is \"Willie\". You are an AI assistant for the Cosmic Works bike store. You help people find production information for bikes and accessories. Your demeanor is friendly, playful with lots of energy.\n", + " Do not include citations or citation numbers in your responses. 
Do not include emojis.\n", + " You are designed to answer questions about the products that Cosmic Works sells, the customers that buy them, and the sales orders that are placed by customers.\n", + " If you don't know the answer to a question, respond with \"I don't know.\" \n", + " Only answer questions related to Cosmic Works products, customers, and sales orders.\n", + " If a question is not related to Cosmic Works products, customers, or sales orders,\n", + " respond with \"I only answer questions about Cosmic Works\"\n", + " \"\"\" \n", + "\n", + "prompt = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\"system\", agent_instructions),\n", + " MessagesPlaceholder(\"chat_history\", optional=True),\n", + " (\"human\", \"{input}\"),\n", + " MessagesPlaceholder(\"agent_scratchpad\"),\n", + " ]\n", + ") \n", + "agent = create_openai_functions_agent(llm, tools, prompt)\n", + "agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Note**: On the following agent_executor invocations it is safe to ignore the error: `Error in StdOutCallbackHandler.on_chain_start callback: AttributeError(\"'NoneType' object has no attribute 'get'\")` - this is a defect in the verbose debug output of LangChain and does not affect the outcome of the invocation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result = agent_executor.invoke({\"input\": \"What products do you have that are yellow?\"})\n", + "print(\"***********************************************************\")\n", + "print(result['output'])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result = agent_executor.invoke({\"input\": \"What products were purchased for sales order '06FE91D2-B350-471A-AD29-906BF4EB97C4' ?\"})\n", + "print(\"***********************************************************\")\n", + "print(result['output'])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result = agent_executor.invoke({\"input\": \"What was the sales order total for sales order '93436616-4C8A-407D-9FDA-908707EFA2C5' ?\"})\n", + "print(\"***********************************************************\")\n", + "print(result['output'])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "result = agent_executor.invoke({\"input\": \"What was the price of the product with sku `FR-R92B-58` ?\"})\n", + "print(\"***********************************************************\")\n", + "print(result['output'])" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.10" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/diskann/Labs/lab_5_backend_api.md b/diskann/Labs/lab_5_backend_api.md new file mode 100644 index 0000000..8408963 --- /dev/null +++ b/diskann/Labs/lab_5_backend_api.md @@ -0,0 +1,236 @@ +# Backend API + +In the previous lab, a LangChain agent was created armed with tools to do vector lookups and concrete document id lookups via function calling. 
In this lab, the agent functionality is extracted into a backend api that the frontend application will use to allow users to interact with the agent. + +The information provided in this section assumes that the dependent infrastructure is deployed and that the previous labs in this dev guide have been completed. + +## Overview + +The backend api is a Python FastAPI application that will expose endpoints for the frontend application to interact with. The backend api is a containerized application that will be deployed to [Azure Container Apps](https://learn.microsoft.com/azure/container-apps/overview). + +## Clone the Backend API + +Create a folder to house the repository. Open a terminal and navigate to the folder. Clone the repository, then navigate to the `Backend` folder within the repository. + +```bash +git clone https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide.git + +cd Azure-OpenAI-Python-Developer-Guide +cd diskann +cd Backend +``` + +## Run the backend api locally + +When developing a backend api, it is often useful to run the application locally to test and debug. This section outlines how to run the backend api locally while watching the file system for code changes. Any detected changes will automatically restart the backend api. + +1. Open the backend api folder location in VS Code. + +2. Open a **Terminal** window in VS Code (CTRL+`). + +3. Set up the `.env` file. Copy the `.env.example` file to `.env` and update the values. These are the same environment variables used in the previous labs. + + ```bash + cp .env.example .env + ``` + +4. Using the Terminal window, [create a virtual environment and activate it](https://python.land/virtual-environments/virtualenv). + +5. Run the following command to install the dependencies. + + ```bash + pip install -r requirements.txt + ``` + +6. Run the following command to start the backend api in the virtual environment. + + ```bash + uvicorn --host "0.0.0.0" --port 8000 app:app --reload + ``` + + ![The VSCode terminal window displays with the backend API started.](media/local_backend_running_console.png "Local backend api running") + +7. Open a browser and navigate to `http://localhost:8000/docs` to view the Swagger UI. + + ![The Swagger UI displays for the locally running backend api](media/local_backend_swagger_ui.png "Local backend api Swagger UI") + +8. Expand the **GET / Root** endpoint and select **Try it out**. Select **Execute** to send the request. The response should display a status of `ready`. + + ![The Swagger UI displays the GET / Root endpoint response that has a status of ready.](media/local_backend_swagger_ui_root_response.png "Local backend api Swagger UI Root response") + +9. Expand the **POST /ai** endpoint and select **Try it out**. In the **Request body** field, enter the following JSON. + + ```json + { + "session_id": "abc123", + "prompt": "What was the price of the product with sku `FR-R92B-58`" + } + ``` + +10. Select **Execute** to send the request. Observe that the response indicates the price as being `$1431.50`. + + ![The Swagger UI displays the response from the POST /ai endpoint.](media/local_backend_swagger_ui_ai_response.png "Local backend api Swagger UI AI response") + +11. In the Terminal window, press CTRL+C to stop the backend api. + +## Build and run the backend api container locally in Docker Desktop + +When deployed to Azure, the backend api will be running in a container. It is important to test the container locally to ensure it is working as expected.
Containers are important because they provide a consistent environment for the application to run in. This consistency allows the application to run the same way in development, test, and production environments, whether running locally or in the cloud. + +The backend api contains a `Dockerfile` that defines the container image; Docker uses the instructions in this file to build the container image. The container image is a snapshot of the application and its dependencies, and it can be thought of as an installer for the application that can be deployed as needed in any environment. + +The `Dockerfile` for the backend api is shown below. + +```dockerfile +FROM python:3.11 + +WORKDIR /code +COPY ./requirements.txt /code/requirements.txt +RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt +COPY . /code + +EXPOSE 80 +ENV FORWARDED_ALLOW_IPS * + +CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80", "--forwarded-allow-ips", "*", "--proxy-headers"] +``` + +Notice the steps that install the pip dependencies and run the **uvicorn** command line, similar to what was done in the previous section. + +1. Ensure Docker Desktop is running. + +2. Open a **Terminal** window in VS Code (CTRL+`). + +3. If the terminal displays the virtual environment name, deactivate the virtual environment by running the following command. + + ```bash + deactivate + ``` + +4. Run the following command to build the container image. Once complete, a message displays that the operation has `FINISHED`. + + ```bash + docker build --pull --rm -f "DOCKERFILE" -t devguidebackendapi:latest "." + ``` + + ![The VSCode terminal window displays the docker build command and the FINISHED message.](media/local_backend_docker_build.png "Local backend api Docker build") + +5. Lastly, run the container in Docker Desktop using the following command. + + ```bash + docker run -d -p 4242:80 --name devguide-backend-api devguidebackendapi:latest + ``` + + ![The VSCode terminal window displays the docker run command.](media/local_backend_docker_run.png "Local backend api Docker run") + +6. Open a browser and navigate to `http://localhost:4242/docs` to view the Swagger UI. + +7. Repeat steps 8-10 from the previous section to test the backend api running in a container on Docker Desktop. + +## Deploy the backend api to Azure Container Apps + +### Retrieve the Azure Container Registry login server and admin credentials + +The backend api container image needs to be pushed to an Azure Container Registry (ACR) before it can be deployed to Azure Container Apps. The ACR is a private container registry that will store the container image. Azure Container Apps will pull the container image from the ACR to deploy the backend api. + +1. In the Azure portal, open the provisioned resource group and locate and open the **Container Registry** resource. + +2. Expand the **Settings** section in the left-hand menu and select **Access keys**. Record the **Login server** value and the **Username** and **Password** values for later use. + + ![The Azure portal displays the Container Registry resource with the Access keys menu item selected. The login server, username and password values are highlighted.](media/acr_access_keys.png "Container Registry Access keys") + +### Push the backend api container image to the Azure Container Registry + +Earlier, the backend api container image was built locally.
Now that the ACR login server and admin credentials are known, the container image can be pushed to the Azure Container Registry. + +1. Return to the terminal window in VS Code. + +2. Run the following command to tag the container image with the ACR login server. Replace the `` value. This command will silently complete with no output. + + ```bash + docker tag devguidebackendapi:latest /devguidebackendapi:v1 + ``` + +3. Run the following command to log into the ACR. Replace the ``, ``, and `` values. The message `Login Succeeded` displays when the login is successful. + + ```bash + docker login -u -p + ``` + +4. Once authenticated, push the container image to the ACR using the following command. Replace the `` value. + + ```bash + docker push /devguidebackendapi:v1 + ``` + + ![The VSCode terminal window displays the docker login and docker push commands.](media/local_backend_docker_push.png "Local backend api Docker push") + +### Deploy the backend api ACR container image to Azure Container Apps + +The last step is to deploy the backend api container image to Azure Container Apps. Azure Container Apps is a fully managed serverless container platform that allows developers to deploy containerized applications without having to manage any infrastructure. + +1. In the Azure portal, open the provisioned resource group, then locate and open the **Container App** resource; the name will end in `-api`. Record the name of this resource. + +2. Back in the resource group, locate the **Container Apps Environment** resource; the name will end in `-containerappenv`. Record the name of this resource. + +3. Also record the name of the resource group. + +4. Return to the terminal window in VS Code. + +5. Log into the Azure CLI using the following command. + + ```bash + az login + ``` + +6. Optionally, set the current subscription to the correct subscription using the following command. + + ```bash + az account set --subscription + ``` + +7. Install the Azure Container Apps extension for the CLI using the following command. + + ```bash + az extension add --name containerapp --upgrade + ``` + +8. Run the following command to deploy the backend api container image to the existing Azure Container Apps resource. Replace the ``, ``, ``, and `` values. + + ```bash + az containerapp up --name --image /devguidebackendapi:v1 --resource-group --environment --ingress external + ``` + +9. In the Azure portal, locate and open the **Container App** resource ending in `-api`. + +10. From the left menu, expand the **Application** section and select **Revisions and replicas**. Notice that on this screen there is a failed container revision (it may be necessary to select **Refresh** from the top toolbar menu). This is because the `hello-world` container is running at the same binding address as the backend api container. + + ![The Azure portal displays the Container App resource with a failed container revision listed.](media/container_app_failed_revision.png "Container App Failed Revision") + +11. View the error from the logs, or optionally expand the **Monitoring** section of the left menu and select **Log stream**. Be sure to select the `-api` container. + + ![The Azure portal displays the Container App resource with the Log stream menu item selected and the output of the api container with the error highlighted.](media/container_app_log_stream.png "Container App Log Stream") + +12. To rectify this, the `hello-world` container needs to be deleted. Expand the **Application** section of the left menu and select **Containers**.
Choose the **Edit and Deploy** button from the toolbar. + + ![The Azure portal displays the Container App resource with the Containers menu item selected. The Edit and Deploy button is highlighted.](media/container_app_edit_and_deploy.png "Container App Edit and Deploy") + +13. On the **Create and deploy new revision** screen, beneath the **Container image** heading, check the box next to the `hello-world` container, then select **Delete**. + + ![The Azure portal displays the Create and deploy new revision screen with the hello-world container selected.](media/container_app_delete_hello_world.png "Container App Delete hello-world") + +14. Select **Create** to deploy the new revision. In less than a minute, the new revision is deployed. + +15. Refresh the browser, then from the left menu, select **Overview**. Select the **Application URL** link. + + ![The Azure portal displays the Container App resource Overview screen with the Application URL highlighted.](media/container_app_overview.png "Container App Overview") + +16. The UI should show the status as `ready`. + + ![The Azure portal displays the Container App resource with the Overview menu item selected. The Application URL is highlighted.](media/container_app_ready.png "Container App Ready") + +17. In the address bar of the browser, append `/docs` to the URL and press ENTER to view the Swagger UI. + +18. Repeat steps 8-10 from the [Run the backend api locally section](#run-the-backend-api-locally) to test the backend api running in a container on Azure Container Apps. + +Congratulations on the successful deployment of the backend api to Azure Container Apps where it is ready to service the frontend application. diff --git a/Labs/media/2024-01-06-20-01-38.png b/diskann/Labs/media/2024-01-06-20-01-38.png similarity index 100% rename from Labs/media/2024-01-06-20-01-38.png rename to diskann/Labs/media/2024-01-06-20-01-38.png diff --git a/diskann/Labs/media/acr_access_keys.png b/diskann/Labs/media/acr_access_keys.png new file mode 100644 index 0000000..461d681 Binary files /dev/null and b/diskann/Labs/media/acr_access_keys.png differ diff --git a/diskann/Labs/media/container_app_delete_hello_world.png b/diskann/Labs/media/container_app_delete_hello_world.png new file mode 100644 index 0000000..dd64d5b Binary files /dev/null and b/diskann/Labs/media/container_app_delete_hello_world.png differ diff --git a/diskann/Labs/media/container_app_edit_and_deploy.png b/diskann/Labs/media/container_app_edit_and_deploy.png new file mode 100644 index 0000000..bdd1f92 Binary files /dev/null and b/diskann/Labs/media/container_app_edit_and_deploy.png differ diff --git a/diskann/Labs/media/container_app_failed_revision.png b/diskann/Labs/media/container_app_failed_revision.png new file mode 100644 index 0000000..ed596d6 Binary files /dev/null and b/diskann/Labs/media/container_app_failed_revision.png differ diff --git a/diskann/Labs/media/container_app_log_stream.png b/diskann/Labs/media/container_app_log_stream.png new file mode 100644 index 0000000..bc1135d Binary files /dev/null and b/diskann/Labs/media/container_app_log_stream.png differ diff --git a/diskann/Labs/media/container_app_overview.png b/diskann/Labs/media/container_app_overview.png new file mode 100644 index 0000000..fb2ac28 Binary files /dev/null and b/diskann/Labs/media/container_app_overview.png differ diff --git a/diskann/Labs/media/container_app_ready.png b/diskann/Labs/media/container_app_ready.png new file mode 100644 index 0000000..cce09c1 Binary files /dev/null and 
b/diskann/Labs/media/container_app_ready.png differ diff --git a/diskann/Labs/media/local_backend_docker_build.png b/diskann/Labs/media/local_backend_docker_build.png new file mode 100644 index 0000000..c2b3d52 Binary files /dev/null and b/diskann/Labs/media/local_backend_docker_build.png differ diff --git a/diskann/Labs/media/local_backend_docker_push.png b/diskann/Labs/media/local_backend_docker_push.png new file mode 100644 index 0000000..c803ab5 Binary files /dev/null and b/diskann/Labs/media/local_backend_docker_push.png differ diff --git a/diskann/Labs/media/local_backend_docker_run.png b/diskann/Labs/media/local_backend_docker_run.png new file mode 100644 index 0000000..3bce36e Binary files /dev/null and b/diskann/Labs/media/local_backend_docker_run.png differ diff --git a/diskann/Labs/media/local_backend_running_console.png b/diskann/Labs/media/local_backend_running_console.png new file mode 100644 index 0000000..052c57f Binary files /dev/null and b/diskann/Labs/media/local_backend_running_console.png differ diff --git a/diskann/Labs/media/local_backend_swagger_ui.png b/diskann/Labs/media/local_backend_swagger_ui.png new file mode 100644 index 0000000..b08d72f Binary files /dev/null and b/diskann/Labs/media/local_backend_swagger_ui.png differ diff --git a/Labs/media/local_backend_swagger_ui_ai_response.png b/diskann/Labs/media/local_backend_swagger_ui_ai_response.png similarity index 100% rename from Labs/media/local_backend_swagger_ui_ai_response.png rename to diskann/Labs/media/local_backend_swagger_ui_ai_response.png diff --git a/Labs/media/local_backend_swagger_ui_root_response.png b/diskann/Labs/media/local_backend_swagger_ui_root_response.png similarity index 100% rename from Labs/media/local_backend_swagger_ui_root_response.png rename to diskann/Labs/media/local_backend_swagger_ui_root_response.png diff --git a/diskann/Labs/models/__init__.py b/diskann/Labs/models/__init__.py new file mode 100644 index 0000000..a841b4e --- /dev/null +++ b/diskann/Labs/models/__init__.py @@ -0,0 +1,11 @@ +""" +This module contains the model definitions of objects +that are present in the Cosmic Works dataset. +""" +from .tag import Tag +from .product import Product, ProductList +from .address import Address +from .password import Password +from .customer import Customer, CustomerList +from .sales_order_detail import SalesOrderDetail +from .sales_order import SalesOrder, SalesOrderList diff --git a/diskann/Labs/models/address.py b/diskann/Labs/models/address.py new file mode 100644 index 0000000..45b21a7 --- /dev/null +++ b/diskann/Labs/models/address.py @@ -0,0 +1,16 @@ +""" +Address model +""" +from pydantic import BaseModel, Field + +class Address(BaseModel): + """ + The Address class represents the structure of + an address in the Cosmic Works dataset. + """ + address_line_1: str = Field(alias="addressLine1") + address_line_2: str = Field(alias="addressLine2") + city: str + state: str + country: str + zip_code: str = Field(alias="zipCode") diff --git a/diskann/Labs/models/customer.py b/diskann/Labs/models/customer.py new file mode 100644 index 0000000..2942078 --- /dev/null +++ b/diskann/Labs/models/customer.py @@ -0,0 +1,48 @@ +""" +Customer and CustomerList models +""" +from datetime import datetime +from typing import List, Optional +from pydantic import BaseModel, Field +from .address import Address +from .password import Password + +class Customer(BaseModel): + """ + The Customer class represents a customer in the + Cosmic Works dataset. 
+ + The alias fields are used to map the dataset + field names to the pythonic property names. + """ + id: str = Field(alias="id") + customer_id: str = Field(alias="customerId") + title: Optional[str] + first_name: str = Field(alias="firstName") + last_name: str = Field(alias="lastName") + email_address: str = Field(alias="emailAddress") + phone_number: str = Field(alias="phoneNumber") + creation_date: datetime = Field(alias="creationDate") + addresses: List[Address] + password: Password + sales_order_count: int = Field(alias="salesOrderCount") + + class Config: + """ + The Config inner class is used to configure the + behavior of the Pydantic model. In this case, + the Pydantic model will be able to deserialize + data by both the field name and the field alias. + """ + populate_by_name = True + json_encoders = { + datetime: lambda v: v.isoformat() + } + +class CustomerList(BaseModel): + """ + The CustomerList class represents a list of customers. + This class is used when deserializing a container/array + of customers. + """ + items: List[Customer] diff --git a/diskann/Labs/models/password.py b/diskann/Labs/models/password.py new file mode 100644 index 0000000..f7836f5 --- /dev/null +++ b/diskann/Labs/models/password.py @@ -0,0 +1,12 @@ +""" +Password model +""" +from pydantic import BaseModel + +class Password(BaseModel): + """ + The Password class represents the structure of + a password in the Cosmic Works dataset. + """ + hash: str + salt: str diff --git a/diskann/Labs/models/product.py b/diskann/Labs/models/product.py new file mode 100644 index 0000000..6613df5 --- /dev/null +++ b/diskann/Labs/models/product.py @@ -0,0 +1,38 @@ +""" +Product model +""" +from typing import List, Optional +from pydantic import BaseModel, Field +from .tag import Tag + +class Product(BaseModel): + """ + The Product class represents a product in the + Cosmic Works dataset. + """ + id: str = Field(default=None, alias="id") + category_id: str = Field(alias="categoryId") + category_name: str = Field(alias="categoryName") + sku: str + name: str + description: str + price: float + tags: Optional[List[Tag]] = [] + content_vector: Optional[List[float]] = Field(default=[], alias="contentVector") + + class Config: + """ + The Config inner class is used to configure the + behavior of the Pydantic model. In this case, + the Pydantic model will be able to deserialize + data by both the field name and the field alias. + """ + populate_by_name = True + +class ProductList(BaseModel): + """ + The ProductList class represents a list of products. + This class is used when deserializing a collection/array + of products. + """ + items: List[Product] diff --git a/diskann/Labs/models/sales_order.py b/diskann/Labs/models/sales_order.py new file mode 100644 index 0000000..7b1bb8b --- /dev/null +++ b/diskann/Labs/models/sales_order.py @@ -0,0 +1,39 @@ +""" +SalesOrder model +""" +from datetime import datetime +from typing import List +from pydantic import BaseModel, Field +from .sales_order_detail import SalesOrderDetail + +class SalesOrder(BaseModel): + """ + The SalesOrder class represents a sales order in the + Cosmic Works dataset. + """ + id: str = Field(alias="id") + customer_id: str = Field(alias="customerId") + order_date: datetime = Field(alias="orderDate") + ship_date: datetime = Field(alias="shipDate") + details: List[SalesOrderDetail] + + class Config: + """ + The Config inner class is used to configure the + behavior of the Pydantic model.
In this case, + the Pydantic model will be able to deserialize + data by both the field name and the field alias. + """ + populate_by_name = True + json_encoders = { + datetime: lambda v: v.isoformat() + } + +class SalesOrderList(BaseModel): + """ + The SalesOrderList class represents a list of sales orders. + + This class is used when deserializing a container/array + of sales orders. + """ + items: List[SalesOrder] diff --git a/diskann/Labs/models/sales_order_detail.py b/diskann/Labs/models/sales_order_detail.py new file mode 100644 index 0000000..97ac275 --- /dev/null +++ b/diskann/Labs/models/sales_order_detail.py @@ -0,0 +1,14 @@ +""" +SalesOrderDetail model +""" +from pydantic import BaseModel + +class SalesOrderDetail(BaseModel): + """ + The SalesOrderDetail class represents invoice line items + for the Sales Order in the Cosmic Works dataset. + """ + sku: str + name: str + price: float + quantity: int diff --git a/diskann/Labs/models/tag.py b/diskann/Labs/models/tag.py new file mode 100644 index 0000000..c208693 --- /dev/null +++ b/diskann/Labs/models/tag.py @@ -0,0 +1,23 @@ +""" +Tag model +""" +from pydantic import BaseModel, Field + +class Tag(BaseModel): + """ + The Tag class represents a tag in the + Cosmic Works dataset. + + Tags are metadata associated with a product. + """ + id: str = Field(default=None, alias="_id") + name: str + + class Config: + """ + The Config inner class is used to configure the + behavior of the Pydantic model. In this case, + the Pydantic model will be able to deserialize + data by both the field name and the field alias. + """ + populate_by_name = True diff --git a/diskann/Labs/requirements.txt b/diskann/Labs/requirements.txt new file mode 100644 index 0000000..89479ec --- /dev/null +++ b/diskann/Labs/requirements.txt @@ -0,0 +1,11 @@ +azure-cosmos==4.7.0 +python-dotenv==1.0.1 +requests==2.32.3 +pydantic==2.9.1 +openai==1.45.0 +tenacity==8.5.0 +langchain==0.3.0 +langchain-openai==0.2.0 +tiktoken==0.7.0 +fastapi==0.114.2 +uvicorn==0.30.6 \ No newline at end of file diff --git a/diskann/README.md b/diskann/README.md new file mode 100644 index 0000000..060e4fd --- /dev/null +++ b/diskann/README.md @@ -0,0 +1,18 @@ +# Azure Cosmos DB for NoSQL + DiskANN + Azure OpenAI Python Developer Guide + +1. [Introduction](00_Introduction/README.md) +1. [Azure Overview](01_Azure_Overview/README.md) +1. [Overview of Azure Cosmos DB](02_Overview_Cosmos_DB/README.md) +1. [Overview of Azure OpenAI](03_Overview_Azure_OpenAI/README.md) +1. [Overview of AI Concepts](04_Overview_AI_Concepts/README.md) +1. [Explore the Azure OpenAI models and endpoints (console app)](05_Explore_OpenAI_models/README.md) +1. [Provision Azure resources](06_Provision_Azure_Resources/README.md) +1. [Create your first Azure Cosmos DB project](07_Create_First_Cosmos_DB_Project/README.md) +1. [Load data into Azure Cosmos DB NoSQL API](08_Load_Data/README.md) +1. [Use vector search on embeddings in Azure Cosmos DB for NoSQL](09_Vector_Search_Cosmos_DB/README.md) +1. [LangChain](10_LangChain/README.md) +1. [Backend API](11_Backend_API/README.md) +1. [Connect the chat user interface with the chatbot API](12_User_Interface/README.md) +1. 
[Conclusion](13_Conclusion/README.md) + +![Azure Cosmos DB for NoSQL + DiskANN + Azure OpenAI Python Developer Guide Architecture Diagram](06_Provision_Azure_Resources/media/architecture.jpg) diff --git a/diskann/assets/Graphics.pptx b/diskann/assets/Graphics.pptx new file mode 100644 index 0000000..651ed5d Binary files /dev/null and b/diskann/assets/Graphics.pptx differ diff --git a/diskann/assets/architecture.pptx b/diskann/assets/architecture.pptx new file mode 100644 index 0000000..541b99d Binary files /dev/null and b/diskann/assets/architecture.pptx differ diff --git a/00_Introduction/README.md b/vcore/00_Introduction/README.md similarity index 100% rename from 00_Introduction/README.md rename to vcore/00_Introduction/README.md diff --git a/01_Azure_Overview/README.md b/vcore/01_Azure_Overview/README.md similarity index 100% rename from 01_Azure_Overview/README.md rename to vcore/01_Azure_Overview/README.md diff --git a/vcore/01_Azure_Overview/media/2023-12-31-14-19-28.png b/vcore/01_Azure_Overview/media/2023-12-31-14-19-28.png new file mode 100644 index 0000000..60e32b2 Binary files /dev/null and b/vcore/01_Azure_Overview/media/2023-12-31-14-19-28.png differ diff --git a/vcore/01_Azure_Overview/media/ISV-Tech-Builders-tools-white.png b/vcore/01_Azure_Overview/media/ISV-Tech-Builders-tools-white.png new file mode 100644 index 0000000..57a8ffd Binary files /dev/null and b/vcore/01_Azure_Overview/media/ISV-Tech-Builders-tools-white.png differ diff --git a/vcore/01_Azure_Overview/media/azure-cli-example.png b/vcore/01_Azure_Overview/media/azure-cli-example.png new file mode 100644 index 0000000..57072ac Binary files /dev/null and b/vcore/01_Azure_Overview/media/azure-cli-example.png differ diff --git a/vcore/01_Azure_Overview/media/azure-management-tool-maturity.png b/vcore/01_Azure_Overview/media/azure-management-tool-maturity.png new file mode 100644 index 0000000..7835d8f Binary files /dev/null and b/vcore/01_Azure_Overview/media/azure-management-tool-maturity.png differ diff --git a/01_Azure_Overview/media/azure-marketplace-search-results.png b/vcore/01_Azure_Overview/media/azure-marketplace-search-results.png similarity index 100% rename from 01_Azure_Overview/media/azure-marketplace-search-results.png rename to vcore/01_Azure_Overview/media/azure-marketplace-search-results.png diff --git a/vcore/01_Azure_Overview/media/azure-portal-services.png b/vcore/01_Azure_Overview/media/azure-portal-services.png new file mode 100644 index 0000000..a1d2816 Binary files /dev/null and b/vcore/01_Azure_Overview/media/azure-portal-services.png differ diff --git a/vcore/01_Azure_Overview/media/azure-services.png b/vcore/01_Azure_Overview/media/azure-services.png new file mode 100644 index 0000000..5aa4cf2 Binary files /dev/null and b/vcore/01_Azure_Overview/media/azure-services.png differ diff --git a/01_Azure_Overview/media/azure-template-json-example.png b/vcore/01_Azure_Overview/media/azure-template-json-example.png similarity index 100% rename from 01_Azure_Overview/media/azure-template-json-example.png rename to vcore/01_Azure_Overview/media/azure-template-json-example.png diff --git a/01_Azure_Overview/media/bicep_code.png b/vcore/01_Azure_Overview/media/bicep_code.png similarity index 100% rename from 01_Azure_Overview/media/bicep_code.png rename to vcore/01_Azure_Overview/media/bicep_code.png diff --git a/vcore/01_Azure_Overview/media/consistent-management-layer.png b/vcore/01_Azure_Overview/media/consistent-management-layer.png new file mode 100644 index 0000000..73235de Binary files
/dev/null and b/vcore/01_Azure_Overview/media/consistent-management-layer.png differ diff --git a/vcore/01_Azure_Overview/media/landing-zone-accelerator.png b/vcore/01_Azure_Overview/media/landing-zone-accelerator.png new file mode 100644 index 0000000..a385076 Binary files /dev/null and b/vcore/01_Azure_Overview/media/landing-zone-accelerator.png differ diff --git a/vcore/01_Azure_Overview/media/scope-levels.png b/vcore/01_Azure_Overview/media/scope-levels.png new file mode 100644 index 0000000..d28ef0a Binary files /dev/null and b/vcore/01_Azure_Overview/media/scope-levels.png differ diff --git a/01_Azure_Overview/media/terraform_code.png b/vcore/01_Azure_Overview/media/terraform_code.png similarity index 100% rename from 01_Azure_Overview/media/terraform_code.png rename to vcore/01_Azure_Overview/media/terraform_code.png diff --git a/02_Overview_Cosmos_DB/README.md b/vcore/02_Overview_Cosmos_DB/README.md similarity index 100% rename from 02_Overview_Cosmos_DB/README.md rename to vcore/02_Overview_Cosmos_DB/README.md diff --git a/03_Overview_Azure_OpenAI/README.md b/vcore/03_Overview_Azure_OpenAI/README.md similarity index 100% rename from 03_Overview_Azure_OpenAI/README.md rename to vcore/03_Overview_Azure_OpenAI/README.md diff --git a/03_Overview_Azure_OpenAI/images/2024-01-23-17-52-46.png b/vcore/03_Overview_Azure_OpenAI/images/2024-01-23-17-52-46.png similarity index 100% rename from 03_Overview_Azure_OpenAI/images/2024-01-23-17-52-46.png rename to vcore/03_Overview_Azure_OpenAI/images/2024-01-23-17-52-46.png diff --git a/04_Overview_AI_Concepts/README.md b/vcore/04_Overview_AI_Concepts/README.md similarity index 100% rename from 04_Overview_AI_Concepts/README.md rename to vcore/04_Overview_AI_Concepts/README.md diff --git a/05_Explore_OpenAI_models/README.md b/vcore/05_Explore_OpenAI_models/README.md similarity index 97% rename from 05_Explore_OpenAI_models/README.md rename to vcore/05_Explore_OpenAI_models/README.md index fbe3b91..2c357c3 100644 --- a/05_Explore_OpenAI_models/README.md +++ b/vcore/05_Explore_OpenAI_models/README.md @@ -68,4 +68,4 @@ This labs demonstrates using an Azure OpenAI model to obtain a completion respon >**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. -Visit the lab repository to complete [this lab](https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide/blob/main/Labs/lab_0_explore_and_use_models.ipynb). +Visit the lab repository to complete [this lab](/vcore/Labs/lab_0_explore_and_use_models.ipynb). diff --git a/vcore/05_Explore_OpenAI_models/media/2024-01-09-13-53-51.png b/vcore/05_Explore_OpenAI_models/media/2024-01-09-13-53-51.png new file mode 100644 index 0000000..0fe13f2 Binary files /dev/null and b/vcore/05_Explore_OpenAI_models/media/2024-01-09-13-53-51.png differ diff --git a/06_Provision_Azure_Resources/README.md b/vcore/06_Provision_Azure_Resources/README.md similarity index 92% rename from 06_Provision_Azure_Resources/README.md rename to vcore/06_Provision_Azure_Resources/README.md index 71c85b7..8cae6db 100644 --- a/06_Provision_Azure_Resources/README.md +++ b/vcore/06_Provision_Azure_Resources/README.md @@ -25,4 +25,4 @@ This lab will walk you through deploying the Azure resources necessary for the s > **Note**: You will need an Azure Subscription and have the necessary permissions to provision the Azure resources. 
-Please visit the lab repository to complete [this lab](https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide/blob/main/Labs/deploy/deploy.md). +Please visit the lab repository to complete [this lab](/vcore/Labs/deploy/deploy.md). diff --git a/06_Provision_Azure_Resources/media/architecture.jpg b/vcore/06_Provision_Azure_Resources/media/architecture.jpg similarity index 100% rename from 06_Provision_Azure_Resources/media/architecture.jpg rename to vcore/06_Provision_Azure_Resources/media/architecture.jpg diff --git a/07_Create_First_Cosmos_DB_Project/README.md b/vcore/07_Create_First_Cosmos_DB_Project/README.md similarity index 97% rename from 07_Create_First_Cosmos_DB_Project/README.md rename to vcore/07_Create_First_Cosmos_DB_Project/README.md index bf24c50..c694155 100644 --- a/07_Create_First_Cosmos_DB_Project/README.md +++ b/vcore/07_Create_First_Cosmos_DB_Project/README.md @@ -38,7 +38,7 @@ Using a notebook, we'll create a Cosmos DB for the MongoDB application in this l >**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. -Please visit the lab repository to complete [this lab](https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide/blob/main/Labs/lab_1_first_application.ipynb). +Please visit the lab repository to complete [this lab](/vcore/Labs/lab_1_first_application.ipynb). The following concepts are covered in detail in this lab: diff --git a/07_Create_First_Cosmos_DB_Project/media/azure_connection_string.png b/vcore/07_Create_First_Cosmos_DB_Project/media/azure_connection_string.png similarity index 100% rename from 07_Create_First_Cosmos_DB_Project/media/azure_connection_string.png rename to vcore/07_Create_First_Cosmos_DB_Project/media/azure_connection_string.png diff --git a/07_Create_First_Cosmos_DB_Project/media/emulator_connection_string.png b/vcore/07_Create_First_Cosmos_DB_Project/media/emulator_connection_string.png similarity index 100% rename from 07_Create_First_Cosmos_DB_Project/media/emulator_connection_string.png rename to vcore/07_Create_First_Cosmos_DB_Project/media/emulator_connection_string.png diff --git a/08_Load_Data/README.md b/vcore/08_Load_Data/README.md similarity index 94% rename from 08_Load_Data/README.md rename to vcore/08_Load_Data/README.md index ce5a9ba..4627ad4 100644 --- a/08_Load_Data/README.md +++ b/vcore/08_Load_Data/README.md @@ -18,7 +18,7 @@ This lab will load the Cosmic Works Customer, Product, and Sales data into Azure >**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. -Please visit the lab repository to complete [this lab](https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide/blob/main/Labs/lab_2_load_data.ipynb). +Please visit the lab repository to complete [this lab](/vcore/Labs/lab_2_load_data.ipynb). This lab demonstrates the use of bulk operations to load product, customer, and sales data into Azure Cosmos DB API for MongoDB collections. 
As an example, the following code snippet inserts product data using the `bulk_write` method allowing for upsert functionality using the `UpdateOne` method: diff --git a/09_Vector_Search_Cosmos_DB/README.md b/vcore/09_Vector_Search_Cosmos_DB/README.md similarity index 98% rename from 09_Vector_Search_Cosmos_DB/README.md rename to vcore/09_Vector_Search_Cosmos_DB/README.md index 6440d21..7960589 100644 --- a/09_Vector_Search_Cosmos_DB/README.md +++ b/vcore/09_Vector_Search_Cosmos_DB/README.md @@ -42,7 +42,7 @@ This lab also requires the data provided in the previous lab titled [Load data i >**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. -Please visit the lab repository to complete [this lab](https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide/blob/main/Labs/lab_3_mongodb_vector_search.ipynb). +Please visit the lab repository to complete [this lab](/vcore/Labs/lab_3_mongodb_vector_search.ipynb). Some highlights from the lab include: diff --git a/vcore/09_Vector_Search_Cosmos_DB/media/azure_openai_settings.png b/vcore/09_Vector_Search_Cosmos_DB/media/azure_openai_settings.png new file mode 100644 index 0000000..859306f Binary files /dev/null and b/vcore/09_Vector_Search_Cosmos_DB/media/azure_openai_settings.png differ diff --git a/vcore/09_Vector_Search_Cosmos_DB/media/azure_openai_studio_settings_icon.png b/vcore/09_Vector_Search_Cosmos_DB/media/azure_openai_studio_settings_icon.png new file mode 100644 index 0000000..3f94ac7 Binary files /dev/null and b/vcore/09_Vector_Search_Cosmos_DB/media/azure_openai_studio_settings_icon.png differ diff --git a/09_Vector_Search_Cosmos_DB/media/embedding_pipeline.png b/vcore/09_Vector_Search_Cosmos_DB/media/embedding_pipeline.png similarity index 100% rename from 09_Vector_Search_Cosmos_DB/media/embedding_pipeline.png rename to vcore/09_Vector_Search_Cosmos_DB/media/embedding_pipeline.png diff --git a/vcore/09_Vector_Search_Cosmos_DB/media/vector_search_flow.png b/vcore/09_Vector_Search_Cosmos_DB/media/vector_search_flow.png new file mode 100644 index 0000000..97c5c31 Binary files /dev/null and b/vcore/09_Vector_Search_Cosmos_DB/media/vector_search_flow.png differ diff --git a/10_LangChain/README.md b/vcore/10_LangChain/README.md similarity index 98% rename from 10_LangChain/README.md rename to vcore/10_LangChain/README.md index ca6c577..410c6b1 100644 --- a/10_LangChain/README.md +++ b/vcore/10_LangChain/README.md @@ -26,7 +26,7 @@ This lab also requires the data provided in the previous lab titled [Load data i >**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. -Please visit the lab repository to complete [this lab](https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide/blob/main/Labs/lab_4_langchain.ipynb). +Please visit the lab repository to complete [this lab](/vcore/Labs/lab_4_langchain.ipynb). 
Some highlights of the lab include: diff --git a/vcore/10_LangChain/media/langchain_rag.png b/vcore/10_LangChain/media/langchain_rag.png new file mode 100644 index 0000000..6e95516 Binary files /dev/null and b/vcore/10_LangChain/media/langchain_rag.png differ diff --git a/11_Backend_API/README.md b/vcore/11_Backend_API/README.md similarity index 89% rename from 11_Backend_API/README.md rename to vcore/11_Backend_API/README.md index e522144..0673405 100644 --- a/11_Backend_API/README.md +++ b/vcore/11_Backend_API/README.md @@ -8,4 +8,4 @@ This lab also requires the data provided in the previous lab titled [Load data i >**Note**: It is highly recommended to use a [virtual environment](https://python.land/virtual-environments/virtualenv) for all labs. -Please visit the lab repository to complete [this lab](https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide/blob/main/Labs/lab_4_langchain.ipynb). +Please visit the lab repository to complete [this lab](/vcore/Labs/lab_4_langchain.ipynb). diff --git a/vcore/11_Backend_API/media/2024-01-06-20-01-38.png b/vcore/11_Backend_API/media/2024-01-06-20-01-38.png new file mode 100644 index 0000000..6c8be9e Binary files /dev/null and b/vcore/11_Backend_API/media/2024-01-06-20-01-38.png differ diff --git a/vcore/11_Backend_API/media/acr_access_keys.png b/vcore/11_Backend_API/media/acr_access_keys.png new file mode 100644 index 0000000..4a239f1 Binary files /dev/null and b/vcore/11_Backend_API/media/acr_access_keys.png differ diff --git a/vcore/11_Backend_API/media/container_app_delete_hello_world.png b/vcore/11_Backend_API/media/container_app_delete_hello_world.png new file mode 100644 index 0000000..d0e6b42 Binary files /dev/null and b/vcore/11_Backend_API/media/container_app_delete_hello_world.png differ diff --git a/vcore/11_Backend_API/media/container_app_edit_and_deploy.png b/vcore/11_Backend_API/media/container_app_edit_and_deploy.png new file mode 100644 index 0000000..eff03b0 Binary files /dev/null and b/vcore/11_Backend_API/media/container_app_edit_and_deploy.png differ diff --git a/vcore/11_Backend_API/media/container_app_failed_revision.png b/vcore/11_Backend_API/media/container_app_failed_revision.png new file mode 100644 index 0000000..e356ee4 Binary files /dev/null and b/vcore/11_Backend_API/media/container_app_failed_revision.png differ diff --git a/vcore/11_Backend_API/media/container_app_log_stream.png b/vcore/11_Backend_API/media/container_app_log_stream.png new file mode 100644 index 0000000..f894a72 Binary files /dev/null and b/vcore/11_Backend_API/media/container_app_log_stream.png differ diff --git a/vcore/11_Backend_API/media/container_app_overview.png b/vcore/11_Backend_API/media/container_app_overview.png new file mode 100644 index 0000000..4707a9c Binary files /dev/null and b/vcore/11_Backend_API/media/container_app_overview.png differ diff --git a/vcore/11_Backend_API/media/container_app_ready.png b/vcore/11_Backend_API/media/container_app_ready.png new file mode 100644 index 0000000..dfe5987 Binary files /dev/null and b/vcore/11_Backend_API/media/container_app_ready.png differ diff --git a/vcore/11_Backend_API/media/container_deploy.png b/vcore/11_Backend_API/media/container_deploy.png new file mode 100644 index 0000000..d5d6e71 Binary files /dev/null and b/vcore/11_Backend_API/media/container_deploy.png differ diff --git a/vcore/11_Backend_API/media/local_backend_docker_build.png b/vcore/11_Backend_API/media/local_backend_docker_build.png new file mode 100644 index 0000000..d763378 Binary files /dev/null and 
b/vcore/11_Backend_API/media/local_backend_docker_build.png differ diff --git a/vcore/11_Backend_API/media/local_backend_docker_push.png b/vcore/11_Backend_API/media/local_backend_docker_push.png new file mode 100644 index 0000000..c0ee855 Binary files /dev/null and b/vcore/11_Backend_API/media/local_backend_docker_push.png differ diff --git a/vcore/11_Backend_API/media/local_backend_docker_run.png b/vcore/11_Backend_API/media/local_backend_docker_run.png new file mode 100644 index 0000000..bd6a694 Binary files /dev/null and b/vcore/11_Backend_API/media/local_backend_docker_run.png differ diff --git a/vcore/11_Backend_API/media/local_backend_running_console.png b/vcore/11_Backend_API/media/local_backend_running_console.png new file mode 100644 index 0000000..d6ff9d6 Binary files /dev/null and b/vcore/11_Backend_API/media/local_backend_running_console.png differ diff --git a/vcore/11_Backend_API/media/local_backend_swagger_ui.png b/vcore/11_Backend_API/media/local_backend_swagger_ui.png new file mode 100644 index 0000000..54a92cd Binary files /dev/null and b/vcore/11_Backend_API/media/local_backend_swagger_ui.png differ diff --git a/vcore/11_Backend_API/media/local_backend_swagger_ui_ai_response.png b/vcore/11_Backend_API/media/local_backend_swagger_ui_ai_response.png new file mode 100644 index 0000000..1e02142 Binary files /dev/null and b/vcore/11_Backend_API/media/local_backend_swagger_ui_ai_response.png differ diff --git a/vcore/11_Backend_API/media/local_backend_swagger_ui_root_response.png b/vcore/11_Backend_API/media/local_backend_swagger_ui_root_response.png new file mode 100644 index 0000000..dbe8193 Binary files /dev/null and b/vcore/11_Backend_API/media/local_backend_swagger_ui_root_response.png differ diff --git a/12_User_Interface/README.md b/vcore/12_User_Interface/README.md similarity index 100% rename from 12_User_Interface/README.md rename to vcore/12_User_Interface/README.md diff --git a/12_User_Interface/images/2024-01-17-12-41-48.png b/vcore/12_User_Interface/images/2024-01-17-12-41-48.png similarity index 100% rename from 12_User_Interface/images/2024-01-17-12-41-48.png rename to vcore/12_User_Interface/images/2024-01-17-12-41-48.png diff --git a/12_User_Interface/images/2024-01-17-12-42-59.png b/vcore/12_User_Interface/images/2024-01-17-12-42-59.png similarity index 100% rename from 12_User_Interface/images/2024-01-17-12-42-59.png rename to vcore/12_User_Interface/images/2024-01-17-12-42-59.png diff --git a/12_User_Interface/images/2024-01-17-12-45-30.png b/vcore/12_User_Interface/images/2024-01-17-12-45-30.png similarity index 100% rename from 12_User_Interface/images/2024-01-17-12-45-30.png rename to vcore/12_User_Interface/images/2024-01-17-12-45-30.png diff --git a/12_User_Interface/images/2024-01-17-12-53-13.png b/vcore/12_User_Interface/images/2024-01-17-12-53-13.png similarity index 100% rename from 12_User_Interface/images/2024-01-17-12-53-13.png rename to vcore/12_User_Interface/images/2024-01-17-12-53-13.png diff --git a/13_Conclusion/README.md b/vcore/13_Conclusion/README.md similarity index 100% rename from 13_Conclusion/README.md rename to vcore/13_Conclusion/README.md diff --git a/Backend/.env.EXAMPLE b/vcore/Backend/.env.EXAMPLE similarity index 100% rename from Backend/.env.EXAMPLE rename to vcore/Backend/.env.EXAMPLE diff --git a/Backend/.gitignore b/vcore/Backend/.gitignore similarity index 100% rename from Backend/.gitignore rename to vcore/Backend/.gitignore diff --git a/vcore/Backend/DOCKERFILE b/vcore/Backend/DOCKERFILE new file mode 100644 index 
0000000..2919d22 --- /dev/null +++ b/vcore/Backend/DOCKERFILE @@ -0,0 +1,11 @@ +FROM python:3.11 + +WORKDIR /code +COPY ./requirements.txt /code/requirements.txt +RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt +COPY . /code + +EXPOSE 80 +ENV FORWARDED_ALLOW_IPS * + +CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80", "--forwarded-allow-ips", "*", "--proxy-headers"] diff --git a/vcore/Backend/README.md b/vcore/Backend/README.md new file mode 100644 index 0000000..77e80cf --- /dev/null +++ b/vcore/Backend/README.md @@ -0,0 +1 @@ +# cosmos-db-dev-guide-backend-app-python diff --git a/vcore/Backend/api_models/ai_request.py b/vcore/Backend/api_models/ai_request.py new file mode 100644 index 0000000..31899ea --- /dev/null +++ b/vcore/Backend/api_models/ai_request.py @@ -0,0 +1,13 @@ +""" +AIRequest model +""" +from pydantic import BaseModel + +class AIRequest(BaseModel): + """ + AIRequest model encapsulates the session_id + and incoming user prompt for the AI agent + to respond to. + """ + session_id: str + prompt: str diff --git a/Backend/app.py b/vcore/Backend/app.py similarity index 100% rename from Backend/app.py rename to vcore/Backend/app.py diff --git a/Backend/cosmic_works/cosmic_works_ai_agent.py b/vcore/Backend/cosmic_works/cosmic_works_ai_agent.py similarity index 100% rename from Backend/cosmic_works/cosmic_works_ai_agent.py rename to vcore/Backend/cosmic_works/cosmic_works_ai_agent.py diff --git a/Backend/requirements.txt b/vcore/Backend/requirements.txt similarity index 100% rename from Backend/requirements.txt rename to vcore/Backend/requirements.txt diff --git a/Labs/.env.EXAMPLE b/vcore/Labs/.env.EXAMPLE similarity index 100% rename from Labs/.env.EXAMPLE rename to vcore/Labs/.env.EXAMPLE diff --git a/vcore/Labs/.gitignore b/vcore/Labs/.gitignore new file mode 100644 index 0000000..1f26bf8 --- /dev/null +++ b/vcore/Labs/.gitignore @@ -0,0 +1,166 @@ +# Byte-compiled / optimized / DLL files +__pycache__/ +*.py[cod] +*$py.class + +# C extensions +*.so + +# Distribution / packaging +.Python +build/ +develop-eggs/ +dist/ +downloads/ +eggs/ +.eggs/ +lib/ +lib64/ +parts/ +sdist/ +var/ +wheels/ +share/python-wheels/ +*.egg-info/ +.installed.cfg +*.egg +MANIFEST + +# PyInstaller +# Usually these files are written by a python script from a template +# before PyInstaller builds the exe, so as to inject date/other infos into it. +*.manifest +*.spec + +# Installer logs +pip-log.txt +pip-delete-this-directory.txt + +# Unit test / coverage reports +htmlcov/ +.tox/ +.nox/ +.coverage +.coverage.* +.cache +nosetests.xml +coverage.xml +*.cover +*.py,cover +.hypothesis/ +.pytest_cache/ +cover/ + +# Translations +*.mo +*.pot + +# Django stuff: +*.log +local_settings.py +db.sqlite3 +db.sqlite3-journal + +# Flask stuff: +instance/ +.webassets-cache + +# Scrapy stuff: +.scrapy + +# Sphinx documentation +docs/_build/ + +# PyBuilder +.pybuilder/ +target/ + +# Jupyter Notebook +.ipynb_checkpoints + +# IPython +profile_default/ +ipython_config.py + +# pyenv +# For a library or package, you might want to ignore these files since the code is +# intended to run in multiple environments; otherwise, check them in: +# .python-version + +# pipenv +# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. +# However, in case of collaboration, if having platform-specific dependencies or dependencies +# having no cross-platform support, pipenv may install dependencies that don't work, or not +# install all needed dependencies. 
+#Pipfile.lock + +# poetry +# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. +# This is especially recommended for binary packages to ensure reproducibility, and is more +# commonly ignored for libraries. +# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control +#poetry.lock + +# pdm +# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. +#pdm.lock +# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it +# in version control. +# https://pdm.fming.dev/#use-with-ide +.pdm.toml + +# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm +__pypackages__/ + +# Celery stuff +celerybeat-schedule +celerybeat.pid + +# SageMath parsed files +*.sage.py + +# Environments +.env +.venv +env/ +venv/ +ENV/ +env.bak/ +venv.bak/ + +# Spyder project settings +.spyderproject +.spyproject + +# Rope project settings +.ropeproject + +# mkdocs documentation +/site + +# mypy +.mypy_cache/ +.dmypy.json +dmypy.json + +# Pyre type checker +.pyre/ + +# pytype static type analyzer +.pytype/ + +# Cython debug symbols +cython_debug/ + +# PyCharm +# JetBrains specific template is maintained in a separate JetBrains.gitignore that can +# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore +# and can be added to the global gitignore or merged into this file. For a more nuclear +# option (not recommended) you can uncomment the following to ignore the entire idea folder. +#.idea/ + + +# Node.js +node_modules/ + +.DS_Store diff --git a/Labs/README.md b/vcore/Labs/README.md similarity index 100% rename from Labs/README.md rename to vcore/Labs/README.md diff --git a/Labs/deploy/azuredeploy.bicep b/vcore/Labs/deploy/azuredeploy.bicep similarity index 99% rename from Labs/deploy/azuredeploy.bicep rename to vcore/Labs/deploy/azuredeploy.bicep index ee04c74..85c2559 100644 --- a/Labs/deploy/azuredeploy.bicep +++ b/vcore/Labs/deploy/azuredeploy.bicep @@ -2,7 +2,7 @@ Azure Cosmos DB + Azure OpenAI Python developer guide lab ****************************************************************** This Azure resource deployment template uses some of the following practices: -- [Abbrevation examples for Azure resources](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/azure-best-practices/resource-abbreviations) +- [Abbrevation examples for Azure resources](https://learn.microsoft.com/azure/cloud-adoption-framework/ready/azure-best-practices/resource-abbreviations) */ diff --git a/Labs/deploy/azuredeploy.parameters.json b/vcore/Labs/deploy/azuredeploy.parameters.json similarity index 100% rename from Labs/deploy/azuredeploy.parameters.json rename to vcore/Labs/deploy/azuredeploy.parameters.json diff --git a/Labs/deploy/deploy.md b/vcore/Labs/deploy/deploy.md similarity index 99% rename from Labs/deploy/deploy.md rename to vcore/Labs/deploy/deploy.md index 681548b..8749dda 100644 --- a/Labs/deploy/deploy.md +++ b/vcore/Labs/deploy/deploy.md @@ -15,6 +15,7 @@ Create a folder to house the repository. 
Open a terminal and navigate to the fol git clone https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide.git cd Azure-OpenAI-Python-Developer-Guide +cd vcore cd Labs cd deploy ``` diff --git a/Labs/deploy/images/editor-azuredeploy-parameters-json-password.png b/vcore/Labs/deploy/images/editor-azuredeploy-parameters-json-password.png similarity index 100% rename from Labs/deploy/images/editor-azuredeploy-parameters-json-password.png rename to vcore/Labs/deploy/images/editor-azuredeploy-parameters-json-password.png diff --git a/Labs/lab_0_explore_and_use_models.ipynb b/vcore/Labs/lab_0_explore_and_use_models.ipynb similarity index 100% rename from Labs/lab_0_explore_and_use_models.ipynb rename to vcore/Labs/lab_0_explore_and_use_models.ipynb diff --git a/Labs/lab_1_first_application.ipynb b/vcore/Labs/lab_1_first_application.ipynb similarity index 100% rename from Labs/lab_1_first_application.ipynb rename to vcore/Labs/lab_1_first_application.ipynb diff --git a/Labs/lab_2_load_data.ipynb b/vcore/Labs/lab_2_load_data.ipynb similarity index 100% rename from Labs/lab_2_load_data.ipynb rename to vcore/Labs/lab_2_load_data.ipynb diff --git a/Labs/lab_3_mongodb_vector_search.ipynb b/vcore/Labs/lab_3_mongodb_vector_search.ipynb similarity index 100% rename from Labs/lab_3_mongodb_vector_search.ipynb rename to vcore/Labs/lab_3_mongodb_vector_search.ipynb diff --git a/Labs/lab_4_langchain.ipynb b/vcore/Labs/lab_4_langchain.ipynb similarity index 100% rename from Labs/lab_4_langchain.ipynb rename to vcore/Labs/lab_4_langchain.ipynb diff --git a/Labs/lab_5_backend_api.md b/vcore/Labs/lab_5_backend_api.md similarity index 99% rename from Labs/lab_5_backend_api.md rename to vcore/Labs/lab_5_backend_api.md index cc3b377..1d648aa 100644 --- a/Labs/lab_5_backend_api.md +++ b/vcore/Labs/lab_5_backend_api.md @@ -6,7 +6,7 @@ The information provided in this section assumes that the dependent infrastructu ## Overview -The backend api is a Python FastAPI application that will expose endpoints for the frontend application to interact with. The backend api is a containerized application that will be deployed to [Azure Container Apps](https://learn.microsoft.com/en-us/azure/container-apps/overview). +The backend api is a Python FastAPI application that will expose endpoints for the frontend application to interact with. The backend api is a containerized application that will be deployed to [Azure Container Apps](https://learn.microsoft.com/azure/container-apps/overview). ## Clone the Backend API @@ -16,6 +16,7 @@ Create a folder to house the repository. 
Open a terminal and navigate to the fol git clone https://github.com/AzureCosmosDB/Azure-OpenAI-Python-Developer-Guide.git cd Azure-OpenAI-Python-Developer-Guide +cd vcore cd Backend ``` diff --git a/vcore/Labs/media/2024-01-06-20-01-38.png b/vcore/Labs/media/2024-01-06-20-01-38.png new file mode 100644 index 0000000..bba2c12 Binary files /dev/null and b/vcore/Labs/media/2024-01-06-20-01-38.png differ diff --git a/Labs/media/acr_access_keys.png b/vcore/Labs/media/acr_access_keys.png similarity index 100% rename from Labs/media/acr_access_keys.png rename to vcore/Labs/media/acr_access_keys.png diff --git a/Labs/media/container_app_delete_hello_world.png b/vcore/Labs/media/container_app_delete_hello_world.png similarity index 100% rename from Labs/media/container_app_delete_hello_world.png rename to vcore/Labs/media/container_app_delete_hello_world.png diff --git a/Labs/media/container_app_edit_and_deploy.png b/vcore/Labs/media/container_app_edit_and_deploy.png similarity index 100% rename from Labs/media/container_app_edit_and_deploy.png rename to vcore/Labs/media/container_app_edit_and_deploy.png diff --git a/Labs/media/container_app_failed_revision.png b/vcore/Labs/media/container_app_failed_revision.png similarity index 100% rename from Labs/media/container_app_failed_revision.png rename to vcore/Labs/media/container_app_failed_revision.png diff --git a/Labs/media/container_app_log_stream.png b/vcore/Labs/media/container_app_log_stream.png similarity index 100% rename from Labs/media/container_app_log_stream.png rename to vcore/Labs/media/container_app_log_stream.png diff --git a/Labs/media/container_app_overview.png b/vcore/Labs/media/container_app_overview.png similarity index 100% rename from Labs/media/container_app_overview.png rename to vcore/Labs/media/container_app_overview.png diff --git a/Labs/media/container_app_ready.png b/vcore/Labs/media/container_app_ready.png similarity index 100% rename from Labs/media/container_app_ready.png rename to vcore/Labs/media/container_app_ready.png diff --git a/Labs/media/container_deploy.png b/vcore/Labs/media/container_deploy.png similarity index 100% rename from Labs/media/container_deploy.png rename to vcore/Labs/media/container_deploy.png diff --git a/Labs/media/local_backend_docker_build.png b/vcore/Labs/media/local_backend_docker_build.png similarity index 100% rename from Labs/media/local_backend_docker_build.png rename to vcore/Labs/media/local_backend_docker_build.png diff --git a/Labs/media/local_backend_docker_push.png b/vcore/Labs/media/local_backend_docker_push.png similarity index 100% rename from Labs/media/local_backend_docker_push.png rename to vcore/Labs/media/local_backend_docker_push.png diff --git a/Labs/media/local_backend_docker_run.png b/vcore/Labs/media/local_backend_docker_run.png similarity index 100% rename from Labs/media/local_backend_docker_run.png rename to vcore/Labs/media/local_backend_docker_run.png diff --git a/Labs/media/local_backend_running_console.png b/vcore/Labs/media/local_backend_running_console.png similarity index 100% rename from Labs/media/local_backend_running_console.png rename to vcore/Labs/media/local_backend_running_console.png diff --git a/Labs/media/local_backend_swagger_ui.png b/vcore/Labs/media/local_backend_swagger_ui.png similarity index 100% rename from Labs/media/local_backend_swagger_ui.png rename to vcore/Labs/media/local_backend_swagger_ui.png diff --git a/vcore/Labs/media/local_backend_swagger_ui_ai_response.png b/vcore/Labs/media/local_backend_swagger_ui_ai_response.png new 
file mode 100644 index 0000000..4bf698b Binary files /dev/null and b/vcore/Labs/media/local_backend_swagger_ui_ai_response.png differ diff --git a/vcore/Labs/media/local_backend_swagger_ui_root_response.png b/vcore/Labs/media/local_backend_swagger_ui_root_response.png new file mode 100644 index 0000000..6c42065 Binary files /dev/null and b/vcore/Labs/media/local_backend_swagger_ui_root_response.png differ diff --git a/vcore/Labs/models/__init__.py b/vcore/Labs/models/__init__.py new file mode 100644 index 0000000..a841b4e --- /dev/null +++ b/vcore/Labs/models/__init__.py @@ -0,0 +1,11 @@ +""" +This module contains the model definitions of objects +that are present in the Cosmic Works dataset. +""" +from .tag import Tag +from .product import Product, ProductList +from .address import Address +from .password import Password +from .customer import Customer, CustomerList +from .sales_order_detail import SalesOrderDetail +from .sales_order import SalesOrder, SalesOrderList diff --git a/vcore/Labs/models/address.py b/vcore/Labs/models/address.py new file mode 100644 index 0000000..45b21a7 --- /dev/null +++ b/vcore/Labs/models/address.py @@ -0,0 +1,16 @@ +""" +Address model +""" +from pydantic import BaseModel, Field + +class Address(BaseModel): + """ + The Address class represents the structure of + an address in the Cosmic Works dataset. + """ + address_line_1: str = Field(alias="addressLine1") + address_line_2: str = Field(alias="addressLine2") + city: str + state: str + country: str + zip_code: str = Field(alias="zipCode") diff --git a/Labs/models/customer.py b/vcore/Labs/models/customer.py similarity index 100% rename from Labs/models/customer.py rename to vcore/Labs/models/customer.py diff --git a/vcore/Labs/models/password.py b/vcore/Labs/models/password.py new file mode 100644 index 0000000..f7836f5 --- /dev/null +++ b/vcore/Labs/models/password.py @@ -0,0 +1,12 @@ +""" +Password model +""" +from pydantic import BaseModel + +class Password(BaseModel): + """ + The Password class represents the structure of + a password in the Cosmic Works dataset. + """ + hash: str + salt: str diff --git a/Labs/models/product.py b/vcore/Labs/models/product.py similarity index 100% rename from Labs/models/product.py rename to vcore/Labs/models/product.py diff --git a/Labs/models/sales_order.py b/vcore/Labs/models/sales_order.py similarity index 100% rename from Labs/models/sales_order.py rename to vcore/Labs/models/sales_order.py diff --git a/vcore/Labs/models/sales_order_detail.py b/vcore/Labs/models/sales_order_detail.py new file mode 100644 index 0000000..97ac275 --- /dev/null +++ b/vcore/Labs/models/sales_order_detail.py @@ -0,0 +1,14 @@ +""" +SalesOrderDetail model +""" +from pydantic import BaseModel + +class SalesOrderDetail(BaseModel): + """ + The SalesOrderDetail class represents invoice line items + for the Sales Order in the Cosmic Works dataset. + """ + sku: str + name: str + price: float + quantity: int diff --git a/vcore/Labs/models/tag.py b/vcore/Labs/models/tag.py new file mode 100644 index 0000000..c208693 --- /dev/null +++ b/vcore/Labs/models/tag.py @@ -0,0 +1,23 @@ +""" +Tag model +""" +from pydantic import BaseModel, Field + +class Tag(BaseModel): + """ + The Tag class represents a tag in the + Cosmic Works dataset. + + Tags are metadata associated with a product. + """ + id: str = Field(default=None, alias="_id") + name: str + + class Config: + """ + The Config inner class is used to configure the + behavior of the Pydantic model. 
In this case, + the Pydantic model will be able to deserialize + data by both the field name and the field alias. + """ + populate_by_name = True diff --git a/Labs/requirements.txt b/vcore/Labs/requirements.txt similarity index 100% rename from Labs/requirements.txt rename to vcore/Labs/requirements.txt diff --git a/vcore/README.md b/vcore/README.md new file mode 100644 index 0000000..f8b9ec7 --- /dev/null +++ b/vcore/README.md @@ -0,0 +1,18 @@ +# Azure Cosmos DB for MongoDB (vCore) + Azure OpenAI Python Developer Guide + +1. [Introduction](00_Introduction/README.md) +1. [Azure Overview](01_Azure_Overview/README.md) +1. [Overview of Azure Cosmos DB](02_Overview_Cosmos_DB/README.md) +1. [Overview of Azure OpenAI](03_Overview_Azure_OpenAI/README.md) +1. [Overview of AI Concepts](04_Overview_AI_Concepts/README.md) +1. [Explore the Azure OpenAI models and endpoints (console app)](05_Explore_OpenAI_models/README.md) +1. [Provision Azure resources](06_Provision_Azure_Resources/README.md) +1. [Create your first Cosmos DB project](07_Create_First_Cosmos_DB_Project/README.md) +1. [Load data into Azure Cosmos DB API for MongoDB](08_Load_Data/README.md) +1. [Use vector search on embeddings in vCore-based Azure Cosmos DB for MongoDB](09_Vector_Search_Cosmos_DB/README.md) +1. [LangChain](10_LangChain/README.md) +1. [Backend API](11_Backend_API/README.md) +1. [Connect the chat user interface with the chatbot API](12_User_Interface/README.md) +1. [Conclusion](13_Conclusion/README.md) + +![Azure Cosmos DB for MongoDB (vCore) + Azure OpenAI Python Developer Guide Architecture Diagram](06_Provision_Azure_Resources/media/architecture.jpg) diff --git a/assets/Graphics.pptx b/vcore/assets/Graphics.pptx similarity index 100% rename from assets/Graphics.pptx rename to vcore/assets/Graphics.pptx diff --git a/assets/architecture.pptx b/vcore/assets/architecture.pptx similarity index 100% rename from assets/architecture.pptx rename to vcore/assets/architecture.pptx