This reference implementation illustrates a basic approach for authoring and running a chat application in a single region with Azure Machine Learning and Azure OpenAI. It supports the Basic Azure OpenAI end-to-end chat reference architecture.
The implementation takes advantage of Prompt flow in Azure Machine Learning to build and deploy flows that can link the following actions required by a generative AI chat application:
- Creating prompts
- Querying data stores for grounding data
- Executing custom Python code
- Calling language models (such as GPT models)
The reference implementation illustrates a basic example of a chat application. For a reference implementation that addresses enterprise requirements, see the OpenAI end-to-end baseline reference implementation.
The implementation covers the following scenarios:
- Authoring a flow - Using Prompt flow in an Azure Machine Learning workspace to author a flow.
- Deploying a flow - The client UI, hosted in Azure App Service, accesses the Azure OpenAI Service through an Azure Machine Learning managed online endpoint.
The Azure Machine Learning deployment architecture diagram illustrates how a front-end web application connects to a managed online endpoint.
Follow these instructions to deploy this example to your Azure subscription, try out what you've deployed, and learn how to clean up those resources.
- An Azure subscription with the following resource providers registered (a registration sketch follows this list):
  - Microsoft.AlertsManagement
  - Microsoft.CognitiveServices
  - Microsoft.ContainerRegistry
  - Microsoft.KeyVault
  - Microsoft.Insights
  - Microsoft.MachineLearningServices
  - Microsoft.ManagedIdentity
  - Microsoft.OperationalInsights
  - Microsoft.Storage
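  If any of these providers aren't registered yet, a minimal Azure CLI sketch like the following can register them (an optional convenience, assuming you're already logged in with `az login`):

  ```bash
  # Register the required resource providers; the command is idempotent and safe to re-run.
  for ns in Microsoft.AlertsManagement Microsoft.CognitiveServices Microsoft.ContainerRegistry \
            Microsoft.KeyVault Microsoft.Insights Microsoft.MachineLearningServices \
            Microsoft.ManagedIdentity Microsoft.OperationalInsights Microsoft.Storage; do
    az provider register --namespace "$ns"
  done
  ```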
- Your user account has permission to assign Azure roles (for example, by holding the User Access Administrator or Owner role).
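  One optional way to confirm this is to list your role assignments at the subscription scope:

  ```bash
  # Show the roles assigned to the signed-in user; look for 'Owner' or 'User Access Administrator'.
  az role assignment list \
    --assignee "$(az ad signed-in-user show --query id -o tsv)" \
    --query "[].roleDefinitionName" -o tsv
  ```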
The following steps are required to deploy the infrastructure from the command line.
- In your shell, clone this repo and navigate to the root directory of this repository.

  ```bash
  git clone https://github.com/Azure-Samples/openai-end-to-end-basic
  cd openai-end-to-end-basic
  ```
- Log in and set your subscription.

  ```bash
  az login
  az account set --subscription xxxxx
  ```
- Create a resource group and deploy the infrastructure.

  ```bash
  LOCATION=eastus
  BASE_NAME=<base resource name, between 6 and 8 lowercase characters, most resource names will include this text>

  RESOURCE_GROUP=rg-chat-basic-${LOCATION}
  az group create -l $LOCATION -n $RESOURCE_GROUP

  # This takes about 10 minutes to run.
  az deployment group create -f ./infra-as-code/bicep/main.bicep \
    -g $RESOURCE_GROUP \
    -p baseName=${BASE_NAME}
  ```
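  Optionally, once the deployment completes, you can list what was created to confirm it succeeded:

  ```bash
  # List the resources created by the Bicep deployment.
  az resource list --resource-group $RESOURCE_GROUP -o table
  ```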
- Assign your account the 'Cognitive Services OpenAI User' role on the Azure OpenAI instance. This is required to interact with the Azure OpenAI Service via the Machine Learning workspace.
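  If you prefer the CLI over the portal, a sketch like the following assigns the role. The resource name oai-${BASE_NAME} is an assumption, so confirm the actual Azure OpenAI account name in your resource group first.

  ```bash
  # Assumed resource name: oai-${BASE_NAME}. Adjust to the name produced by your deployment.
  PRINCIPAL_ID=$(az ad signed-in-user show --query id -o tsv)
  OPENAI_ID=$(az cognitiveservices account show -g $RESOURCE_GROUP -n oai-${BASE_NAME} --query id -o tsv)

  az role assignment create \
    --assignee $PRINCIPAL_ID \
    --role "Cognitive Services OpenAI User" \
    --scope $OPENAI_ID
  ```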
- Open the Machine Learning Workspace and choose your workspace. Ensure you have enabled Prompt flow in your Azure Machine Learning workspace.
- Create a prompt flow connection to your gpt35 Azure OpenAI deployment. This will be used by the prompt flow you clone in the next step.
  - Click on 'Connections' in the left navigation in Machine Learning Studio
  - Click the 'Create' button
  - Click 'Azure OpenAI Service'
  - Your Azure OpenAI instance should be displayed. Change the Authentication method to 'Microsoft Entra ID'
  - Click the 'Add connection' button
- Clone an existing prompt flow
  - Click on 'Prompt flow' in the left navigation in Machine Learning Studio
  - Click on the 'Flows' tab and click 'Create'
  - Click 'Clone' under 'Chat with Wikipedia'
  - Name it 'chat_wiki' and press 'Clone'
- Connect the Prompt flow to your Azure OpenAI instance
  - For extract_query_from_question:
    - For 'Connection', select your Azure OpenAI instance from the dropdown menu
    - For 'deployment_name', select 'gpt35' from the dropdown menu
    - For 'response_format', select '{"type":"text"}' from the dropdown menu
  - For augmented_chat:
    - For 'Connection', select your Azure OpenAI instance from the dropdown menu
    - For 'deployment_name', select 'gpt35' from the dropdown menu
    - For 'response_format', select '{"type":"text"}' from the dropdown menu
  - Click 'Save' to save your changes
- Test the flow
  - Click 'Start compute session' (this may take around 5 minutes)
  - Click 'Chat' on the UI
  - Enter a question
  - A response to your question should appear on the UI
- Create a deployment in the UI
  - Click on 'Deploy' in the UI
  - Choose 'Existing' endpoint and select the one called ept-<basename>
  - Choose a small virtual machine size for testing and set the number of instances
  - Click 'Review + Create'
  - Click 'Create'
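Once the deployment is provisioned, you can optionally smoke test the managed online endpoint from the CLI before trying the front end. This is a sketch, not part of the documented steps: the workspace name (mlw-${BASE_NAME}), the endpoint name (ept-${BASE_NAME}), and the request fields (question, chat_history) are assumptions based on this deployment and the sample flow's default inputs, and the Azure CLI ml extension is required.

```bash
# Build a sample request matching the assumed flow inputs.
cat > sample-request.json <<'EOF'
{"question": "Who was Ada Lovelace?", "chat_history": []}
EOF

# Invoke the managed online endpoint (names assumed; adjust to your deployment).
az ml online-endpoint invoke \
  --resource-group $RESOURCE_GROUP \
  --workspace-name mlw-${BASE_NAME} \
  --name ept-${BASE_NAME} \
  --request-file sample-request.json
```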
Finally, deploy the chat UI web application to App Service. This architecture uses run from zip file deployment in App Service, an approach with many benefits, including eliminating file lock conflicts when deploying.
```bash
APPSERVICE_NAME=app-$BASE_NAME

az webapp deploy --resource-group $RESOURCE_GROUP --name $APPSERVICE_NAME --type zip --src-url https://raw.githubusercontent.com/Azure-Samples/openai-end-to-end-basic/main/website/chatui.zip
```
After the deployment is complete, you can try the deployed application by navigating to the App Service URL in a web browser. Once there, ask your solution a question, ideally one that involves recent data or events, something that would only be answerable if the RAG process had retrieved content from Wikipedia.
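If you need the URL, one way to print it (using the variables defined earlier in these steps) is:

```bash
# Print the public URL of the deployed chat UI.
echo "https://$(az webapp show -g $RESOURCE_GROUP -n $APPSERVICE_NAME --query defaultHostName -o tsv)"
```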
Most of the Azure resources deployed in the prior steps incur ongoing charges unless removed. In addition, some of them go into a soft-delete state after deletion and must be purged separately; Key Vault is purged in the example below, and the Azure OpenAI account and the Azure Machine Learning workspace should also be purged once you're done exploring.
```bash
az group delete --name $RESOURCE_GROUP -y

# Purge the soft delete resources
az keyvault purge -n kv-${BASE_NAME}
```
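To purge the soft-deleted Azure OpenAI account as well, a sketch like the following works once the resource group has been deleted; the account name oai-${BASE_NAME} is an assumption, so substitute the name your deployment actually used.

```bash
# Purge the soft-deleted Azure OpenAI account (name assumed to be oai-${BASE_NAME}).
az cognitiveservices account purge -l $LOCATION -g $RESOURCE_GROUP -n oai-${BASE_NAME}
```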
Please see our Contributor guide.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
With ❤️ from Azure Patterns & Practices, Azure Architecture Center.