Deploying a Chat Model with Azure Machine Learning Online Endpoints

Article link: https://blog.youkuaiyun.com/vaidfl/article/details/146437761

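Azure Machine Learning is a platform for building, training, and deploying machine learning models. This article focuses on using Azure Machine Learning online endpoints to serve a chat model and call it for real-time inference.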

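Technical Background

Azure Machine Learning is an end-to-end machine learning platform that supports building and deploying many kinds of models. When a model is deployed so that its predictions (inference) can be consumed, online endpoints are the central building block. They decouple the interface of a production workload from the implementation that serves it, so the model behind the endpoint can be updated or scaled without changing how clients call it.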

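Core Concepts

Online endpoints are built on two key concepts:

Endpoints: an endpoint provides a single, stable interface that receives input requests and returns output responses.
Deployments: a deployment describes how the service is implemented, including the model and the resource configuration it runs on.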
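
To make the split between interface and implementation concrete, here is a minimal conceptual sketch in plain Python. It does not use the Azure SDK; the `Endpoint` and `Deployment` classes, the "blue"/"green" names, and the percentage-based traffic split are all assumptions made for illustration. The idea is only that callers talk to one endpoint while the deployments behind it can be swapped.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Deployment:
    """Implementation side: a model plus its resource configuration."""
    name: str
    predict: Callable[[str], str]            # stand-in for the real model
    instance_type: str = "hypothetical-sku"  # placeholder, not a real SKU


@dataclass
class Endpoint:
    """Interface side: a stable entry point that routes to deployments."""
    name: str
    deployments: Dict[str, Deployment] = field(default_factory=dict)
    traffic: Dict[str, int] = field(default_factory=dict)  # percent per deployment

    def add(self, deployment: Deployment, percent: int) -> None:
        self.deployments[deployment.name] = deployment
        self.traffic[deployment.name] = percent

    def score(self, prompt: str, rng: random.Random) -> str:
        # Pick a deployment in proportion to its traffic share.
        names = list(self.traffic)
        weights = [self.traffic[n] for n in names]
        chosen = rng.choices(names, weights=weights, k=1)[0]
        return self.deployments[chosen].predict(prompt)


endpoint = Endpoint(name="chat-endpoint")
endpoint.add(Deployment("blue", lambda p: f"v1 reply to: {p}"), percent=100)
endpoint.add(Deployment("green", lambda p: f"v2 reply to: {p}"), percent=0)
print(endpoint.score("hello", random.Random(0)))  # served by "blue"

# Shift all traffic to the new deployment; the caller's code does not change.
endpoint.traffic = {"blue": 0, "green": 100}
print(endpoint.score("hello", random.Random(0)))  # served by "green"
```

The caller only ever invokes `endpoint.score(...)`; which model answers is decided by the traffic table, which is what lets a deployment be replaced without touching client code.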

Code Walkthrough

The example below shows how to connect to a chat model served behind an online endpoint and invoke it for inference.

from langchain_community.chat_models.azureml_endpoint import (
    AzureMLChatOnlineEndpoint,
    AzureMLEndpointApiType,
    CustomOpenAIChatContentFormatter
)
from langchain_core.messages import HumanMessage

# Create an Azure ML chat online endpoint client, specifying the endpoint URL and API type
chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<your-endpoint>.<your_region>.inference.ml.azure.com/score",
    endpoint_api_type=AzureMLEndpointApiType.dedicated,
    endpoint_api_key="your-endpoint-api-key",
    content_formatter=CustomOpenAIChatContentFormatter()
)

# Invoke the chat model and capture the response
response = chat.invoke(
    [HumanMessage(content="Will the Collatz conjecture ever be solved?")]
)

# Print the response message
print(response)

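This example uses the dedicated API type, which targets a model deployed to its own dedicated compute and is called through the /score path. If you use a serverless endpoint instead, change the API type and the URL path as follows: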

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<your-endpoint>.<your_region>.inference.ml.azure.com/v1/chat/completions",
    endpoint_api_type=AzureMLEndpointApiType.serverless,
    endpoint_api_key="your-endpoint-api-key",
    content_formatter=CustomOpenAIChatContentFormatter(),
)
response = chat.invoke(
    [HumanMessage(content="Will the Collatz conjecture ever be solved?")]
)
print(response)
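
The two snippets above differ only in the URL path and the API type, and both hard-code the API key. As a minimal sketch, the helpers below build the URL from its parts and read the key from an environment variable. The function names, the `PATH_BY_API_TYPE` table, and the environment variable name are assumptions introduced for this example; the URL pattern is simply the one used in the snippets above, so in practice copy the real URL from your endpoint's details page.

```python
import os

# URL path suffixes taken from the two examples above.
PATH_BY_API_TYPE = {
    "dedicated": "/score",
    "serverless": "/v1/chat/completions",
}


def build_endpoint_url(endpoint_name: str, region: str, api_type: str) -> str:
    """Build a scoring URL following the pattern used in this article."""
    if api_type not in PATH_BY_API_TYPE:
        raise ValueError(f"unknown API type: {api_type!r}")
    path = PATH_BY_API_TYPE[api_type]
    return f"https://{endpoint_name}.{region}.inference.ml.azure.com{path}"


def load_api_key(env_var: str = "AZUREML_ENDPOINT_API_KEY") -> str:
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return key


# Example with placeholder values ("my-endpoint" and "eastus" are hypothetical).
print(build_endpoint_url("my-endpoint", "eastus", "dedicated"))
print(build_endpoint_url("my-endpoint", "eastus", "serverless"))
```

The returned URL and key can then be passed as `endpoint_url` and `endpoint_api_key` when constructing `AzureMLChatOnlineEndpoint`, which keeps secrets out of source control.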