.NET in Practice: An Intelligent Recipe Search App Built on Azure Cosmos DB and OpenAI

Article link: https://blog.youkuaiyun.com/gitblog_01197/article/details/148525039

Project repository: docs ("This repository contains .NET Documentation."), https://gitcode.com/gh_mirrors/docs2/docs

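Introduction

A common question when building AI applications is how to combine a large language model (LLM) with private enterprise data. This article shows how to implement the RAG (retrieval-augmented generation) pattern in a .NET application, combining the vector search capability of Azure Cosmos DB with OpenAI language models to build an intelligent recipe search app.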

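Technical Architecture Overview

The application uses the following core components:

- Azure Cosmos DB for MongoDB vCore: stores and retrieves the vector data
- Azure OpenAI service: provides text embedding and chat completion
- .NET 8: the application development framework

The application workflow has four key stages:

1. Data preparation and upload
2. Vectorization and index creation
3. Vector similarity search
4. AI-augmented response generation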

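Prerequisites

Before you begin, make sure you have the following:

- The .NET 8.0 SDK installed
- An active Azure account
- An Azure Cosmos DB for MongoDB vCore resource
- An Azure OpenAI resource with these model deployments:
  - text-embedding-ada-002 (generates the embedding vectors)
  - gpt-35-turbo (generates the chat responses)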

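Application Features in Detail

1. Data Upload Module

After the application starts, the recipe data must first be uploaded to Cosmos DB. Each recipe is stored as a JSON document containing key information such as the recipe name, ingredients, and steps.

Core code walkthrough: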

// Parse the local JSON files and convert them into a list of Recipe objects
// (JsonConvert comes from the Newtonsoft.Json package)
public static List<Recipe> ParseDocuments(string folderPath)
{
    List<Recipe> recipes = new List<Recipe>();

    foreach (string file in Directory.GetFiles(folderPath))
    {
        string jsonString = System.IO.File.ReadAllText(file);
        Recipe recipe = JsonConvert.DeserializeObject<Recipe>(jsonString);
        // Derive the document id from the recipe name: lowercase, spaces removed
        recipe.id = recipe.name.ToLower().Replace(" ", "");
        recipes.Add(recipe);
    }
    return recipes;
}
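To make the id derivation concrete, here is a minimal, self-contained sketch. It uses System.Text.Json instead of the Newtonsoft.Json call above so that it runs without extra packages. The `ingredients` and `instructions` fields are assumptions for illustration; the article only says that a recipe holds a name, ingredients, and steps, and `RecipeParser` is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical recipe shape; only `id` and `name` appear in the article's code.
public class Recipe
{
    public string id { get; set; } = "";
    public string name { get; set; } = "";
    public List<string> ingredients { get; set; } = new();
    public List<string> instructions { get; set; } = new();
}

public static class RecipeParser
{
    // Deserialize one JSON document and derive its id from the name,
    // mirroring ParseDocuments: lowercase, then remove spaces.
    public static Recipe Parse(string json)
    {
        Recipe recipe = JsonSerializer.Deserialize<Recipe>(json)
            ?? throw new ArgumentException("JSON did not contain a recipe object.");
        recipe.id = recipe.name.ToLower().Replace(" ", "");
        return recipe;
    }
}
```

For a document whose name is "Chicken Tikka Masala", `RecipeParser.Parse` sets the id to "chickentikkamasala". Two recipes whose names differ only in case or spacing would get the same id, so with an upsert the later one would overwrite the earlier one.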

The upload logic uses an upsert: the document is inserted if it does not exist and replaced if it does:

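public async Task UpsertVectorAsync(Recipe recipe)
{
    BsonDocument document = recipe.ToBsonDocument();
    // Validate that the document has an id
    if (!document.Contains("_id"))
    {
        throw new ArgumentException("UpsertVectorAsync: document does not contain _id.");
    }
    string? _idValue = document["_id"].ToString();

    try {
        // Match on _id; IsUpsert inserts the document when no match is found
        var filter = Builders<BsonDocument>.Filter.Eq("_id", _idValue);
        var options = new ReplaceOptions { IsUpsert = true };
        await _recipeCollection.ReplaceOneAsync(filter, document, options);
    }
    catch (Exception ex) { /* exception handling */ throw; }
}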

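2. Vectorization and Index Creation

The raw text must be converted into a vector representation before it can support similarity search. Here the OpenAI embedding model converts the recipe content into a 1536-dimensional vector.

Core code for generating the vector: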

public async Task<float[]?> GetEmbeddingsAsync(dynamic data)
{
    try {
        EmbeddingsOptions options = new EmbeddingsOptions(data) {
            Input = data
        };

        // Call the embedding deployment and take the first (only) result
        var response = await _openAIClient.GetEmbeddingsAsync(
            openAIEmbeddingDeployment, options);

        return response.Value.Data[0].Embedding.ToArray();
    }
    catch (Exception ex) { /* exception handling */ return null; }
}

Create a vector index to support efficient search:

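public void CreateVectorIndexIfNotExists(string vectorIndexName)
{
    // Check whether the index already exists on the collection
    using IAsyncCursor<BsonDocument> indexCursor = _recipeCollection.Indexes.List();
    bool vectorIndexExists = indexCursor.ToList()
        .Any(x => x["name"] == vectorIndexName);

    if (!vectorIndexExists) {
        // Create the vector index with the IVF (inverted file) algorithm.
        // The collection and index names are hard-coded in this command, so
        // vectorIndexName should be 'vectorSearchIndex' for the check above to match.
        BsonDocumentCommand<BsonDocument> command = new BsonDocumentCommand<BsonDocument>(
            BsonDocument.Parse(@"
                { createIndexes: 'Recipe',
                  indexes: [{
                    name: 'vectorSearchIndex',
                    key: { embedding: 'cosmosSearch' },
                    cosmosSearchOptions: {
                        kind: 'vector-ivf',
                        numLists: 5,
                        similarity: 'COS',
                        dimensions: 1536 }
                  }]
                }"));

        // Run the create-index command
        BsonDocument result = _database.RunCommand(command);
    }
}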
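The `similarity: 'COS'` option selects cosine similarity as the ranking measure. As a reference point, the sketch below computes cosine similarity directly and does an exact, brute-force top-k over an in-memory dictionary. It is not how Cosmos DB evaluates the query: an IVF index groups vectors into clusters (`numLists`) and scans only some of them, trading a little accuracy for speed. The names `VectorMath`, `CosineSimilarity`, and `TopK` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), a value in [-1, 1].
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0; // convention chosen here for zero vectors
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Exact top-k: score every stored vector against the query and keep the best k.
    public static List<(string Id, double Score)> TopK(
        float[] query, IReadOnlyDictionary<string, float[]> stored, int k)
    {
        return stored
            .Select(kv => (Id: kv.Key, Score: CosineSimilarity(query, kv.Value)))
            .OrderByDescending(x => x.Score)
            .Take(k)
            .ToList();
    }
}
```

A brute-force scan like this costs one similarity computation per stored document per query, which is the cost the vector index is meant to avoid on larger collections.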

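3. Vector Search Implementation

When the user enters a query, the query text is first converted into a vector, and then a vector similarity search is run in Cosmos DB: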

public async Task<List<Recipe>> VectorSearchAsync(float[] queryVector)
{
    try {
        // Build the aggregation pipeline
        BsonDocument[] pipeline = new BsonDocument[] {
            BsonDocument.Parse(@$"{{$search: {{
                    cosmosSearch: {{
                        vector: [{string.Join(',', queryVector)}],
                        path: 'embedding',
                        k: {_maxVectorSearchResults}
                    }},
                    returnStoredSource: true
                }}
            }}"),
            // Drop the embedding field from the returned documents
            BsonDocument.Parse($"{{$project: {{embedding: 0}}}}"),
        };

        // Run the search
        var bsonDocuments = await _recipeCollection
            .Aggregate<BsonDocument>(pipeline).ToListAsync();

        // Convert the results into a list of Recipe objects
        return bsonDocuments.ConvertAll(bsonDocument =>
            BsonSerializer.Deserialize<Recipe>(bsonDocument));
    }
    catch (MongoException ex) { /* exception handling */ return new List<Recipe>(); }
}
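One detail to watch when a vector is pasted into a JSON string: `string.Join` calls `ToString()` on each float using the current culture, so on a machine whose culture uses a decimal comma the values would be written as `0,5` and the vector would be corrupted. The sketch below shows one way to build the same `$search` stage text with culture-invariant formatting. `SearchStageBuilder` is a hypothetical helper, not part of the original sample.

```csharp
using System.Globalization;
using System.Linq;

public static class SearchStageBuilder
{
    // Build the JSON text of the $search stage. Each float is formatted with
    // the invariant culture so the decimal separator is always '.'.
    public static string Build(float[] queryVector, int k)
    {
        string vector = string.Join(",",
            queryVector.Select(v => v.ToString(CultureInfo.InvariantCulture)));

        return "{ $search: { cosmosSearch: { vector: [" + vector +
               "], path: 'embedding', k: " + k.ToString(CultureInfo.InvariantCulture) +
               " }, returnStoredSource: true } }";
    }
}
```

The returned string can be passed to `BsonDocument.Parse` in place of the interpolated literal. Building the stage as a `BsonDocument` with a `BsonArray` for the vector would avoid string formatting entirely.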

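4. AI-Augmented Response Generation

Once the relevant recipes have been found, a GPT model generates a response that is more natural and better matched to what the user asked: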

public async Task<(string response, int promptTokens, int responseTokens)>
    GetChatCompletionAsync(string userPrompt, string documents)
{
    try {
        // Build the system prompt (with the retrieved documents appended) and the user message
        ChatMessage systemMessage = new ChatMessage(
            ChatRole.System, _systemPromptRecipeAssistant + documents);
        ChatMessage userMessage = new ChatMessage(
            ChatRole.User, userPrompt);

        // Configure the completion options
        ChatCompletionsOptions options = new() {
            Messages = { systemMessage, userMessage },
            MaxTokens = openAIMaxTokens,
            Temperature = 0.5f,
            NucleusSamplingFactor = 0.95f
        };

        // Request the completion
        var completions = await _openAIClient
            .GetChatCompletionsAsync(openAICompletionDeployment, options);

        return (
            response: completions.Value.Choices[0].Message.Content,
            promptTokens: completions.Value.Usage.PromptTokens,
            responseTokens: completions.Value.Usage.CompletionTokens
        );
    }
    catch (Exception ex) { /* exception handling */ throw; }
}

The system prompt constrains both the format and the content of the response:

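private readonly string _systemPromptRecipeAssistant = @"
    You are an intelligent assistant for Contoso Recipes.
    Your task is to answer user questions about recipes and cooking steps,
    based on the recipe data provided in JSON format below.

    Instructions:
    - Only answer questions related to the recipes provided
    - Do not reference any recipe that is not provided
    - If you are unsure of an answer, say ""I don't know"" and suggest that the user search on their own
    - Your response should be complete and detailed
    - List the recipe name first, then explain the cooking method step by step
    - Assume the user is not an expert in cooking
    - Format the content so that it suits command-line display
    - If you find more than one relevant recipe, let the user pick the most appropriate one";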
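The article does not show how the `documents` argument is produced. A plausible approach, sketched below as an assumption, is to serialize each retrieved recipe to JSON and join the results into one string, which is then appended to the system prompt exactly as `GetChatCompletionAsync` does. `PromptBuilder` and its methods are hypothetical names, and System.Text.Json is used here only to keep the sketch free of external packages.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class PromptBuilder
{
    // Serialize each retrieved recipe to JSON and join them with commas.
    // The result plays the role of the `documents` argument.
    public static string BuildDocuments<T>(IEnumerable<T> recipes)
    {
        return string.Join(",", recipes.Select(r => JsonSerializer.Serialize(r)));
    }

    // Mirror the concatenation used for the system message:
    // instructions first, then the retrieved recipe data.
    public static string BuildSystemMessage(string systemPrompt, string documents)
    {
        return systemPrompt + documents;
    }
}
```

Because the search pipeline projects out the `embedding` field, the serialized recipes carry only the readable content, which keeps the 1536 floats per recipe out of the prompt and out of the token count.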