NeMo Guardrails Advanced Configuration: The Ultimate Guide to Custom LLM Providers and Embedding Search

[Free download link] NeMo-Guardrails: NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. Project URL: https://gitcode.com/gh_mirrors/ne/NeMo-Guardrails

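NeMo Guardrails is an open-source toolkit for adding programmable guardrails to LLM-based conversational systems. Once you know the basics, understanding how to configure custom LLM providers and embedding search lets you build more flexible and capable AI applications. This article walks through both parts of the advanced configuration, along with best practices for each.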

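Why customize the configuration?

When building enterprise AI applications, you may run into the following needs:

Integrating private LLM models: use specialized models trained inside your company
Optimizing embedding search performance: tune the vector retrieval strategy for a specific scenario
Supporting multiple cloud services: combine providers such as OpenAI, Azure, and Cohere
Implementing specific business logic: tailor the embedding search to the needs of your industry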

Configuring a custom LLM provider

Text completion models (BaseLLM)

For models that take a string prompt, subclass BaseLLM:

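from typing import Any, List, Optional

from langchain_core.language_models.llms import BaseLLM

from nemoguardrails.llm.providers import register_llm_provider


class MyCustomLLM(BaseLLM):
    """Custom text completion LLM."""

    @property
    def _llm_type(self) -> str:
        return "my_custom_llm"

    async def _acall(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> str:
        """Asynchronous text completion (recommended)."""
        # Implement your asynchronous logic here
        return "Generated text response"


# Note: LangChain's BaseLLM also declares the synchronous _generate as abstract
# (its LLM subclass asks for _call instead), so a synchronous method has to be
# added before this class can be instantiated.

# Register the provider
register_llm_provider("my_custom_llm", MyCustomLLM)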

Chat models (BaseChatModel)

For message-based conversational models, subclass BaseChatModel:

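from typing import Any, List, Optional

from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult

from nemoguardrails.llm.providers import register_chat_provider


class MyCustomChatModel(BaseChatModel):
    """Custom chat model."""

    @property
    def _llm_type(self) -> str:
        return "my_custom_chat"

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> ChatResult:
        """Synchronous chat completion (abstract in BaseChatModel, so it must be defined)."""
        response_text = "Generated chat response"
        message = AIMessage(content=response_text)
        generation = ChatGeneration(message=message)
        return ChatResult(generations=[generation])

    async def _agenerate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> ChatResult:
        """Asynchronous chat completion."""
        # Replace with real async logic; this placeholder reuses the synchronous path
        return self._generate(messages, stop=stop, **kwargs)


# Register the provider
register_chat_provider("my_custom_chat", MyCustomChatModel)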

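Using a custom provider in the configuration file

Reference your custom provider in config.yml:

models:
  - type: main
    engine: my_custom_llm  # or my_custom_chat
    model: optional-model-name  # placeholder; the model name is optional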

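Configuring custom embedding search

Default embedding search configuration

By default, NeMo Guardrails uses FastEmbed to compute embeddings and Annoy to perform the search: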

core:
  embedding_search_provider:
    name: default
    parameters:
      embedding_engine: FastEmbed
      embedding_model: all-MiniLM-L6-v2
      use_batching: False
    cache:
      enabled: False

Using OpenAI embeddings

You can also configure OpenAI embeddings:

core:
  embedding_search_provider:
    name: default
    parameters:
      embedding_engine: openai
      embedding_model: text-embedding-ada-002

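Implementing a custom embedding search provider

To implement a custom embedding search provider, subclass EmbeddingsIndex, whose interface looks like this: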

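from typing import List

# IndexItem is the item type stored in the index; it is defined by the library
# alongside this class.


class EmbeddingsIndex:
    """The embeddings index is responsible for computing and searching a set of embeddings."""

    @property
    def embedding_size(self):
        raise NotImplementedError

    async def _get_embeddings(self, texts: List[str]):
        raise NotImplementedError

    async def add_item(self, item: IndexItem):
        """Add a new item to the index."""
        raise NotImplementedError()

    async def search(self, text: str, max_results: int) -> List[IndexItem]:
        """Search the index for the closest matches to the given text."""
        raise NotImplementedError()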

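Registering a custom embedding search provider

Register your custom embedding search provider in config.py:

from nemoguardrails import LLMRails


def init(app: LLMRails):
    # SimpleEmbeddingSearchProvider is your own EmbeddingsIndex subclass
    app.register_embedding_search_provider("simple", SimpleEmbeddingSearchProvider)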
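
As a rough illustration of what a class like SimpleEmbeddingSearchProvider could contain, here is a minimal in-memory sketch. It is not the library's implementation: the IndexItem dataclass is a local stand-in (assumed to hold a text plus metadata), the letter-frequency "embedding" is a toy chosen only to keep the example dependency-free, and in a real project the class would subclass EmbeddingsIndex and use a proper embedding model.

```python
import asyncio
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IndexItem:
    """Local stand-in for the library's item type (assumption: text + metadata)."""
    text: str
    meta: Dict = field(default_factory=dict)


def _cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SimpleEmbeddingSearchProvider:
    """Toy in-memory index: letter-frequency vectors ranked by cosine similarity."""

    def __init__(self) -> None:
        self._items: List[IndexItem] = []
        self._vectors: List[List[float]] = []

    @property
    def embedding_size(self) -> int:
        return 26  # one dimension per lowercase ASCII letter

    async def _get_embeddings(self, texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * self.embedding_size
            for ch in text.lower():
                if "a" <= ch <= "z":
                    vec[ord(ch) - ord("a")] += 1.0
            vectors.append(vec)
        return vectors

    async def add_item(self, item: IndexItem) -> None:
        """Embed the item's text and store both the item and its vector."""
        (vector,) = await self._get_embeddings([item.text])
        self._items.append(item)
        self._vectors.append(vector)

    async def search(self, text: str, max_results: int) -> List[IndexItem]:
        """Return up to max_results items, most similar first."""
        (query,) = await self._get_embeddings([text])
        scored = [
            (_cosine(query, vec), item)
            for vec, item in zip(self._vectors, self._items)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [item for _, item in scored[:max_results]]
```

The brute-force scan over all stored vectors is fine for a handful of items; a larger knowledge base would normally call for an approximate nearest-neighbour index instead.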

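Best practices for advanced configuration

1. Implement async first

For better performance, always implement the asynchronous methods:

_acall (for BaseLLM)
_agenerate (for BaseChatModel)

2. Choose the right base class

Use BaseLLM for text completion models (prompt → text)
Use BaseChatModel for chat models (messages → message)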

3. Enable caching

To make embedding search more efficient, you can enable the cache:

core:
  embedding_search_provider:
    name: default
    parameters:
      embedding_engine: openai
    cache:
      enabled: True
      key_generator: sha256
      store: filesystem
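
To make the idea behind the key_generator and store settings concrete, the sketch below hashes each text with SHA-256 and looks the vector up before calling the embedding function. It is only an illustration under stated assumptions: CachedEmbedder and embed_fn are hypothetical names, the store is an in-memory dict rather than the filesystem, and the library's own cache may work differently.

```python
import hashlib
from typing import Callable, Dict, List


class CachedEmbedder:
    """Hypothetical wrapper that caches embeddings keyed by the SHA-256 of the text."""

    def __init__(self, embed_fn: Callable[[List[str]], List[List[float]]]) -> None:
        self._embed_fn = embed_fn                 # computes embeddings for a list of texts
        self._store: Dict[str, List[float]] = {}  # in-memory stand-in for a persistent store

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, texts: List[str]) -> List[List[float]]:
        keys = [self._key(t) for t in texts]

        # Collect texts that are not cached yet (deduplicated, order preserved).
        missing: Dict[str, str] = {}
        for key, text in zip(keys, texts):
            if key not in self._store and key not in missing:
                missing[key] = text

        # Compute only the missing embeddings, in a single call.
        if missing:
            vectors = self._embed_fn(list(missing.values()))
            for key, vector in zip(missing.keys(), vectors):
                self._store[key] = vector

        return [self._store[key] for key in keys]
```

Hashing the text keeps keys at a fixed length regardless of input size, which matters when the keys end up as file names in a filesystem-backed store.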

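4. Batching

The default embedding provider includes batching: it starts the embedding generation after a latency of 10 milliseconds, so requests that arrive within that window can be embedded together.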
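
The delay-then-flush pattern can be sketched with asyncio as follows. This is a generic illustration, not the library's code: BatchingEmbedder and embed_batch are hypothetical names, and the only detail taken from the text above is the 10 ms window.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Tuple


class BatchingEmbedder:
    """Hypothetical batcher: collects requests for a short window, then embeds them in one call."""

    def __init__(
        self,
        embed_batch: Callable[[List[str]], Awaitable[List[List[float]]]],
        max_delay: float = 0.01,  # 10 ms window
    ) -> None:
        self._embed_batch = embed_batch
        self._max_delay = max_delay
        self._pending: List[Tuple[str, asyncio.Future]] = []
        self._flush_task: Optional[asyncio.Task] = None

    async def embed(self, text: str) -> List[float]:
        """Queue one text and wait for its vector from the next batch."""
        future = asyncio.get_running_loop().create_future()
        self._pending.append((text, future))
        # The first request in a window schedules the flush.
        if self._flush_task is None:
            self._flush_task = asyncio.create_task(self._flush_later())
        return await future

    async def _flush_later(self) -> None:
        await asyncio.sleep(self._max_delay)
        # Take ownership of the queued requests and open a new window.
        batch, self._pending = self._pending, []
        self._flush_task = None
        try:
            vectors = await self._embed_batch([text for text, _ in batch])
        except Exception as exc:
            # Propagate the failure to every caller waiting on this batch.
            for _, future in batch:
                if not future.done():
                    future.set_exception(exc)
            return
        for (_, future), vector in zip(batch, vectors):
            if not future.done():
                future.set_result(vector)
```

The trade-off is a small fixed latency on every request in exchange for fewer calls to the embedding backend when many requests arrive at once.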

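Example use cases

Integrating a private model

If your company has an internally trained LLM, a custom LLM provider lets you plug it into NeMo Guardrails.

Searching multiple knowledge bases

In a RAG application, you can configure different embedding search providers to handle different types of knowledge base.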

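Validating and testing the configuration

After finishing a custom configuration, make sure to run the following checks:

Functional tests: confirm that the custom provider handles requests correctly
Performance tests: verify how the asynchronous methods perform
Integration tests: check compatibility with your existing guardrails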
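
The first two checks can be sketched with plain asyncio, assuming a provider that exposes an async _acall as in the examples earlier. StubProvider is a hypothetical stand-in that sleeps to simulate model latency; in a real test you would substitute your own provider and would normally go through the framework's public async entry point rather than calling _acall directly.

```python
import asyncio
import time
from typing import List, Tuple


class StubProvider:
    """Hypothetical stand-in for a custom provider; sleeps to simulate model latency."""

    async def _acall(self, prompt: str, **kwargs) -> str:
        await asyncio.sleep(0.05)
        return f"echo: {prompt}"


async def run_concurrently(provider, prompts: List[str]) -> List[str]:
    """Issue all prompts at once and return the responses in prompt order."""
    return await asyncio.gather(*(provider._acall(p) for p in prompts))


def check_provider(provider, n: int = 20) -> Tuple[List[str], float]:
    """Return the responses and the wall-clock time for n concurrent requests."""
    prompts = [f"prompt {i}" for i in range(n)]
    start = time.perf_counter()
    responses = asyncio.run(run_concurrently(provider, prompts))
    elapsed = time.perf_counter() - start
    return responses, elapsed


if __name__ == "__main__":
    responses, elapsed = check_provider(StubProvider())
    # Functional check: every prompt produced a non-empty response.
    assert all(responses)
    # Performance check: concurrent calls should take far less than n * latency.
    print(f"{len(responses)} responses in {elapsed:.3f}s")
```

If the elapsed time grows linearly with the number of requests, the async method is probably blocking the event loop, for example by calling a synchronous HTTP client inside it.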

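Summary

Mastering the advanced configuration of NeMo Guardrails lets you build more flexible and capable conversational AI systems. Custom LLM providers and embedding search open up many options for enterprise applications. Good configuration improves performance and also helps keep the application safe and reliable.

With this guide as a starting point, you can begin building a guardrails setup tailored to your own needs.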

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.