GraphRAG + Ollama: two "gotchas" and how to fix them

https://microsoft.github.io/graphrag/posts/get_started/

Following the official getting-started guide, you edit the settings file to point at a local Ollama instance.
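
For reference, here is a minimal sketch of the settings fields involved. The model names and the 0.x-era key layout are assumptions on my part, so adjust them to whatever you actually pulled with Ollama:

llm:
  api_key: ollama                  # any non-empty placeholder; Ollama ignores it
  type: openai_chat
  model: llama3                    # assumption: your pulled chat model
  api_base: http://localhost:11434/v1

embeddings:
  llm:
    api_key: ollama
    type: openai_embedding
    model: nomic-embed-text        # assumption: your pulled embedding model
    api_base: http://localhost:11434/api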

With that in place, two common problems show up. They took a long time to track down, so I'm sharing the fixes here in the hope that they help.

Global query error: json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Fix:

From the issue thread (link below): "I found out that my local Ollama instance (0.3.0) seemed to ignore the system prompt and I got it working by manually stitching together the two prompts into one." The file to edit is /graphrag/query/structured_search/global_search/search.py.

https://github.com/microsoft/graphrag/issues/575

Be sure to check which directory's /graphrag/query/structured_search/global_search/search.py the traceback is actually pointing at, typically the copy inside your active environment's site-packages.

Edit that search.py.
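
If you are unsure which copy of the file your interpreter actually loads, a quick check (plain standard-library Python, nothing graphrag-specific):

import graphrag.query.structured_search.global_search.search as search_module
print(search_module.__file__)  # path of the search.py that will actually run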

Search for _map_response_single_batch and change how search_messages is built:

# original:
# search_messages = [
#     {"role": "system", "content": search_prompt},
#     {"role": "user", "content": query},
# ]

# replacement: fold the system prompt into the single user message
search_messages = [
    {"role": "user", "content": search_prompt + "\n\n### USER QUESTION ### \n\n" + query}
]
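
Why merging the prompts helps: the global-search map step asks the model to reply in JSON and then parses that reply. When a local model ignores the system prompt and returns an empty or non-JSON answer, json.loads() fails with exactly the error above. A minimal reproduction of the error itself:

import json

json.loads("")  # json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)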

Local query error: ZeroDivisionError: Weights sum to zero, can't be normalized

Fix:

From the issue discussion: "Yes, this is due to your locally run embedding model not returning the embeddings in a correct format. OpenAI internally uses base64-encoded floats, and most other models will return floats as plain numbers."

https://github.com/microsoft/graphrag/issues/619

https://github.com/microsoft/graphrag/issues/345
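
For intuition on where the ZeroDivisionError comes from: in embed() below, every chunk that fails to embed is skipped, so if the model returns nothing usable, chunk_embeddings and chunk_lens stay empty and the weighted average blows up. A minimal reproduction:

import numpy as np

# every chunk failed -> no embeddings, no weights
np.average([], axis=0, weights=[])
# ZeroDivisionError: Weights sum to zero, can't be normalized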

Again, be sure to check which directory's graphrag\query\llm\oai\embedding.py the traceback is actually pointing at.

Replace that embedding.py with the version below (the same __file__ trick shown above will locate it):

# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

"""OpenAI Embedding model implementation."""

import asyncio
from collections.abc import Callable
from typing import Any

import numpy as np
import tiktoken
from tenacity import (
    AsyncRetrying,
    RetryError,
    Retrying,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential_jitter,
)

from graphrag.query.llm.base import BaseTextEmbedding
from graphrag.query.llm.oai.base import OpenAILLMImpl
from graphrag.query.llm.oai.typing import (
    OPENAI_RETRY_ERROR_TYPES,
    OpenaiApiType,
)
from graphrag.query.llm.text_utils import chunk_text
from graphrag.query.progress import StatusReporter

from langchain_community.embeddings import OllamaEmbeddings


class OpenAIEmbedding(BaseTextEmbedding, OpenAILLMImpl):
    """Wrapper for OpenAI Embedding models."""

    def __init__(
        self,
        api_key: str | None = None,
        azure_ad_token_provider: Callable | None = None,
        model: str = "text-embedding-3-small",
        deployment_name: str | None = None,
        api_base: str | None = None,
        api_version: str | None = None,
        api_type: OpenaiApiType = OpenaiApiType.OpenAI,
        organization: str | None = None,
        encoding_name: str = "cl100k_base",
        max_tokens: int = 8191,
        max_retries: int = 10,
        request_timeout: float = 180.0,
        retry_error_types: tuple[type[BaseException]] = OPENAI_RETRY_ERROR_TYPES,  # type: ignore
        reporter: StatusReporter | None = None,
    ):
        OpenAILLMImpl.__init__(
            self=self,
            api_key=api_key,
            azure_ad_token_provider=azure_ad_token_provider,
            deployment_name=deployment_name,
            api_base=api_base,
            api_version=api_version,
            api_type=api_type,  # type: ignore
            organization=organization,
            max_retries=max_retries,
            request_timeout=request_timeout,
            reporter=reporter,
        )

        self.model = model
        self.encoding_name = encoding_name
        self.max_tokens = max_tokens
        self.token_encoder = tiktoken.get_encoding(self.encoding_name)
        self.retry_error_types = retry_error_types

    def embed(self, text: str, **kwargs: Any) -> list[float]:
        """
        Embed text using OpenAI Embedding's sync function.

        For text longer than max_tokens, chunk texts into max_tokens, embed each chunk, then combine using weighted average.
        Please refer to: https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_long_inputs.ipynb
        """
        token_chunks = chunk_text(
            text=text, token_encoder=self.token_encoder, max_tokens=self.max_tokens
        )
        chunk_embeddings = []
        chunk_lens = []
        for chunk in token_chunks:
            try:
                embedding, chunk_len = self._embed_with_retry(chunk, **kwargs)
                chunk_embeddings.append(embedding)
                chunk_lens.append(chunk_len)
            # TODO: catch a more specific exception
            except Exception as e:  # noqa BLE001
                self._reporter.error(
                    message="Error embedding chunk",
                    details={self.__class__.__name__: str(e)},
                )

                continue
        chunk_embeddings = np.average(chunk_embeddings, axis=0, weights=chunk_lens)
        chunk_embeddings = chunk_embeddings / np.linalg.norm(chunk_embeddings)
        return chunk_embeddings.tolist()

    async def aembed(self, text: str, **kwargs: Any) -> list[float]:
        """
        Embed text using OpenAI Embedding's async function.

        For text longer than max_tokens, chunk texts into max_tokens, embed each chunk, then combine using weighted average.
        """
        token_chunks = chunk_text(
            text=text, token_encoder=self.token_encoder, max_tokens=self.max_tokens
        )
        chunk_embeddings = []
        chunk_lens = []
        embedding_results = await asyncio.gather(*[
            self._aembed_with_retry(chunk, **kwargs) for chunk in token_chunks
        ])
        embedding_results = [result for result in embedding_results if result[0]]
        chunk_embeddings = [result[0] for result in embedding_results]
        chunk_lens = [result[1] for result in embedding_results]
        chunk_embeddings = np.average(chunk_embeddings, axis=0, weights=chunk_lens)  # type: ignore
        chunk_embeddings = chunk_embeddings / np.linalg.norm(chunk_embeddings)
        return chunk_embeddings.tolist()

    def _embed_with_retry(
        self, text: str | tuple, **kwargs: Any
    ) -> tuple[list[float], int]:
        try:
            retryer = Retrying(
                stop=stop_after_attempt(self.max_retries),
                wait=wait_exponential_jitter(max=10),
                reraise=True,
                retry=retry_if_exception_type(self.retry_error_types),
            )
            for attempt in retryer:
                with attempt:
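                    # Call the local Ollama embedding model instead of the OpenAI
                    # endpoint. Note: the signature allows str | tuple; if your
                    # graphrag version passes token tuples here rather than strings,
                    # decode them with self.token_encoder.decode() first.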
                    embedding = (
                        OllamaEmbeddings(
                            model=self.model,
                        ).embed_query(text)
                        or []
                    )
                    return (embedding, len(text))
        except RetryError as e:
            self._reporter.error(
                message="Error at embed_with_retry()",
                details={self.__class__.__name__: str(e)},
            )
            return ([], 0)
        else:
            # TODO: why not just throw in this case?
            return ([], 0)

    async def _aembed_with_retry(
        self, text: str | tuple, **kwargs: Any
    ) -> tuple[list[float], int]:
        try:
            retryer = AsyncRetrying(
                stop=stop_after_attempt(self.max_retries),
                wait=wait_exponential_jitter(max=10),
                reraise=True,
                retry=retry_if_exception_type(self.retry_error_types),
            )
            async for attempt in retryer:
                with attempt:
                    # Use the async variant: awaiting the sync embed_query()
                    # result (a plain list) would raise a TypeError.
                    embedding = (
                        await OllamaEmbeddings(
                            model=self.model,
                        ).aembed_query(text)
                        or []
                    )
                    return (embedding, len(text))
        except RetryError as e:
            self._reporter.error(
                message="Error at embed_with_retry()",
                details={self.__class__.__name__: str(e)},
            )
            return ([], 0)
        else:
            # TODO: why not just throw in this case?
            return ([], 0)

What changed?

One import was added:

from langchain_community.embeddings import OllamaEmbeddings

Then the embedding call was swapped in two places. Search for:

for attempt in retryer:

and:

async for attempt in retryer:

In each retry block, diff against the stock file and replace the OpenAI embedding call with the OllamaEmbeddings call shown above (the sync embed_query in the first, the async aembed_query in the second). A quick smoke test of the patched class follows.
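
The model name and the dummy api_key below are assumptions; the OpenAI client is still constructed but never called, since embedding now goes through OllamaEmbeddings:

from graphrag.query.llm.oai.embedding import OpenAIEmbedding

emb = OpenAIEmbedding(api_key="ollama", model="nomic-embed-text")
vec = emb.embed("hello graphrag")
print(len(vec))  # embedding dimension, e.g. 768 for nomic-embed-text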
