pymilvus.exceptions.MilvusException: <MilvusException: (code=65535, message=cannot parse expression:

叹彡远方

Last modified 2024-12-19 13:30:50

Tags: python milvus

First published 2024-12-19 00:25:46

Article link: https://blog.youkuaiyun.com/qq_47060526/article/details/144572492

Querying the Milvus vector database by id raised the error in the title.

To store documents in Milvus, I used the code from the official tutorial, Question Answering over Documents with Milvus and LangChain:

embeddings = OpenAIEmbeddings()
connection_args = { 'uri': URI, 'token': TOKEN }

vector_store = Milvus(
    embedding_function=embeddings,
    connection_args=connection_args,
    collection_name=COLLECTION_NAME,
    drop_old=True,
).from_documents(
    all_splits,
    embedding=embeddings,
    collection_name=COLLECTION_NAME,
    connection_args=connection_args,
)

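On inspection, the collection indeed has no id field.
However, the pk field is effectively the id, so when querying you need to change the expr in your query code from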

collection.query(expr=f"id in [{res_id}]", output_fields=["text"])

to:

collection.query(expr=f"pk in [{res_pk}]", output_fields=["text"])
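The filter string can also be assembled by a small helper instead of an inline f-string. The following is only a sketch: `build_in_expr` is a hypothetical name, not part of pymilvus, and it assumes that string primary keys must appear as quoted literals in the expression while integer keys are written bare.

```python
import json


def build_in_expr(values, field="pk"):
    """Build a filter string such as 'pk in [1, 2]' (hypothetical helper)."""
    literals = []
    for value in values:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, str)):
            raise TypeError(f"unsupported primary-key value: {value!r}")
        if isinstance(value, str):
            # Assumption: string keys need to be quoted and escaped.
            literals.append(json.dumps(value, ensure_ascii=False))
        else:
            literals.append(str(value))
    if not literals:
        raise ValueError("at least one value is required")
    return f"{field} in [{', '.join(literals)}]"


# Usage with the objects from this post (collection and res_pk):
# collection.query(expr=build_in_expr([res_pk]), output_fields=["text"])
```

Passing the field name as an argument keeps the helper usable if the primary key is ever named something other than pk.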

Looking up the parameters in the official documentation, I found that auto_id (bool) defaults to False, so it has to be passed explicitly, like this:

vector_store = Milvus(
    embedding_function=embeddings,
    connection_args=connection_args,
    collection_name=COLLECTION_NAME,
    drop_old=True,
    auto_id=True,
).from_documents(
    all_splits,
    embedding=embeddings,
    collection_name=COLLECTION_NAME,
    connection_args=connection_args,
)

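However, after setting auto_id=True and rebuilding, the fields of the new collection were no different from those of the original one, and I still do not know how to customize field names when embedding documents into Milvus this way. If anyone can help, I would be very grateful!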
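One possible explanation, offered as a guess rather than a confirmed diagnosis: from_documents is a classmethod on LangChain vector stores, so calling it on an instance builds a second, independent instance from its own arguments. Options given only to the first constructor call would then never reach the store that actually writes the documents. The stand-in class below (`Store` is hypothetical and does not touch Milvus) reproduces that pattern in plain Python.

```python
class Store:
    """Hypothetical stand-in for a vector store with a classmethod factory."""

    def __init__(self, name, auto_id=False):
        self.name = name
        self.auto_id = auto_id
        self.docs = []

    @classmethod
    def from_documents(cls, docs, name, **kwargs):
        # A fresh instance is built from the factory's own arguments only.
        store = cls(name, **kwargs)
        store.docs = list(docs)
        return store


# Chained call, as in the snippets above: auto_id=True goes to a throwaway instance.
chained = Store("demo", auto_id=True).from_documents(["a"], name="demo")
# Direct call: the option is forwarded to the instance that holds the documents.
direct = Store.from_documents(["a"], name="demo", auto_id=True)

print(chained.auto_id, direct.auto_id)
```

If the same holds for the Milvus wrapper, passing auto_id=True to from_documents itself may be what is needed. The same may apply to field names: the LangChain Milvus constructor, in the versions I am aware of, accepts primary_field, text_field and vector_field, but check the signature of your installed version before relying on them.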

Here are some of the other parameters as well, so I can come back to them if I ever need them.

embedding_function (Embeddings): Function used to embed the text.
collection_name (str): Which Milvus collection to use. Defaults to "LangChainCollection".
collection_description (str): The description of the collection. Defaults to "".
collection_properties (Optional[dict[str, any]]): The collection properties. Defaults to None. If set, will override the collection's existing properties. For example: {"collection.ttl.seconds": 60}.
connection_args (Optional[dict[str, any]]): The connection args used for this class, given in the form of a dict.
consistency_level (str): The consistency level to use for a collection. Defaults to "Session".
index_params (Optional[dict]): Which index params to use. Defaults to HNSW/AUTOINDEX depending on service.
search_params (Optional[dict]): Which search params to use. Defaults to the default of the index.
drop_old (Optional[bool]): Whether to drop the current collection. Defaults to False.
auto_id (bool): Whether to enable auto id for primary key. Defaults to False.
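As a quick reminder of how these fit together, the parameters can be collected into one dictionary of keyword arguments. This is only a sketch: the values are the documented defaults or the documented example, except auto_id, which is switched on as in this post. `milvus_kwargs` is a name made up for illustration.

```python
# Keyword arguments mirroring the parameter list above.
milvus_kwargs = {
    "collection_name": "LangChainCollection",  # documented default
    "collection_description": "",              # documented default
    "collection_properties": {"collection.ttl.seconds": 60},  # documented example
    "consistency_level": "Session",            # documented default
    "drop_old": False,                         # documented default
    "auto_id": True,                           # default is False; enabled as in this post
}

# With embeddings and connection_args defined as in the earlier snippets:
# vector_store = Milvus(
#     embedding_function=embeddings,
#     connection_args=connection_args,
#     **milvus_kwargs,
# )
```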