Marqo in Practice: Text-to-Image Search in 5 Lines of Code

Article link: https://blog.youkuaiyun.com/gitblog_00253/article/details/148508471


marqo Vector search for humans. Also available on cloud - cloud.marqo.ai Project repository: https://gitcode.com/gh_mirrors/ma/marqo

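Introduction

In an era of rapidly advancing multimodal AI, efficient cross-modal search has become an important problem. This article shows how to use Marqo, an open-source tensor search engine, to build a working text-to-image search system with only about five lines of core code.

Environment Setup

Install Marqo

First, start the Marqo service with Docker: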

docker rm -f marqo
docker pull marqoai/marqo:2.0.0
docker run --name marqo -it -p 8882:8882 --add-host host.docker.internal:host-gateway marqoai/marqo:2.0.0

Then create a Python environment and install the client:

conda create -n marqo-client python=3.8
conda activate marqo-client
pip install marqo matplotlib

Verify that the client is installed by importing it and pointing it at the local service:

import marqo
mq = marqo.Client("http://localhost:8882")

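Prepare the Test Data

We use five sample images from the COCO dataset as test data. They cover a variety of everyday scenes, which makes them well suited to demonstrating cross-modal search. The code below expects them as .jpg files in ./data/.

Core Implementation Steps

1. Create an Index

An index is the basic unit in which Marqo stores and retrieves data. We need to create one that supports multimodal search: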

index_name = 'image-search-guide'

settings = {
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
    "treatUrlsAndPointersAsImages": True,
}

mq.create_index(index_name, settings_dict=settings)

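Key parameters:

model: the vision-language model used to embed both text and images, here an OpenCLIP ViT-B/32 checkpoint from the CLIP family
treatUrlsAndPointersAsImages: must be True so that Marqo downloads and embeds URL fields as images instead of treating them as plain text strings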
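
To make the role of the model more concrete, here is a minimal sketch of the ranking idea behind CLIP-style search: the text query and every image are mapped into the same vector space, and images are ranked by cosine similarity to the query. The three-dimensional vectors and file names are invented for illustration. In practice Marqo computes the embeddings with the configured model and handles storage and retrieval internally, so this brute-force loop is only a conceptual picture, not Marqo's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(query_vector, image_vectors, limit=1):
    """Return the names of the `limit` images most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), name)
              for name, vec in image_vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:limit]]


# Hypothetical toy embeddings; real CLIP vectors have hundreds of dimensions.
toy_image_vectors = {
    "horse.jpg": [0.9, 0.1, 0.0],
    "kitchen.jpg": [0.0, 0.2, 0.9],
    "street.jpg": [0.1, 0.9, 0.2],
}
toy_query_vector = [0.8, 0.2, 0.1]  # stands in for an embedded text query

best_match = rank_images(toy_query_vector, toy_image_vectors, limit=1)
print(best_match)
```

The limit argument here plays the same role as the limit parameter of the search call used later in this article.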

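2. Serve Local Images

Because Marqo runs inside a Docker container, the container needs a way to reach image files on the host. A simple solution is to start a local HTTP server: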

import subprocess

local_dir = "./data/"
# Serve local_dir on port 8222 in the background; call server.terminate() when finished
server = subprocess.Popen(['python3', '-m', 'http.server', '8222', '--directory', local_dir],
                          stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)
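
As an alternative to spawning a separate process, the same directory can be served from inside the Python session, which makes it easier to shut the server down afterwards. The following is a minimal sketch using only the standard library. The names start_image_server, stop_image_server, and QuietHandler are hypothetical helpers and not part of Marqo. The sketch binds to all interfaces by default, on the assumption that the container reaches the host through host.docker.internal as configured in the docker run command.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class QuietHandler(SimpleHTTPRequestHandler):
    """File handler that suppresses the per-request log lines."""

    def log_message(self, format, *args):
        pass


def start_image_server(directory, port=8222, host="0.0.0.0"):
    """Serve `directory` over HTTP from a background thread and return the server."""
    handler = functools.partial(QuietHandler, directory=directory)
    server = ThreadingHTTPServer((host, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def stop_image_server(server):
    """Stop serving requests and release the port."""
    server.shutdown()
    server.server_close()
```

Typical use would be server = start_image_server("./data/") before indexing and stop_image_server(server) once the images have been indexed.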

Then build image URLs that the Docker container can reach:

import glob
import os

locators = glob.glob(local_dir + '*.jpg')
docker_path = "http://host.docker.internal:8222/"
image_docker = [docker_path + os.path.basename(f) for f in locators]
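
The snippet above works for simple file names. If file names might contain spaces or other characters that are not URL-safe, a slightly more defensive variant can percent-encode them and sort the names so that document IDs are stable between runs. This is a sketch under those assumptions; to_container_urls is a hypothetical helper name and the example paths are invented.

```python
import os
from urllib.parse import quote


def to_container_urls(local_paths, base_url="http://host.docker.internal:8222/"):
    """Map local file paths to URLs that the Marqo container can fetch."""
    # Sort by file name so the order (and any IDs derived from it) is deterministic.
    names = sorted(os.path.basename(path) for path in local_paths)
    # Percent-encode each name so spaces and similar characters survive in a URL.
    return [base_url + quote(name) for name in names]


example_urls = to_container_urls(["./data/image 2.jpg", "./data/image1.jpg"])
print(example_urls)
```

Note that if names are encoded this way, mapping a hit back to a local file needs the reverse step (urllib.parse.unquote) rather than a plain string replace.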

3. Build and Index the Documents

Marqo expects documents as a list of dictionaries:

documents = [{"image_docker": image, "_id": str(idx)} 
            for idx, image in enumerate(image_docker)]

Add the documents to the index:

mq.index(index_name).add_documents(
    documents, 
    tensor_fields=["image_docker"], 
    device="cpu", 
    client_batch_size=1
)

4. Run a Search

Now we can search the images with a natural-language description:

search_results = mq.index(index_name).search(
    "A rider on a horse jumping over the barrier", 
    limit=1,
    device='cpu'
)

View the result:

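from PIL import Image
# Map the hit's container URL back to the local file path
fig_path = search_results["hits"][0]["image_docker"].replace(docker_path, local_dir)
# display() is built into Jupyter/IPython; in a plain script use Image.open(fig_path).show()
display(Image.open(fig_path))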
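
When limit is greater than 1, it can be handy to turn every hit into a local path together with its score. The sketch below assumes the response has the shape used above (a "hits" list whose entries carry the indexed image_docker field and a "_score"). The mock response, its scores, and the helper name hits_to_local_paths are invented for illustration, so the example runs without a Marqo server.

```python
import os
from urllib.parse import unquote, urlparse


def hits_to_local_paths(search_results, local_dir="./data/", field="image_docker"):
    """Return (local_path, score) pairs for every hit, in the order returned."""
    pairs = []
    for hit in search_results["hits"]:
        # Keep only the file name from the URL and undo any percent-encoding.
        file_name = os.path.basename(unquote(urlparse(hit[field]).path))
        pairs.append((os.path.join(local_dir, file_name), hit["_score"]))
    return pairs


# Hypothetical response; real scores depend on the model and the data.
mock_results = {
    "hits": [
        {"_id": "3", "image_docker": "http://host.docker.internal:8222/image3.jpg", "_score": 0.61},
        {"_id": "0", "image_docker": "http://host.docker.internal:8222/image0.jpg", "_score": 0.48},
    ]
}

for path, score in hits_to_local_paths(mock_results):
    print(path, score)
```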

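Performance Tips

GPU acceleration: if a CUDA device is available, setting device="cuda" can significantly speed up indexing and search; the container must also be started with GPU access (for example, docker run --gpus all)
Batching: increasing client_batch_size sends more documents per request, which can improve throughput on large datasets
Model choice: pick the CLIP variant that fits your accuracy and latency needs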
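
To illustrate the batching tip, the sketch below shows the general idea of client-side batching: a list of documents is split into consecutive chunks and each chunk would be sent as one request. With a batch size of 1, as in the indexing call earlier, five documents mean five requests. This is a conceptual illustration only; split_into_batches is a hypothetical helper and not the Marqo client's actual code.

```python
def split_into_batches(documents, batch_size):
    """Yield consecutive slices of `documents` with at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


# Five placeholder documents, mirroring the five sample images.
docs = [{"_id": str(i)} for i in range(5)]

batch_sizes = [len(batch) for batch in split_into_batches(docs, 2)]
print(batch_sizes)  # three batches: two full ones and a final partial one
```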

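Extending to Other Use Cases

Marqo's multimodal search can be applied in many scenarios:

Visual search on e-commerce platforms
Image recognition in content management
Multimedia asset management
Content recommendation on social media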

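Summary

As this article shows, implementing text-to-image search with Marqo is simple and efficient. The core workflow is:

Set up the environment
Create an index that supports multimodal search
Prepare and index the image data
Run natural-language searches

Marqo's strength is that it abstracts away the multimodal embedding and similarity computation, so developers can focus on business logic. For teams that want to add cross-modal search quickly, Marqo is a solution well worth considering.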


Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.