Getting started with the Sphinx search engine

In this article, we will talk about the Sphinx search engine and how to install it on the Windows operating system.

This is the first article in the “Sphinx Search Engine” series, where we will explain how to install and use this search engine to create full-text indexes over relational databases (SQL Server).

Introduction

Sphinx (SQL Phrase Index) is a standalone full-text search engine that provides efficient search functionality to third-party applications, especially those backed by SQL databases. It was developed in 2001 by the Russian developer Andrew Aksyonoff with three goals: (1) good search quality, (2) high search speed, and (3) low resource consumption (disk I/O, CPU). It can be integrated with programming languages such as Python and Java.

The Sphinx search engine has its own data source drivers that are used to interact with different database management systems. We must specify the driver we need in the configuration files.

A research paper published in 2017 by a group of researchers at Moscow Technological University makes a quick comparison between four popular search engines (Sphinx, Apache Solr, Elasticsearch, and Xapian). The results, summarized in the table below, show that Sphinx had the fastest indexing speed (4.5 Mb/s) and a very fast search speed (7/75 ms).

| Feature | Sphinx | Solr | Elasticsearch | Xapian |
| --- | --- | --- | --- | --- |
| Indexing speed (Mb/s) | 4.5 | 2.75 | 3.8 | 1.36 |
| Search speed (ms) | 7/75 | 25/212 | 10/212 | 14/135 |
| Index size (%) | 30 | 20 | 20 | 200 |
| Implementation | Server | Server | Library | Library |
| Interface | API, SQL | Web-service | API | API |
| Search operators | Boolean, prefix search, exact phrase, words near, ranges, word order, zones | Boolean, prefix search (+ wildcards), exact phrase, words near, ranges, approximate search | Boolean, prefix search (+ wildcards), exact phrase, words near, ranges, approximate search | Boolean, prefix search, exact phrase, words near, ranges, approximate search |

Downloading the Sphinx search engine

First, we should download the latest version of the Sphinx search engine (3.2.1 at the time of writing) from the following link.

Sphinx download link

  • Note: In this guide, we will use “E:\Sphinx” as the installation directory.

After downloading the binaries package, we should extract its content (as shown in the image below, we used 7zip as an extraction tool).

Extracting downloaded package

Setting up Sphinx

After extracting the package, we should add a folder called “data” within the extracted directory to store indexes. Then we should create three folders called “index”, “log”, and “binlog” within the created “data” directory.

The extracted directory

Adding the data directory to the Sphinx search engine installation folder

Adding the binlog, index and log folders into the data directory
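
The folder layout described above can also be created by a script. The following is a minimal sketch rather than part of the original setup; the function name `create_data_dirs` is hypothetical, and the installation path is passed in by the caller.

```python
from pathlib import Path


def create_data_dirs(install_dir):
    """Create data/index, data/log and data/binlog under install_dir.

    Returns the three sub-directories as Path objects. Existing
    directories are left untouched.
    """
    data_dir = Path(install_dir) / "data"
    created = []
    for name in ("index", "log", "binlog"):
        sub_dir = data_dir / name
        # parents=True also creates the "data" directory itself
        sub_dir.mkdir(parents=True, exist_ok=True)
        created.append(sub_dir)
    return created
```

For the layout used in this guide, the call would be `create_data_dirs(r"E:\Sphinx\sphinx-3.2.1")`.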

It is good to know that Sphinx has two primary services:

  1. Indexer: This service is used to build full-text indexes. By default, Sphinx reads the source tables from the configuration file located in “<installation directory>\etc\sphinx.conf”
  2. Searchd: This is the daemon used for searching the created indexes. A client is required to query it through the Sphinx API

First, we should create a Windows service to run the Searchd daemon. To do this, we can use the following command from the Windows command prompt:

E:\Sphinx\sphinx-3.2.1\bin\searchd --install --config E:\Sphinx\sphinx-3.2.1\etc\sphinx.conf --servicename SphinxSearch

Creating a Windows service for the Sphinx search engine
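
When the installation is scripted, it can help to assemble the searchd command line in one place. The sketch below only builds the argument list; the function name `searchd_service_command` is hypothetical, and the options are written with two ASCII hyphens (`--install`, `--delete`, `--config`, `--servicename`).

```python
import ntpath


def searchd_service_command(install_dir, action="install",
                            service_name="SphinxSearch"):
    """Build the searchd command that installs or deletes the Windows service.

    ntpath is used so that Windows-style paths are joined the same way
    on any operating system.
    """
    if action not in ("install", "delete"):
        raise ValueError("action must be 'install' or 'delete'")
    command = [ntpath.join(install_dir, "bin", "searchd"), f"--{action}"]
    if action == "install":
        # The service needs to know which configuration file to load
        command += ["--config", ntpath.join(install_dir, "etc", "sphinx.conf")]
    command += ["--servicename", service_name]
    return command
```

On Windows, the returned list could be passed to `subprocess.run(command, check=True)`; creating a service normally requires a command prompt with administrator rights.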

To check that the Windows service was created successfully, we can open the Services console and verify that the SphinxSearch service is listed.

SphinxSearch windows service created

Note that before setting up the Sphinx configuration file, this Windows service cannot be started.

To configure Sphinx, we should create the “sphinx.conf” file within the “E:\Sphinx\sphinx-3.2.1\etc” directory and start by adding the following lines:

searchd
{
listen = 9306:mysql41
pid_file = E:/sphinx/sphinx-3.2.1/data/searchd.pid
log = E:/sphinx/sphinx-3.2.1/data/log/log.txt
query_log = E:/sphinx/sphinx-3.2.1/data/log/query_log.txt
binlog_path = E:/sphinx/sphinx-3.2.1/data/binlog/
}

The listen option specifies that Sphinx will use port 9306 and the MySQL protocol. Using the MySQL protocol allows you to connect to Sphinx as if it were a regular MySQL server. The pid_file setting specifies the location of the .pid file that is used internally. The log and query_log settings specify the locations of the log files, which record server events and executed queries. The binlog_path setting specifies the location of the files that can be used to restore real-time index data after a failure.

To start Sphinx, we must define at least one index in the configuration file. In this article, we will define a fake real-time index by adding the following lines:

index fake_index
{
type = rt
path = E:/sphinx/sphinx-3.2.1/data/index/fake_index
rt_field = fake_field
}
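
Because every path in the two configuration sections shares the same base directory, the file can be generated from a single variable. This is a minimal sketch under that assumption; `render_section` and `build_config` are hypothetical helper names, and the settings are the ones listed in this article.

```python
def render_section(header, settings):
    """Render one sphinx.conf section such as 'searchd' or 'index fake_index'."""
    lines = [header, "{"]
    lines += [f"    {key} = {value}" for key, value in settings.items()]
    lines.append("}")
    return "\n".join(lines)


def build_config(base_dir):
    """Return the minimal configuration text used in this article."""
    data_dir = f"{base_dir}/data"
    searchd = {
        "listen": "9306:mysql41",
        "pid_file": f"{data_dir}/searchd.pid",
        "log": f"{data_dir}/log/log.txt",
        "query_log": f"{data_dir}/log/query_log.txt",
        "binlog_path": f"{data_dir}/binlog/",
    }
    fake_index = {
        "type": "rt",
        "path": f"{data_dir}/index/fake_index",
        "rt_field": "fake_field",
    }
    return (render_section("searchd", searchd) + "\n\n"
            + render_section("index fake_index", fake_index) + "\n")
```

The result of `build_config("E:/sphinx/sphinx-3.2.1")` could then be written to the “etc” folder with an ordinary file write.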

Now, let’s try to use the indexer service to build the indexes using the following command:

E:\Sphinx\sphinx-3.2.1\bin\indexer --all --config E:\Sphinx\sphinx-3.2.1\etc\sphinx.conf --rotate --print-queries

If you are using the Sphinx version 3.2.1 while executing the command above, you may encounter the following error:

“The code execution cannot proceed because ssleay32.dll was not found”

Indexer service throwing an exception

This error occurs because three DLL files are missing from this release. To solve the problem, you can download a previous release (3.1.1) and copy the following files from its bin directory into the bin directory of version 3.2.1:

  • libeay32.dll
  • msvcr120.dll
  • ssleay32.dll
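
The copy step can be scripted as well. The sketch below assumes that both releases have been extracted side by side; the function name `copy_missing_dlls` is hypothetical, and the two bin directories are supplied by the caller.

```python
import shutil
from pathlib import Path

# The three files reported missing from the 3.2.1 package
MISSING_DLLS = ("libeay32.dll", "msvcr120.dll", "ssleay32.dll")


def copy_missing_dlls(old_bin, new_bin):
    """Copy the missing DLLs from old_bin to new_bin.

    Files that already exist in new_bin are not overwritten.
    Returns the names of the files that were actually copied.
    """
    copied = []
    for name in MISSING_DLLS:
        source = Path(old_bin) / name
        target = Path(new_bin) / name
        if source.is_file() and not target.exists():
            shutil.copy2(source, target)
            copied.append(name)
    return copied
```

A call might look like `copy_missing_dlls(r"E:\Sphinx\sphinx-3.1.1\bin", r"E:\Sphinx\sphinx-3.2.1\bin")`, where the first path is only an assumed location for the older release.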

After copying these files, if we re-execute the command above, we will receive the following message (as shown in the image below):

“FATAL: no indexes found in config file”

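This means that the indexer started successfully but did not find an index it could build. The fake index is a real-time index, which is filled at run time through searchd, so the indexer has nothing to do for it.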

Indexer service output message

Now, if we try to start the SphinxSearch Windows service, it will start successfully.

Some useful commands

The following table contains some important commands:

| Command | Description |
| --- | --- |
| E:\Sphinx\sphinx-3.2.1\bin\searchd -h | Show the Searchd help, which lists all available options |
| E:\Sphinx\sphinx-3.2.1\bin\searchd.exe --config E:\Sphinx\sphinx-3.2.1\etc\sphinx.conf | Start the Searchd daemon using the specified configuration file |
| E:\Sphinx\sphinx-3.2.1\bin\searchd.exe --config E:\Sphinx\sphinx-3.2.1\etc\sphinx.conf --logdebug | Start the Searchd daemon using the specified configuration file with debug logging enabled |
| E:\Sphinx\sphinx-3.2.1\bin\searchd --servicename SphinxSearch --delete | Delete the existing SphinxSearch Windows service |

Connecting to Sphinx using MySQL console client

Since Sphinx supports the MySQL protocol, we can use the MySQL console client to connect to Sphinx and execute commands.

First, we need to download and install the MySQL database engine on the local machine. You can download the MySQL community server from the following link.

After installing MySQL Server, open the Windows command line and go to the MySQL binaries directory (in this example the directory is “C:\Program Files\MySQL\MySQL Server 8.0\bin”), and use the MySQL client to connect to localhost port 9306 (specified in the Sphinx configuration file) using the following command:

mysql -h 127.0.0.1 -P 9306

Connecting to Sphinx search engine using MySQL client

As shown in the image above, the server version mentioned in the command prompt output is the Sphinx search engine version (3.2.1-dev (commit f152e0b8)), which means that the connection is established successfully.
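
The version string appears because, in the MySQL protocol, the server opens the conversation with a handshake packet that carries its version. The sketch below reads that packet with a plain socket; it assumes that searchd follows the standard MySQL handshake layout (a 3-byte little-endian payload length, a 1-byte sequence number, a protocol version byte, then a NUL-terminated version string). The function names are hypothetical.

```python
import socket


def parse_handshake(packet):
    """Return (protocol_version, server_version) from a handshake packet."""
    if len(packet) < 6:
        raise ValueError("packet too short")
    payload_length = int.from_bytes(packet[0:3], "little")
    payload = packet[4:4 + payload_length]  # byte 3 is the sequence number
    if len(payload) < payload_length or payload_length < 2:
        raise ValueError("truncated handshake packet")
    protocol_version = payload[0]
    end = payload.index(b"\x00", 1)  # version string is NUL-terminated
    return protocol_version, payload[1:end].decode("ascii", errors="replace")


def _recv_exact(sock, size):
    """Read exactly size bytes or raise if the connection closes early."""
    data = b""
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            raise ConnectionError("connection closed before handshake completed")
        data += chunk
    return data


def read_server_version(host="127.0.0.1", port=9306, timeout=3.0):
    """Connect to the listener and return the handshake information."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        header = _recv_exact(sock, 4)
        payload_length = int.from_bytes(header[0:3], "little")
        return parse_handshake(header + _recv_exact(sock, payload_length))
```

With the SphinxSearch service running, `read_server_version()` should return the same version text that the MySQL client prints in its banner.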

Now, let’s try to execute the “show status” command to view the server status; the result is as shown in the image below:

Executing show status command

Online Resources

The downside of Sphinx is that few online resources are available for it. There are three main resources where you can get useful information:

  1. The official documentation: Where all Sphinx features and tools are explained
  2. SphinxWiki: This page contains a lot of Sphinx-related topics and resources
  3. The book “Introduction to Search with Sphinx”: A concise introduction to Sphinx that shows how to use this tool to index data and provide fast results for both simple and complex searches

Conclusion

In this article, we talked about the Sphinx search engine and why it was developed. Then, we explained how to download and set up this tool on Windows. Finally, we illustrated how to use the MySQL console client to connect to the Sphinx engine.

In the next article in this series, we will talk in detail about Sphinx configuration files and explain how to use them to build full-text indexes over SQL Server databases.

Translated from: https://www.sqlshack.com/getting-started-with-sphinx-search-engine/
