Using the Sphinx Full-Text Search Engine on Linux

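Sphinx is an efficient full-text search engine; comparable projects include Lucene and Xapian. As an aside, I recall a talk by douban mentioning that they moved to Xapian because Sphinx could not meet their needs. In my own experience with Sphinx, though, it handles typical applications with ease.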

I. Download Sphinx

The current release is 0.9.9. It has a small compilation bug (http://sphinxsearch.com/bugs/view.php?id=453) that is easy to fix; alternatively, you can use the latest 1.10-beta.
wget http://sphinxsearch.com/files/sphinx-0.9.9.tar.gz
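Before building, it may be worth checking that the download is intact. The sketch below assumes GNU coreutils' md5sum is available; verify_md5 is a helper name I made up for illustration, and the digest in the usage comment is the md5sums value from the PKGBUILD in section II.

```shell
# Compare a file's MD5 digest with an expected hex string.
# Prints "OK" and returns 0 on a match; reports a mismatch on stderr and returns 1 otherwise.
# verify_md5 is a hypothetical helper, not part of Sphinx.
verify_md5() {
  file=$1
  expected=$2
  # Reading from stdin makes md5sum print "<digest>  -"; keep only the digest
  actual=$(md5sum < "$file" | cut -d' ' -f1)
  if [ "$actual" = "$expected" ]; then
    echo "OK"
  else
    echo "MISMATCH: got $actual" >&2
    return 1
  fi
}

# Usage, after the wget above:
#   verify_md5 sphinx-0.9.9.tar.gz 7b9b618cb9b378f949bb1b91ddcc4f54
```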

II. Install Sphinx

I may need PostgreSQL support, so I pass --with-pgsql. If you do not need pgsql support, leave the flag out to avoid dependency problems.
1. Generic installation
./configure --with-pgsql
make
make install
cd api/libsphinxclient
# The sed line below applies only to 0.9.9
sed -i -e '280s/^/static /' sphinxclient.c
./configure
make
make install
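The sed fix above edits line 280 blindly, so running it twice, or against a different source version, would corrupt the file. A more defensive variant could look like the sketch below. It assumes GNU sed (for -i) and that line 280 of the 0.9.9 sphinxclient.c is the sock_close declaration named in the bug report; patch_sock_close is a name I invented for illustration.

```shell
# Prefix "static " to the sock_close declaration, but only when the target
# line really mentions sock_close and has not been patched already.
# patch_sock_close is a hypothetical helper; line 280 is the 0.9.9 default.
patch_sock_close() {
  file=$1
  line=${2:-280}
  # Read the current content of the target line
  current=$(sed -n "${line}p" "$file")
  case $current in
    *sock_close*)
      case $current in
        static\ *) echo "already patched" ;;
        *) sed -i -e "${line}s/^/static /" "$file" && echo "patched" ;;
      esac
      ;;
    *)
      echo "line $line does not mention sock_close; not patching" >&2
      return 1
      ;;
  esac
}

# Usage, inside api/libsphinxclient:
#   patch_sock_close sphinxclient.c
```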
2. Installation on ArchLinux
My machine runs ArchLinux, so to keep the system tidy and easy to manage I wrote a PKGBUILD:
# Contributor: Jiang Miao
pkgname=sphinx
pkgver=0.9.9
pkgrel=1
pkgdesc='Sphinx full search engine'
arch=(i686 x86_64)
license=('GPL')
source=(http://sphinxsearch.com/files/$pkgname-$pkgver.tar.gz)
md5sums=('7b9b618cb9b378f949bb1b91ddcc4f54')
# avoid make[1]: *** No rule to make target `.libs/libsphinxclient.a', needed by `test'. Stop.
# see https://bbs.archlinux.org/viewtopic.php?id=77214
options=('!makeflags')

build() {
  cd $startdir/src/$pkgname-$pkgver
  ./configure --prefix=/usr \
              --sysconfdir=/etc/sphinx \
              --localstatedir=/var/lib/sphinx \
              --with-pgsql
  make || return 1
  make DESTDIR=$startdir/pkg install || return 1

  cd api/libsphinxclient
  # fix bug 'error: static declaration of 'sock_close' follows non-static declaration'
  # see http://sphinxsearch.com/bugs/view.php?id=453
  sed -i -e '280s/^/static /' sphinxclient.c
  ./configure --prefix=/usr
  make || return 1
  make DESTDIR=$startdir/pkg install || return 1
}
Build the package and install it:
makepkg -c
sudo pacman -U sphinx-0.9.9-1-i686.pkg.tar.xz 

III. Test Sphinx

1. Import Sphinx's sample data (example.sql) into the test database; it creates the documents table queried below
$ mysql -uroot -p
mysql> CREATE DATABASE IF NOT EXISTS test;
Query OK, 1 row affected (0.00 sec)

mysql> source /etc/sphinx/example.sql
Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected (0.13 sec)

Query OK, 4 rows affected (0.10 sec)
Records: 4  Duplicates: 0  Warnings: 0

Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected (0.14 sec)

Query OK, 10 rows affected (0.13 sec)
Records: 10  Duplicates: 0  Warnings: 0
2. Create sphinx.conf
The relevant columns are:
id int
title VARCHAR(255)
content TEXT

The sphinx.conf file:
source test {
  type = mysql
  sql_host = localhost
  sql_user = test
  sql_pass = test_password
  sql_db   = test

  sql_query = SELECT id, title, content FROM documents
  sql_query_info = SELECT * FROM documents WHERE id = $id
}

index test {
  source = test
  path = ./data/test
}

searchd {
  pid_file    = ./run/searchd.pid
  log         = ./log/searchd.log
  query_log   = ./log/query.log
  max_matches = 1000
}
3. Build the index with indexer
$ mkdir data log run
$ indexer test
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff

using config file './sphinx.conf'...
indexing index 'test'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 193 bytes
total 0.019 sec, 10074 bytes/sec, 208.80 docs/sec
total 1 reads, 0.000 sec, 0.2 kb/call avg, 0.0 msec/call avg
total 5 writes, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
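The mkdir in this step has to match the relative paths in sphinx.conf by hand. As a rough sketch, the directories could instead be derived from the config itself. The helper below is my own invention (make_conf_dirs is a hypothetical name); it assumes GNU sed with -E and the simple "key = value" layout shown in step 2, and it only looks at the path, pid_file, log and query_log settings.

```shell
# Create the parent directory of every path-like setting found in a sphinx.conf.
# make_conf_dirs is a hypothetical helper; run it from the directory that the
# relative paths in the config are meant to be resolved against.
make_conf_dirs() {
  conf=$1
  # Extract the value part of lines such as "  path = ./data/test"
  sed -n -E 's/^[[:space:]]*(path|pid_file|log|query_log)[[:space:]]*=[[:space:]]*(.*[^[:space:]])[[:space:]]*$/\2/p' "$conf" |
  while IFS= read -r p; do
    mkdir -p "$(dirname "$p")"
  done
}

# Usage, in the directory holding sphinx.conf:
#   make_conf_dirs sphinx.conf && indexer test
```

With the config from step 2 this should create data, run and log, the same three directories as the manual mkdir.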
4. Run a test query for "my test"
$ search my test
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff

using config file './sphinx.conf'...
index 'test': query 'my test ': returned 2 matches of 2 total in 0.000 sec

displaying matches:
1. document=1, weight=3
	id=1
	group_id=1
	group_id2=5
	date_added=2011-02-24 19:49:41
	title=test one
	content=this is my test document number one. also checking search within phrases.
2. document=2, weight=3
	id=2
	group_id=1
	group_id2=6
	date_added=2011-02-24 19:49:41
	title=test two
	content=this is my test document number two

words:
1. 'my': 2 documents, 2 hits
2. 'test': 3 documents, 5 hits
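For a scripted smoke test it can be handy to pull the match count out of this output instead of reading it by eye. The sketch below only parses the summary line in the format shown above; match_count is a hypothetical helper name, and it assumes GNU sed with -E.

```shell
# Read `search` output on stdin and print the number of returned matches,
# taken from the "index '...': query '...': returned N matches of M total" line.
# match_count is a hypothetical helper, not part of Sphinx.
match_count() {
  sed -n -E "s/^index '[^']*': query '.*': returned ([0-9]+) matches of [0-9]+ total.*/\1/p"
}

# Usage:
#   n=$(search my test | match_count)
#   [ "$n" -gt 0 ] || echo "no matches; was the index built?"
```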

IV. Calling the Sphinx API from PHP

1. Install the sphinx PECL extension for PHP
pecl install sphinx
# Add the sphinx extension to the PHP configuration
echo 'extension=sphinx.so' > /etc/php/conf.d/sphinx.ini
# Restart php-cgi; how to do this depends on your environment, mine runs under lighttpd
/etc/rc.d/lighttpd restart
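To confirm that PHP actually picked up the extension, the module list printed by php -m can be checked. The sketch below keeps the check separate from the php call so it stays easy to test; module_listed is a hypothetical helper name.

```shell
# Succeed if the module name given as $1 appears as a whole line
# (case-insensitively) in the module list read from stdin.
# module_listed is a hypothetical helper.
module_listed() {
  grep -qixF -- "$1"
}

# Usage:
#   php -m | module_listed sphinx && echo "sphinx extension loaded"
```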
2. Write the test file test.php
<?php
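$cl = new SphinxClient();
// Point the client at the local searchd explicitly (9312 is the port searchd reports at startup)
$cl->setServer('localhost', 9312);
// Search all indexes for "my test"; query() returns false on failure
$result = $cl->query("my test");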
if ($result === false) {
  die($cl->getLastError());
}
print_r($result);
3. Start searchd
$ searchd
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff

using config file './sphinx.conf'...
listening on all interfaces, port=9312
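Before running the PHP script, one way to check that the daemon is still alive is to look at the pid file configured in sphinx.conf. This is only a sketch: searchd_running is a hypothetical helper, the default path is the pid_file value from step 2 of section III, and kill -0 only succeeds for processes the current user is allowed to signal.

```shell
# Succeed if the pid file exists and names a process we can signal.
# searchd_running is a hypothetical helper; the default path matches the
# pid_file setting in the sphinx.conf shown earlier.
searchd_running() {
  pidfile=${1:-./run/searchd.pid}
  [ -r "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  # kill -0 sends no signal; it only checks that the process exists
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Usage:
#   searchd_running && php test.php
```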
4. Run test.php and check the result
$ php test.php
Array
(
    [error] => 
    [warning] => 
    [status] => 0
    [fields] => Array
        (
            [0] => title
            [1] => content
        )

    [attrs] => Array
        (
        )

    [matches] => Array
        (
            [1] => Array
                (
...

V. Related links

Sphinx official website
Sphinx documentation
Sphinx 0.9.9 documentation
Sphinx PHP API documentation