[zz] Lucene goodness

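Lucene 2.3 brings major improvements to memory management and performance: single-threaded indexing speed rises from about 400 records per second to over 2,100. The new version offers a more intuitive way to configure indexing memory, and work is under way to speed up reopening an IndexReader. It also includes a faster StandardTokenizer and faster term vector access.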

Lots of good things are happening in Lucene land lately, all of which should benefit users with faster indexing and searching. Most notably, Lucene 2.3 (hopefully released this quarter) has some major changes in indexing memory management and performance. I have personally clocked indexing go from about 400 rec/s on release 2.2 to over 2,100 rec/s on 2.3-dev (the latest trunk), single-threaded on a dual-CPU, dual-core Mac Pro using the contrib/benchmark indexing.alg. It also gives easier control of the indexing process: you specify how much memory to give it instead of tuning the confusing maxBufferedDocs factor.
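To see why a memory budget is easier to reason about than a document count, here is a toy sketch in plain Java. It is not Lucene code and makes no claim about how Lucene manages its buffers internally; the class name `FlushPolicyDemo`, both method names, and the document sizes are made up for illustration. In the released 2.3 API the memory-based setting is `IndexWriter.setRAMBufferSizeMB`, but the sketch below only models the idea.

```java
import java.util.Collections;
import java.util.List;

// Toy model only: not Lucene code, and not how Lucene manages its buffers internally.
final class FlushPolicyDemo {

    // Peak bytes held in memory when the buffer is flushed every maxBufferedDocs documents.
    static long peakBytesWithDocLimit(List<Integer> docSizesBytes, int maxBufferedDocs) {
        long buffered = 0;
        long peak = 0;
        int docs = 0;
        for (int size : docSizesBytes) {
            buffered += size;
            docs++;
            peak = Math.max(peak, buffered);
            if (docs >= maxBufferedDocs) {   // flush on document count
                buffered = 0;
                docs = 0;
            }
        }
        return peak;
    }

    // Peak bytes held in memory when the buffer is flushed once it reaches ramBudgetBytes.
    static long peakBytesWithRamBudget(List<Integer> docSizesBytes, long ramBudgetBytes) {
        long buffered = 0;
        long peak = 0;
        for (int size : docSizesBytes) {
            buffered += size;
            peak = Math.max(peak, buffered);
            if (buffered >= ramBudgetBytes) { // flush on memory used
                buffered = 0;
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        // Two hypothetical corpora: ten 100-byte documents and ten 10,000-byte documents.
        List<Integer> smallDocs = Collections.nCopies(10, 100);
        List<Integer> largeDocs = Collections.nCopies(10, 10_000);

        System.out.println(peakBytesWithDocLimit(smallDocs, 5));       // 500
        System.out.println(peakBytesWithDocLimit(largeDocs, 5));       // 50000
        System.out.println(peakBytesWithRamBudget(smallDocs, 20_000)); // 1000
        System.out.println(peakBytesWithRamBudget(largeDocs, 20_000)); // 20000
    }
}
```

With the same document limit, peak memory differs by a factor of 100 between the two corpora, so a good limit depends on knowing your document sizes in advance. With a byte budget, the peak stays at or below the budget in both cases here (in this toy it can overshoot by at most one document).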

Other work under way should speed up reopening IndexReaders. There are also a number of smaller changes, including a faster StandardTokenizer (the tokenizer most people use) and faster term vector access.

Of course, with that comes more testing and a greater need to make sure the next release is rock solid and backwards compatible. So, if you are a Lucene user, I would encourage you to give trunk a try on some of your non-production indexes and help us test it out.

 

Link from http://lucene.grantingersoll.com/2007/11/02/lucene-goodness/
