Lucene Features
Lucene offers powerful features through a simple API:
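Scalable, high-performance indexing
1. Over 150 GB/hour indexing throughput on modern hardware
2. Small RAM requirements -- only 1 MB of heap
3. Incremental indexing as fast as batch indexing
4. Index size roughly 20% to 30% of the size of the text indexed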
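Powerful, accurate and efficient search algorithms
1. Ranked searching -- best results returned first
2. Many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
3. Fielded searching (e.g. title, author, contents)
4. Sorting by any field
5. Multiple-index searching with merged results
6. Simultaneous update and searching
7. Flexible faceting, highlighting, joins and result grouping
8. Fast, memory-efficient and typo-tolerant suggesters
9. Pluggable ranking models, including the Vector Space Model and Okapi BM25 (a TF/IDF-family model)
10. Configurable storage engine (codecs)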
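To make the fielded and phrase query types above more concrete, here is a minimal sketch of a positional inverted index in plain Java. It is a teaching toy, not how Lucene stores or searches data: the class name `TinyIndex`, its methods, and the whitespace tokenizer are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy positional inverted index (hypothetical; not Lucene's implementation).
class TinyIndex {
    // Key is "field:term"; value maps docId -> positions of the term in that field.
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Tokenize on whitespace, lowercase, and record each token's position.
    void add(int docId, String field, String text) {
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(field + ":" + tokens[pos], k -> new HashMap<>())
                    .computeIfAbsent(docId, k -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Postings for one term in one field, or an empty map if the term is unknown.
    private Map<Integer, List<Integer>> lookup(String field, String term) {
        Map<Integer, List<Integer>> hit = postings.get(field + ":" + term.toLowerCase());
        if (hit == null) {
            return Collections.emptyMap();
        }
        return hit;
    }

    // Fielded term query: ids of documents whose field contains the term.
    TreeSet<Integer> term(String field, String term) {
        return new TreeSet<>(lookup(field, term).keySet());
    }

    // Phrase query: ids of documents where the words appear consecutively in the field.
    TreeSet<Integer> phrase(String field, String... words) {
        TreeSet<Integer> result = new TreeSet<>();
        if (words.length == 0) {
            return result;
        }
        for (Map.Entry<Integer, List<Integer>> e : lookup(field, words[0]).entrySet()) {
            int docId = e.getKey();
            for (int start : e.getValue()) {
                boolean match = true;
                // Each following word must sit exactly i positions after the first word.
                for (int i = 1; i < words.length && match; i++) {
                    List<Integer> positions = lookup(field, words[i]).get(docId);
                    match = positions != null && positions.contains(start + i);
                }
                if (match) {
                    result.add(docId);
                    break;
                }
            }
        }
        return result;
    }
}
```

Storing positions per field is what lets the same structure answer both "which documents mention this term in `title`" and "which documents contain these words next to each other". A real engine adds analysis, compression, and scoring on top of this idea.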
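Cross-platform solution
1. Available as open source software under the Apache License, which lets you use Lucene in both commercial and open source programs
2. 100% pure Java
3. Implementations in other programming languages are available that are index-compatible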
Lucene™ Features
Lucene offers powerful features through a simple API:
Scalable, High-Performance Indexing
- over 150GB/hour on modern hardware
- small RAM requirements -- only 1MB heap
- incremental indexing as fast as batch indexing
- index size roughly 20-30% the size of text indexed
Powerful, Accurate and Efficient Search Algorithms
- ranked searching -- best results returned first
- many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
- fielded searching (e.g. title, author, contents)
- sorting by any field
- multiple-index searching with merged results
- allows simultaneous update and searching
- flexible faceting, highlighting, joins and result grouping
- fast, memory-efficient and typo-tolerant suggesters
- pluggable ranking models, including the Vector Space Model and Okapi BM25
- configurable storage engine (codecs)
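As a rough illustration of what a ranking model such as Okapi BM25 computes, the sketch below scores a tokenized document against a query using the textbook BM25 formula. This is an assumption-laden sketch, not Lucene's scoring code: the class `Bm25Sketch`, its methods, and the corpus-as-token-lists representation are hypothetical, and `K1 = 1.2` and `B = 0.75` are simply commonly used parameter values.

```java
import java.util.List;

// Textbook Okapi BM25 over an in-memory corpus (hypothetical sketch, not Lucene's code).
class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation (assumed value)
    static final double B = 0.75;  // strength of document-length normalization (assumed value)

    // Inverse document frequency: terms found in fewer documents weigh more.
    static double idf(int docCount, int docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Score one document (a list of tokens) against the query terms.
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double avgLen = corpus.stream().mapToInt(List::size).average().orElse(1.0);
        double total = 0.0;
        for (String term : query) {
            long tf = doc.stream().filter(term::equals).count();
            if (tf == 0) {
                continue; // a term absent from the document contributes nothing
            }
            int df = (int) corpus.stream().filter(d -> d.contains(term)).count();
            // Longer-than-average documents are penalized through this factor.
            double norm = K1 * (1.0 - B + B * doc.size() / avgLen);
            total += idf(corpus.size(), df) * (tf * (K1 + 1.0)) / (tf + norm);
        }
        return total;
    }
}
```

Sorting documents by this score in descending order gives "best results returned first". Making the ranking model pluggable means a different scoring function could be swapped in without changing how the index is built.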
Cross-Platform Solution
- Available as Open Source software under the Apache License which lets you use Lucene in both commercial and Open Source programs
- 100%-pure Java
- Implementations in other programming languages available that are index-compatible