Lucene Features

thisisnew

Article link: https://blog.youkuaiyun.com/weu135/article/details/105899569

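Lucene is a powerful full-text search engine library. It offers high-performance indexing, with throughput of about 150GB/hour and a heap requirement of only 1MB. Its search algorithms are accurate and efficient, supporting many query types and fielded search, along with sorting, faceted search, and result grouping. Lucene is also cross-platform: it is 100% pure Java, and index-compatible implementations exist in other programming languages.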

Lucene Features

Lucene offers powerful features through a simple API:

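Scalable, High-Performance Indexing

1. Indexes about 150GB per hour on a mechanical hard disk

2. Small RAM requirements: only 1MB of heap

3. Incremental indexing is as fast as batch indexing

4. Index size is roughly 20% to 30% of the size of the text indexed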
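
To make the idea of incremental indexing more concrete, here is a minimal sketch of an inverted index in plain Java. It is not Lucene code and says nothing about Lucene's actual on-disk format; the class name `TinyInvertedIndex` and its methods are hypothetical, invented for illustration. It only shows the general shape: each document is tokenized, every field-qualified term points to the ids of the documents that contain it, and a newly added document is searchable as soon as it has been added.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy inverted index (illustrative only): maps "field:term" to the ids of documents containing it. */
class TinyInvertedIndex {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    /** Adds one document (field name -> text) incrementally and returns its id. */
    int addDocument(Map<String, String> fields) {
        int docId = nextDocId++;
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // Naive analysis: lowercase, then split on runs of non-word characters.
            for (String token : field.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (token.isEmpty()) {
                    continue;
                }
                postings.computeIfAbsent(field.getKey() + ":" + token, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    /** Returns the ids of documents whose given field contains the term. */
    SortedSet<Integer> search(String field, String term) {
        String key = field + ":" + term.toLowerCase(Locale.ROOT);
        return postings.getOrDefault(key, Collections.emptySortedSet());
    }
}
```

Calling `addDocument` once per document is the whole indexing loop in this sketch; there is no separate batch path, which is the sense in which incremental and batch indexing look the same here.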

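Powerful, Accurate and Efficient Search Algorithms

1. Ranked searching: the best results are returned first

2. Many powerful query types: phrase queries, wildcard queries, proximity queries, range queries, and more

3. Fielded searching (e.g. title, author, contents)

4. Sorting by any field

5. Multiple-index searching with merged results

6. Simultaneous updating and searching

7. Flexible faceting, highlighting, joins, and result grouping

8. Fast, memory-efficient, typo-tolerant suggesters

9. Pluggable ranking models, including the Vector Space Model and Okapi BM25

10. Configurable storage engine (codecs)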
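
Since Okapi BM25 is named among the ranking models, a small worked sketch may help show what such a model computes. The code below is the textbook BM25 formula with a non-negative idf variant, written in plain Java. It is an illustration under stated assumptions, not Lucene's implementation: the class `Bm25Sketch`, its method names, and the representation of documents as token lists are all hypothetical, and `K1 = 1.2` and `B = 0.75` are simply commonly used defaults.

```java
import java.util.Collections;
import java.util.List;

/** Textbook BM25 scoring over tokenized documents (illustrative sketch, not Lucene code). */
class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation (assumed common default)
    static final double B = 0.75;  // strength of length normalization (assumed common default)

    /** Inverse document frequency; the "1 +" inside the log keeps the value non-negative. */
    static double idf(int docCount, int docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Scores one document against the query terms, using corpus-wide statistics. */
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double avgLen = corpus.stream().mapToInt(List::size).average().orElse(0.0);
        double total = 0.0;
        for (String term : query) {
            int tf = Collections.frequency(doc, term);
            if (tf == 0) {
                continue; // a term absent from the document contributes nothing
            }
            int df = (int) corpus.stream().filter(d -> d.contains(term)).count();
            // Longer-than-average documents get a larger normalizer and thus a lower score.
            double norm = K1 * (1.0 - B + B * doc.size() / avgLen);
            total += idf(corpus.size(), df) * tf * (K1 + 1.0) / (tf + norm);
        }
        return total;
    }
}
```

With equal term frequency, a short document scores higher than a long one, and a document that does not contain any query term scores zero. Sorting documents by this score in descending order gives the "best results first" behaviour of ranked searching.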

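Cross-Platform Solution

1. Available as open source software under the Apache License, which lets you use Lucene in both commercial and open source programs

2. 100% pure Java

3. Implementations in other programming languages are available that are index-compatible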

Lucene™ Features

Lucene offers powerful features through a simple API:

Scalable, High-Performance Indexing

Powerful, Accurate and Efficient Search Algorithms

ranked searching -- best results returned first
many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
fielded searching (e.g. title, author, contents)
sorting by any field
multiple-index searching with merged results
allows simultaneous update and searching
flexible faceting, highlighting, joins and result grouping
fast, memory-efficient and typo-tolerant suggesters
pluggable ranking models, including the Vector Space Model and Okapi BM25
configurable storage engine (codecs)
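
As an illustration of one of the query types listed above, the following plain-Java sketch expands a wildcard pattern against a list of indexed terms. It assumes the usual wildcard convention where `*` matches any run of characters and `?` matches exactly one character. The class `WildcardSketch` and its methods are hypothetical names for this example only; a real engine would walk its term dictionary far more efficiently than this linear scan.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Expands a wildcard pattern over a set of terms (illustrative sketch, not Lucene code). */
class WildcardSketch {
    /** Translates '*' and '?' into regex equivalents and quotes every other character. */
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns, in sorted order, every term that matches the whole wildcard pattern. */
    static List<String> expand(String wildcard, Collection<String> terms) {
        Pattern pattern = toPattern(wildcard);
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                matches.add(term);
            }
        }
        Collections.sort(matches);
        return matches;
    }
}
```

The matched terms can then be looked up like ordinary terms, so a wildcard query behaves roughly like an OR over everything the pattern expands to.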

Cross-Platform Solution

Available as Open Source software under the Apache License which lets you use Lucene in both commercial and Open Source programs
100%-pure Java
Implementations in other programming languages available that are index-compatible