Lucene4.0 FilterIndexReader

Latest recommended article published on 2025-11-26 23:05:48

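This article describes a FilterIndexReader subclass, an idea that came out of the work on LUCENE-2919, for improving full-text search performance: the filter is applied when the index reader is opened, before any query runs, which improves efficiency and simplifies searching.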

While coding ~~LUCENE-2919~~ (PKIndexSplitter), Mike and I had an idea for how to apply filters effectively at the lowest level, before query execution. This is very useful for, e.g., security filters that simply hide some documents. Currently, when a filter is applied after searching, a lot of useless work is done, such as scoring documents that are later filtered out and iterating term positions (for phrase queries).

This patch provides a FilterIndexReader subclass (4.0 only; it is too complicated to implement in 3.x) that hides filtered documents by reporting them as deleted in getDeletedDocs(). In contrast to ~~LUCENE-2919~~, the filtering works per segment (without SlowMultiReaderWrapper), so per-segment search remains available and reopening can be done very efficiently, because the filter is only calculated when opening new or changed segments.
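The idea of hiding filtered documents as if they were deleted can be sketched without any Lucene dependency. The block below is only an illustration under stated assumptions: `FilteredSegmentSketch`, `Segment`, `applyFilter`, and the even-ID "security filter" are hypothetical names invented here, and a `java.util.BitSet` stands in for whatever getDeletedDocs() returns in the real patch.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

// Hypothetical model, not a Lucene class: shows how documents rejected by a
// filter can be merged into a segment's "deleted" bits once, at open time.
class FilteredSegmentSketch {

    // Stand-in for a per-segment reader: a doc count plus deleted-doc bits.
    static class Segment {
        final int maxDoc;
        final BitSet deletedDocs;

        Segment(int maxDoc, BitSet deletedDocs) {
            this.maxDoc = maxDoc;
            this.deletedDocs = deletedDocs;
        }

        // A document is visible if it is in range and not marked deleted.
        boolean isVisible(int docId) {
            return docId >= 0 && docId < maxDoc && !deletedDocs.get(docId);
        }
    }

    // Wraps a segment so that every document the filter rejects looks deleted.
    // The filter is evaluated exactly once per document, here, and never again
    // during query execution. The original segment is left unchanged.
    static Segment applyFilter(Segment segment, IntPredicate accept) {
        BitSet hidden = (BitSet) segment.deletedDocs.clone();
        for (int doc = 0; doc < segment.maxDoc; doc++) {
            if (!accept.test(doc)) {
                hidden.set(doc);
            }
        }
        return new Segment(segment.maxDoc, hidden);
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1); // document 1 was really deleted
        Segment raw = new Segment(5, deleted);

        // Hypothetical security filter: only even document IDs may be seen.
        Segment filtered = applyFilter(raw, id -> id % 2 == 0);

        for (int doc = 0; doc < filtered.maxDoc; doc++) {
            System.out.println("doc " + doc + " visible: " + filtered.isVisible(doc));
        }
    }
}
```

Because a query running against the wrapped segment never sees the hidden documents, it does no scoring or position iteration for them, which is the saving the patch is after.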

This filter should improve use cases where the filter can be applied once, before all queries (as with security filters), when (re-)opening the IndexReader.
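Why reopening stays cheap can be illustrated with a small per-segment cache. This is a sketch under assumptions, not the patch's code: `ReopenCacheSketch`, `hiddenDocs`, and the segment names `_0`, `_1`, `_2` are hypothetical, and a changed segment is assumed to show up under a new key so that it is recomputed.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Hypothetical model, not a Lucene class: caches the filter result per
// segment so that a reopen only pays for segments it has not seen before.
class ReopenCacheSketch {

    // Hidden-document bits already computed, keyed by segment name.
    private final Map<String, BitSet> hiddenBySegment = new HashMap<>();

    // Counts how many segments the filter actually had to be evaluated for.
    int filterEvaluations = 0;

    // Returns the hidden bits for a segment, computing them only on first use.
    BitSet hiddenDocs(String segmentName, int maxDoc, IntPredicate accept) {
        return hiddenBySegment.computeIfAbsent(segmentName, name -> {
            filterEvaluations++;
            BitSet hidden = new BitSet(maxDoc);
            for (int doc = 0; doc < maxDoc; doc++) {
                if (!accept.test(doc)) {
                    hidden.set(doc);
                }
            }
            return hidden;
        });
    }

    public static void main(String[] args) {
        ReopenCacheSketch cache = new ReopenCacheSketch();
        IntPredicate accept = id -> id % 2 == 0; // hypothetical filter

        // Initial open: two segments, so the filter runs twice.
        cache.hiddenDocs("_0", 10, accept);
        cache.hiddenDocs("_1", 10, accept);

        // Reopen: the old segments are reused, only the new "_2" is filtered.
        cache.hiddenDocs("_0", 10, accept);
        cache.hiddenDocs("_1", 10, accept);
        cache.hiddenDocs("_2", 10, accept);

        System.out.println("filter evaluations: " + cache.filterEvaluations);
    }
}
```

The count stays proportional to the number of new or changed segments rather than to the number of reopens or queries.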