Get in ES

Original post, last modified 2022-07-09 22:53:45

License: CC 4.0 BY-SA

Tags: #java

First published 2022-07-05 22:35:38

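This article looks at how Elasticsearch uses Lucene internally to look up a document by its ID. It covers the ShardGetService#innerGet and PerThreadIDVersionAndSeqNoLookup#getDocID methods and explains how the internal Lucene docID is obtained for the given ID bytes, including how liveDocs and multiple matching documents are handled.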

org.elasticsearch.index.get.ShardGetService#innerGet

org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup#getDocID

    /**
     * returns the internal lucene doc id for the given id bytes.
     * {@link DocIdSetIterator#NO_MORE_DOCS} is returned if not found
     * */
    private int getDocID(BytesRef id, LeafReaderContext context) throws IOException {
        // termsEnum can possibly be null here if this leaf contains only no-ops.
        if (termsEnum != null && termsEnum.seekExact(id)) {
            final Bits liveDocs = context.reader().getLiveDocs();
            int docID = DocIdSetIterator.NO_MORE_DOCS;
            // there may be more than one matching docID, in the case of nested docs, so we want the last one:
            docsEnum = termsEnum.postings(docsEnum, 0);
            for (int d = docsEnum.nextDoc(); d != DocIdSetIterator.NO_MORE_DOCS; d = docsEnum.nextDoc()) {
                if (liveDocs != null && liveDocs.get(d) == false) {
                    continue;
                }
                docID = d;
            }
            return docID;
        } else {
            return DocIdSetIterator.NO_MORE_DOCS;
        }
    }
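
The selection rule in the loop above (skip deleted documents, keep the last live match) can be tried out without Lucene. The following is a minimal sketch under stated assumptions, not Elasticsearch code: `LastLiveDocSketch` and `lastLiveDoc` are hypothetical names, an `int[]` in ascending order stands in for the postings enumeration, and a `java.util.BitSet` stands in for `Bits liveDocs`. The only value taken from Lucene is `NO_MORE_DOCS`, which equals `Integer.MAX_VALUE`.

```java
import java.util.BitSet;

// Hypothetical stand-alone model of the loop in getDocID; no Lucene classes are used.
final class LastLiveDocSketch {
    // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /**
     * @param postings doc IDs matching the _id term in one segment, in ascending order
     *                 (assumption: stands in for the postings enumeration)
     * @param liveDocs set bit = live document; null means the segment has no deletions
     * @return the last live matching doc ID, or NO_MORE_DOCS if none is live
     */
    static int lastLiveDoc(int[] postings, BitSet liveDocs) {
        int docID = NO_MORE_DOCS;
        for (int d : postings) {
            if (liveDocs != null && !liveDocs.get(d)) {
                continue; // deleted document, ignore it
            }
            docID = d; // keep overwriting so the last live match wins
        }
        return docID;
    }

    public static void main(String[] args) {
        BitSet live = new BitSet();
        live.set(3);
        live.set(4);
        // Doc 7 is deleted, so the last live match is 4.
        System.out.println(lastLiveDoc(new int[] {3, 4, 7}, live));
        // No deletions in the segment: the last posting wins.
        System.out.println(lastLiveDoc(new int[] {3, 4, 7}, null));
        // Term not present: the sentinel is returned.
        System.out.println(lastLiveDoc(new int[] {}, null) == NO_MORE_DOCS);
    }
}
```

Because the result is overwritten on every live match, the loop always walks the full postings list. That matches the source comment about nested documents, where several docIDs can share one ID and the last one is the one wanted.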

org.elasticsearch.index.mapper.IdFieldMapper.IdFieldType