lucene .doc里存储的skiplist跳表

本文介绍了 Lucene 6.5.1 中 Skip List 的实现及测试方法。通过对比两个 PostingsEnum 对象的文档ID和频率来验证 Skip List 的正确性,并展示了具体的测试代码。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

http://forfuture1978.iteye.com/blog/546841

见图:

lucene-6.5.1-src/lucene-6.5.1
$ grep "skiplistwriter" * -ril
core/src/java/org/apache/lucene/codecs/lucene50/Lucene50PostingsFormat.java
core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java
core/src/java/org/apache/lucene/codecs/MultiLevelSkipListReader.java
core/src/java/org/apache/lucene/codecs/MultiLevelSkipListWriter.java

 

测试代码位置:

lucene-6.5.1-src/lucene-6.5.1
$ vim core/src/test/org/apache/lucene/codecs/lucene50/TestBlockPostingsFormat3.java

复制代码
  /**
   * checks advancing docs
   */
  public void assertDocsSkipping(int docFreq, PostingsEnum leftDocs, PostingsEnum rightDocs) throws Exception {
    if (leftDocs == null) {
      assertNull(rightDocs);
      return;
    }
    int docid = -1;
    int averageGap = MAXDOC / (1+docFreq);
    int skipInterval = 16;

    while (true) {
      if (random().nextBoolean()) {
        // nextDoc()
        docid = leftDocs.nextDoc();
        assertEquals(docid, rightDocs.nextDoc());
      } else {
        // advance()
        int skip = docid + (int) Math.ceil(Math.abs(skipInterval + random().nextGaussian() * averageGap));
        docid = leftDocs.advance(skip);
        assertEquals(docid, rightDocs.advance(skip));
      }

      if (docid == DocIdSetIterator.NO_MORE_DOCS) {
        return;
      }
      // we don't assert freqs, they are allowed to be different
    }
  }
复制代码

 

复制代码
  /**
   * checks advancing docs + positions
   */
  public void assertPositionsSkipping(int docFreq, PostingsEnum leftDocs, PostingsEnum rightDocs) throws Exception {
    if (leftDocs == null || rightDocs == null) {
      assertNull(leftDocs);
      assertNull(rightDocs);
      return;
    }

    int docid = -1;
    int averageGap = MAXDOC / (1+docFreq);
    int skipInterval = 16;

    while (true) {
      if (random().nextBoolean()) {
        // nextDoc()
        docid = leftDocs.nextDoc();
        assertEquals(docid, rightDocs.nextDoc());
      } else {
        // advance()
        int skip = docid + (int) Math.ceil(Math.abs(skipInterval + random().nextGaussian() * averageGap));
        docid = leftDocs.advance(skip);
        assertEquals(docid, rightDocs.advance(skip));
      }

      if (docid == DocIdSetIterator.NO_MORE_DOCS) {
        return;
      }
      int freq = leftDocs.freq();
      assertEquals(freq, rightDocs.freq());
      for (int i = 0; i < freq; i++) {
        assertEquals(leftDocs.nextPosition(), rightDocs.nextPosition());
        // we don't compare the payloads, it's allowed that one is empty etc
      }
    }
  }
复制代码

 
















本文转自张昺华-sky博客园博客,原文链接:http://www.cnblogs.com/bonelee/p/6808500.html,如需转载请自行联系原作者


评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值