以太坊源码分析(30)eth-bloombits和filter源码分析

以太坊源码解析：eth-bloombits与filter
最新推荐文章于 2025-06-13 00:00:00 发布
原创
最新推荐文章于 2025-06-13 00:00:00 发布 · 3k 阅读
1 ·
CC 4.0 BY-SA版权
本文深入分析了以太坊源码中的eth-bloombits和filter相关部分，旨在理解其工作原理。通过阅读文章，可以获取到以太坊区块链中过滤器和位图操作的关键信息。
 
  ## 以太坊的布隆过滤器 
 
   以太坊的区块头中包含了一个叫做logsBloom的区域。 这个区域存储了当前区块中所有的收据的日志的布隆过滤器，一共是2048个bit。也就是256个字节。 
 
   而我们的一个交易的收据包含了很多的日志记录。 每个日志记录包含了 合约的地址， 多个Topic。 而在我们的收据中也存在一个布隆过滤器，这个布隆过滤器记录了所有的日志记录的信息。 
 
   如果我们看黄皮书里面对日志记录的形式化定义。 
 
   O代表我们的日志记录，Oa代表logger的地址，Oto,Ot1代表日志的Topics， Od代表时间。 
 
   Oa是20个字节，Ot是32个字节，Od是很多字节 
 
   我们定义了一个布隆过滤器函数M，用来把一个日志对象转换成256字节的hash 
 
   M3:2045是一个特别的函数，用来设置2048个bit位中的三位为1。 
 
   对于任意的输入值，首先求他的KEC输出， 然后通过取KEC输出的 [0,1] [2,3],[4,5] 这几位的值 对2048取模， 得到三个值， 这三个值就是输出的2048中需要置位的下标。 也就是说对于任何一个输入，如果它对应的三个下标的值不都为1，那么它肯定不在这个区块中。 当如如果对应的三位都为1，也不能说明一定在这个区块中。 这就是布隆过滤器的特性。 
 
   收据中的布隆过滤器就是所有的日志的布隆过滤器输出的并集。 
 
   同时区块头中的logBloom，就是所有的收据的布隆过滤器的并集。 
 
  ## ChainIndexer 和 BloomIndexer 
 
   最开始看到ChainIndexer，不是很明白是什么功能。 其实从名字中可以看到，是Chain的索引。 在 eth中我们有看到BloomIndexer,这个就是布隆过滤器的索引。 
 
   在我们的协议中提供了查找指定Log的功能。 
 
   用户可以通过传递下面的参数来查找指定的Log,开始的区块号，结束的区块号， 根据合约 Addresses指定的地址过滤，根据指定的Topics来过滤。 
 
       // FilterCriteria represents a request to create a new filter. 
 
       type FilterCriteria struct { 
   
           FromBlock *big.Int 
 
           ToBlock *big.Int 
 
           Addresses []common.Address 
 
           Topics [][]common.Hash 
 
       } 
 
   如果开始和结束之间间隔很大，那么如果直接依次检索每个区块头的logBloom区域是比较低效的。 因为每个区块头都是分开存储的， 可能需要非常多的磁盘随机访问。 
 
   所以以太坊协议在本地维护了一套索引，用来加速这个过程。 
 
   大致原理是。 每4096个区块称为一个Section，一个Section里面的logBloom会存储在一起。对于每个Section, 用一个二维数据，A[ 
  2048][4096]来存储。 第一维2048代表了bloom过滤器的长度2048个字节。 第二维4096代表了一个Section里面的所有区块，每一个位置按照顺序代表了其中的一个区块。 
 
  - A[ 
  0][0]=blockchain[section*4096+0].logBloom[0], 
 
  - A[ 
  0][1]=blockchain[section*4096+1].logBloom[0], 
 
  - A[ 
  0][4096]=blockchain[section*4096+1].logBloom[0], 
 
  - A[ 
  1][0]=blockchain[section*4096+0].logBloom[1], 
 
  - A[ 
  1][1024]=blockchain[section*4096+1024].logBloom[1], 
 
  - A[ 
  2047][1]=blockchain[section*4096+1].logBloom[2047], 
 
   如果Section填充完毕，那么会写成2048个KV。 
 
   ![ 
  image](picture/bloom_6.png) 
 
  ## bloombit.go 代码分析 
 
   这个代码相对不是很独立，如果单独看这个代码，有点摸不着头脑的感觉， 因为它只是实现了一些接口，具体的处理逻辑并不在这里，而是在core里面。 不过这里我先结合之前讲到的信息分析一下。 后续更详细的逻辑在分析core的代码的时候再详细分析。 
 
   服务线程startBloomHandlers,这个方法是为了响应具体的查询请求， 给定指定的Section和bit来从levelDB里面查询然后返回出去。 单独看这里有点摸不着头脑。 这个方法的调用比较复杂。 涉及到core里面的很多逻辑。 这里先不细说了。 直到有这个方法就行了。 
 
       type Retrieval struct { 
   
           Bit uint           //Bit的取值 0-2047 代表了想要获取哪一位的值 
 
           Sections []uint64       // 那些Section 
 
           Bitsets [][]byte       // 返回值 查询出来的结果。 
 
       } 
 
       // startBloomHandlers starts a batch of goroutines to accept bloom bit database 
 
       // retrievals from possibly a range of filters and serving the data to satisfy. 
 
       func (eth *Ethereum) startBloomHandlers() { 
   
           for i := 0; i < bloomServiceThreads; i++ { 
   
               go func() { 
   
                   for { 
   
                       select { 
   
                       case <-eth.shutdownChan: 
 
                           return 
 
                       case request := <-eth.bloomRequests: // request是一个通道 
 
                           task := <-request //从通道里面获取一个task 
 
                           task.Bitsets = make([][]byte, len(task.Sections)) 
 
                           for i, section := range task.Sections { 
   
                               head := core.GetCanonicalHash(eth.chainDb, (section+1)*params.BloomBitsBlocks-1) 
 
                               blob, err := bitutil.DecompressBytes(core.GetBloomBits(eth.chainDb, task.Bit, section, head), int(params.BloomBitsBlocks)/8) 
 
                               if err != nil { 
   
                                   panic(err) 
 
                               } 
 
                               task.Bitsets[i] = blob 
 
                           } 
 
                           request <- task //通过request通道返回结果 
 
                       } 
 
                   } 
 
               }() 
 
           } 
 
       } 
 
  ### 数据结构 
 
   BloomIndexer对象主要用户构建索引的过程，是core.ChainIndexer的一个接口实现，所以只实现了一些必须的接口。对于创建索引的逻辑还在core.ChainIndexer里面。 
 
       // BloomIndexer implements a core.ChainIndexer, building up a rotated bloom bits index 
 
       // for the Ethereum header bloom filters, permitting blazing fast filtering. 
 
       type BloomIndexer struct { 
   
           size uint64 // section size to generate bloombits for 
 
           db ethdb.Database // database instance to write index data and metadata into 
 
           gen *bloombits.Generator // generator to rotate the bloom bits crating the bloom index 
 
           section uint64 // Section is the section number being processed currently 当前的section 
 
           head common.Hash // Head is the hash of the last header processed 
 
       } 
 
       // NewBloomIndexer returns a chain indexer that generates bloom bits data for the 
 
       // canonical chain for fast logs filtering. 
 
       func NewBloomIndexer(db ethdb.Database, size uint64) *core.ChainIndexer { 
   
           backend := &BloomIndexer{ 
   
               db: db, 
 
               size: size, 
 
           } 
 
           table := ethdb.NewTable(db, string(core.BloomBitsIndexPrefix)) 
 
           return core.NewChainIndexer(db, table, backend, size, bloomConfirms, bloomThrottling, "bloombits") 
 
       } 
 
   Reset实现了ChainIndexerBackend的方法，启动一个新的section 
 
       // Reset implements core.ChainIndexerBackend, starting a new bloombits index 
 
       // section. 
 
       func (b *BloomIndexer) Reset(section uint64) { 
   
           gen, err := bloombits.NewGenerator(uint(b.size)) 
 
           if err != nil { 
   
               panic(err) 
 
           } 
 
           b.gen, b.section, b.head = gen, section, common.Hash{} 
 
       } 
 
   Process实现了ChainIndexerBackend， 增加一个新的区块头到index 
 
       // Process implements core.ChainIndexerBackend, adding a new header's bloom into 
 
     &