Global ordinals

Latest recommended article published on 2024-05-29 12:45:13

silent1


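Global ordinals are a data structure built on top of fielddata and doc values that assigns an incremental number to each unique term. The structure applies only to text and keyword fields, and it speeds up sorting and aggregations.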


Global ordinals are a data structure built on top of fielddata and doc values. The structure maintains an incremental numbering for each unique term, assigned in lexicographic order.

Each term has a unique number, and if term A sorts before term B, the number of A is lower than the number of B.

Global ordinals are only supported on text and keyword fields.

Fielddata and doc values also have ordinals: a unique numbering of all the terms in a particular segment and field.

Global ordinals build on top of these segment ordinals by providing a mapping from each segment's ordinals to the global ordinals, which are unique across the entire shard. (Translator's note: so they are not unique across shards?)
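The relationship between segment ordinals and global ordinals can be illustrated with a small Python sketch. This is a toy model, not how Lucene or Elasticsearch implement it: the function names are hypothetical, each segment is just a list of terms, and Python's default string ordering stands in for the lexicographic order used by the index.

```python
from bisect import bisect_left


def build_segment_ordinals(terms):
    """Sorted unique terms of one segment; a term's index is its segment ordinal."""
    return sorted(set(terms))


def build_global_ordinals(segments):
    """Return the shard-wide sorted term list plus one lookup table per segment.

    mappings[s][segment_ordinal] gives the global ordinal of that term.
    """
    global_terms = sorted(set(t for seg in segments for t in seg))
    mappings = [[bisect_left(global_terms, t) for t in seg] for seg in segments]
    return global_terms, mappings


segment_0 = build_segment_ordinals(["apple", "cherry", "apple"])
segment_1 = build_segment_ordinals(["banana", "cherry"])
global_terms, mappings = build_global_ordinals([segment_0, segment_1])

print(global_terms)  # ['apple', 'banana', 'cherry']
print(mappings)      # [[0, 2], [1, 2]]
```

In this toy shard, "cherry" has segment ordinal 1 in both segments, but a single global ordinal, 2, across the shard.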

Global ordinals are used by features that rely on segment ordinals, such as sorting and the terms aggregation, to improve execution time.

A terms aggregation relies purely on global ordinals to perform the aggregation at the shard level. It converts global ordinals to the actual terms only for the final reduce phase, which combines results from different shards.
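A rough Python sketch of that two-phase flow follows. It is an illustration under simplifying assumptions, not the actual Elasticsearch implementation: the function names and data layout are hypothetical, and every document is assumed to hold exactly one term in the field.

```python
from collections import Counter


def shard_terms_agg(global_terms, segments):
    """Count documents per global ordinal inside one shard.

    segments is a list of (segment_to_global, doc_segment_ordinals) pairs.
    """
    counts = [0] * len(global_terms)
    for segment_to_global, doc_ordinals in segments:
        for segment_ordinal in doc_ordinals:
            # Integer bucket lookup; no string comparison per document.
            counts[segment_to_global[segment_ordinal]] += 1
    # Ordinals are resolved to real terms only once, for the reduce phase.
    return {global_terms[g]: c for g, c in enumerate(counts) if c}


def reduce_shards(shard_results):
    """Merge per-shard results by term, since ordinals differ between shards."""
    total = Counter()
    for result in shard_results:
        total.update(result)
    return dict(total)


shard_a = shard_terms_agg(
    ["apple", "banana", "cherry"],
    [([0, 2], [0, 1, 0]), ([1, 2], [0, 1])],
)
shard_b = shard_terms_agg(["banana", "date"], [([0, 1], [1, 1, 0])])

print(reduce_shards([shard_a, shard_b]))
# {'apple': 2, 'banana': 2, 'cherry': 2, 'date': 2}
```

Global ordinal 0 means "apple" in the first shard and "banana" in the second, which is why the merge across shards has to work on the terms themselves.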

Global ordinals for a given field are tied to all the segments of a shard, whereas fielddata and doc values ordinals for that field are tied to a single segment.

For this reason, global ordinals need to be entirely rebuilt whenever a new segment becomes visible.
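To see why a rebuild is needed, consider this small Python sketch. It is a toy model with a hypothetical helper and made-up terms: a new segment introduces one new term, and every term that sorts after it gets a different global ordinal.

```python
def global_ordinals(segments):
    """Map each term in the shard to its global ordinal (toy model)."""
    terms = sorted(set(t for seg in segments for t in seg))
    return {term: ordinal for ordinal, term in enumerate(terms)}


before = global_ordinals([["apple", "cherry"], ["banana", "cherry"]])
# A refresh makes a third segment visible, containing one new term.
after = global_ordinals([["apple", "cherry"], ["banana", "cherry"], ["avocado"]])

shifted = sorted(t for t in before if before[t] != after[t])
print(before)   # {'apple': 0, 'banana': 1, 'cherry': 2}
print(after)    # {'apple': 0, 'avocado': 1, 'banana': 2, 'cherry': 3}
print(shifted)  # ['banana', 'cherry']
```

The old per-segment lookup tables would now point at the wrong terms, so they cannot simply be extended.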

The loading time of global ordinals depends on the number of terms in a field, but in general it is low, since the source field data has already been loaded.

The memory overhead of global ordinals is small because they are compressed very efficiently. Eager loading of global ordinals can move the loading time from the first search request to the refresh itself.
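As a sketch of how eager loading might be switched on, the snippet below builds a mapping body in Python and prints it as JSON. The `eager_global_ordinals` parameter is the mapping setting as I understand it from the Elasticsearch reference, so check it against the documentation for your version. The field name `tags` is a hypothetical example, and no request is sent to a cluster.

```python
import json

# Hypothetical keyword field "tags" with eager global ordinals enabled.
# Assumption: the body would be sent via the put-mapping API of your version.
mapping_body = {
    "properties": {
        "tags": {
            "type": "keyword",
            "eager_global_ordinals": True,
        }
    }
}

body = json.dumps(mapping_body, indent=2)
print(body)
```

The trade-off is the one described above: each refresh does more work so that the first search after it does not pay the loading cost.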

https://www.elastic.co/guide/en/elasticsearch/reference/5.0/fielddata.html#global-ordinals