Why Elasticsearch 7+ removed mapping types

License: CC 4.0 BY-SA

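This article looks at why Elasticsearch removed the concept of mapping types. There are three main reasons: first, fields with the same name in different mapping types are backed by the same Lucene field internally, which limits the flexibility of field types; second, storing different entities in the same index leads to sparse data and hurts Lucene's compression efficiency; third, the analogy between ES and an SQL database is misleading, because the independence that tables have in SQL does not exist between types in ES.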

Why are mapping types being removed?

Initially, we spoke about an “index” being similar to a “database” in an SQL database, and a “type” being equivalent to a “table”.

This was a bad analogy that led to incorrect assumptions. In an SQL database, tables are independent of each other. The columns in one table have no bearing on columns with the same name in another table. This is not the case for fields in a mapping type.

In an Elasticsearch index, fields that have the same name in different mapping types are backed by the same Lucene field internally. In other words, using the example above, the user_name field in the user type is stored in exactly the same field as the user_name field in the tweet type, and both user_name fields must have the same mapping (definition) in both types.

This can lead to frustration when, for example, you want deleted to be a date field in one type and a boolean field in another type in the same index.

On top of that, storing different entities that have few or no fields in common in the same index leads to sparse data and interferes with Lucene’s ability to compress documents efficiently.

For these reasons, we have decided to remove the concept of mapping types from Elasticsearch.

[The passage above is quoted from the official Elasticsearch documentation.]
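To make the shared-field constraint in the quoted passage concrete, here is a minimal Python sketch. It is a toy model, not the Elasticsearch API: the ToyIndex class, its put_mapping method, and the MappingConflict exception are hypothetical names invented for illustration. The only thing it models is the rule that a field name resolves to a single underlying field, whichever type declares it.

```python
class MappingConflict(Exception):
    """Raised when a field name is reused with a different data type."""


class ToyIndex:
    """Hypothetical stand-in for an index with several mapping types.

    Not the Elasticsearch API: it only models the rule that a field name
    maps to one underlying field, no matter which type declares it.
    """

    def __init__(self):
        # One entry per field name, shared by every mapping type.
        self.fields = {}

    def put_mapping(self, type_name, properties):
        # Validate everything first so a rejected mapping changes nothing.
        for field, data_type in properties.items():
            existing = self.fields.get(field)
            if existing is not None and existing != data_type:
                raise MappingConflict(
                    f"{type_name}.{field}: wants {data_type!r}, "
                    f"but the index already maps it as {existing!r}"
                )
        self.fields.update(properties)


index = ToyIndex()
index.put_mapping("user", {"user_name": "keyword", "deleted": "date"})
# Same definition for user_name in a second type: accepted.
index.put_mapping("tweet", {"user_name": "keyword", "content": "text"})

try:
    # deleted is already a date in the user type, so boolean is rejected.
    index.put_mapping("tweet", {"deleted": "boolean"})
    conflict = None
except MappingConflict as exc:
    conflict = str(exc)

print(conflict)
```

In an SQL database the equivalent of the last call would succeed, because each table owns its columns. Here the second type cannot redefine deleted, which is the behaviour the documentation describes.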

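There are three main points:

1. The analogy we often draw between a relational database and ES, with an "index" as a database and a "type" as a table, rests on an incorrect assumption. Database tables are independent of each other: a column in one table has nothing to do with a column of the same name in another table. ES does not work that way. Fields with the same name in different mapping types are backed by the same Lucene field internally.

2. Suppose you want to index a deleted field whose data type differs between types, for example a date field in one type and a boolean field in another. Elasticsearch rejects the conflicting mapping, so the documents cannot be indexed as intended, because both fields share one underlying Lucene field and must have the same definition.

3. In addition, putting many entities (types) that have few or no fields in common into one index produces sparse data, which interferes with Lucene's ability to compress documents efficiently. Put plainly, it hurts ES storage and retrieval efficiency.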
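The sparsity in the third point can be estimated with a rough calculation. The Python sketch below is an illustration only: fill_ratio is a hypothetical helper and the sample documents are made up. It treats the index as a grid of documents by fields and measures how many cells hold a value. This is a simple proxy for sparsity, not a model of Lucene's actual on-disk format.

```python
def fill_ratio(docs):
    """Fraction of (document, field) cells that actually hold a value."""
    all_fields = set()
    for doc in docs:
        all_fields.update(doc)  # collect every field name seen in the index
    total_cells = len(docs) * len(all_fields)
    filled_cells = sum(len(doc) for doc in docs)
    return filled_cells / total_cells


# Two made-up entities with no fields in common.
users = [{"name": "a", "email": "a@example.com"} for _ in range(50)]
tweets = [{"content": "hi", "posted_at": "2024-01-01"} for _ in range(50)]

shared_index = fill_ratio(users + tweets)  # both entities in one index
users_only = fill_ratio(users)             # one index per entity
tweets_only = fill_ratio(tweets)

print(shared_index, users_only, tweets_only)  # 0.5 1.0 1.0
```

With both entities in one index, half of the cells are empty, because every user document lacks the tweet fields and every tweet document lacks the user fields. Giving each entity its own index keeps every field populated in this example.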