Elasticsearch 5.2.2 Study Notes, Source Code Reading 10: Pipelines and Processors in the Ingest Plugin (IngestPlugin)
Overview
Pipelines and processors in the ingest plugin (IngestPlugin)
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "set" : {
        "field": "foo",
        "value": "bar"
      }
    }
  ]
}

{
  "pipeline" :
  {
    "description": "_description",
    "processors": [
      {
        "set" : {
          "field" : "field2",
          "value" : "_value"
        }
      },
      {
        "set" : {
          "field" : "field3",
          "value" : "_value"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_type": "type",
      "_id": "id",
      "_source": {
        "foo": "bar"
      }
    }
  ]
}

{
  "date" : 12345,
  "user" : "chenlin7",
  "mesg" : "好好学习,天天向上,Elasticsearch,first message into Elasticsearch"
}
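Core interfaces and classes
- AbstractComponent and AbstractLifecycleComponent manage the configuration and lifecycle of the core components
- AbstractComponent
- AbstractLifecycleComponent
- PipelineStore (adds created pipelines to the cluster state and updates the cluster state)
- LifecycleComponent
- IngestService (support class for ingest-related functionality)
- IngestPlugin (plugin extension point, used to load custom processors)
- IngestCommonPlugin (the default ingest plugin shipped with ES)
- Pipeline (holds a chain of processors plus the description/processors/version/on_failure/id attributes)
- Pipeline.Factory (creates a pipeline from the supplied attributes and validates it along the way, for example checking that each referenced processor exists)
- Processor (base interface for processors)
- AbstractProcessor
- org.elasticsearch.ingest.CompoundProcessor (a processor built on the composite pattern)
- org.elasticsearch.ingest.common.SetProcessor (adds a field or modifies its value)
- overrideEnabled/field/value
- IngestDocument (represents a single document, holding its metadata, its source values, and _ingest)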
- IngestInfo
- ProcessorInfo
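To make the relationship between Processor, CompoundProcessor, SetProcessor and IngestDocument more concrete, here is a minimal sketch in plain Java. The class names mirror the ones listed above, but these are simplified stand-ins written for illustration, not the real Elasticsearch classes: Doc is a hypothetical reduction of IngestDocument to its source map, and the failure handling is only an approximation of what CompoundProcessor does.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; NOT the real Elasticsearch classes.
final class ProcessorSketch {

    /** Hypothetical reduction of IngestDocument: only the source map, no metadata or _ingest. */
    static final class Doc {
        final Map<String, Object> source = new LinkedHashMap<>();
    }

    /** Stand-in for Processor: a type name plus one execute step. */
    interface Processor {
        String getType();
        void execute(Doc doc);
    }

    /** Stand-in for SetProcessor with its field / value / overrideEnabled options. */
    static final class SetProcessor implements Processor {
        private final String field;
        private final Object value;
        private final boolean overrideEnabled;

        SetProcessor(String field, Object value, boolean overrideEnabled) {
            this.field = field;
            this.value = value;
            this.overrideEnabled = overrideEnabled;
        }

        @Override
        public String getType() {
            return "set";
        }

        @Override
        public void execute(Doc doc) {
            // With override disabled, an existing non-null value is left untouched.
            if (overrideEnabled || doc.source.get(field) == null) {
                doc.source.put(field, value);
            }
        }
    }

    /** Composite pattern: a processor that runs a chain of child processors. */
    static final class CompoundProcessor implements Processor {
        private final List<Processor> processors;
        private final List<Processor> onFailure;

        CompoundProcessor(List<Processor> processors, List<Processor> onFailure) {
            this.processors = processors;
            this.onFailure = onFailure;
        }

        @Override
        public String getType() {
            return "compound";
        }

        @Override
        public void execute(Doc doc) {
            for (Processor processor : processors) {
                try {
                    processor.execute(doc);
                } catch (RuntimeException e) {
                    if (onFailure.isEmpty()) {
                        throw e; // no handlers configured: propagate the failure
                    }
                    for (Processor handler : onFailure) {
                        handler.execute(doc);
                    }
                    break; // the main chain stops once the failure handlers have run
                }
            }
        }
    }

    /** Applies the two "set" processors from the simulate request to {"foo": "bar"}. */
    static Map<String, Object> runExample() {
        Doc doc = new Doc();
        doc.source.put("foo", "bar");
        Processor chain = new CompoundProcessor(
                Arrays.<Processor>asList(
                        new SetProcessor("field2", "_value", true),
                        new SetProcessor("field3", "_value", true)),
                Collections.<Processor>emptyList());
        chain.execute(doc);
        return doc.source;
    }

    public static void main(String[] args) {
        System.out.println(runExample()); // prints {foo=bar, field2=_value, field3=_value}
    }
}
```

Because CompoundProcessor is itself a Processor, a whole pipeline can be treated as a single processor, which is the point of the composite pattern here.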
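Loading process, using pipeline creation and pipeline simulation as examples
- For node startup, see https://blog.youkuaiyun.com/undergrowth/article/details/82840411
- PluginsService (https://blog.youkuaiyun.com/undergrowth/article/details/82857089) loads the ingest-common plugin from the modules directory,
  that is, it loads IngestCommonPlugin, which is used to build IngestService. When IngestService is created,
  IngestCommonPlugin#getProcessors loads the built-in processors such as DateProcessor, SetProcessor and JsonProcessor
- The actions involved in creating a pipeline are PutPipelineAction/RestPutPipelineAction/PutPipelineTransportAction
- RestPutPipelineAction builds the PutPipelineRequest and applies the timeout settings (see the earlier articles). The request is then routed to PutPipelineTransportAction#masterOperation,
  which first fetches the info of every node in the cluster. PipelineStore#put/innerPut then runs validatePipeline, creating the Pipeline with Pipeline.Factory#create,
  builds a new ClusterState and publishes it. From there on, the flow matches the cluster state publishing described in the previous article on the discovery module
- The actions involved in simulating a pipeline are SimulatePipelineAction/RestSimulatePipelineAction/SimulatePipelineTransportAction
- RestSimulatePipelineAction builds the SimulatePipelineRequest.
  The request is routed to SimulatePipelineTransportAction#doExecute, which creates the SimulatePipelineRequest to be executed
- Each IngestDocument in the request is then processed in turn: SimulateExecutionService#executeDocument uses a CompoundProcessor to iterate over the pipeline's processors,
  for example SetProcessor#execute here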
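The steps above (register processor factories by type, build a pipeline from its JSON body while rejecting unknown processor types, then run each document through it) can be sketched in plain Java as follows. This is only a rough model under stated assumptions: every class and method here (PipelineFlowSketch, builtInFactories, create, simulate, the pipeline id "sketch-pipeline") is hypothetical and written for illustration, the JSON body is assumed to be already parsed into maps, and cluster state publishing is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified model of the flow; none of these are the real Elasticsearch classes.
final class PipelineFlowSketch {

    /** A processor mutates the document source in place. */
    interface Processor {
        void execute(Map<String, Object> source);
    }

    /** Stand-in for Pipeline: id, description and an ordered processor chain. */
    static final class Pipeline {
        final String id;
        final String description;
        final List<Processor> processors;

        Pipeline(String id, String description, List<Processor> processors) {
            this.id = id;
            this.description = description;
            this.processors = processors;
        }

        void execute(Map<String, Object> source) {
            for (Processor processor : processors) {
                processor.execute(source);
            }
        }
    }

    /** Registry of processor factories keyed by type, in the spirit of a plugin's getProcessors. */
    static Map<String, Function<Map<String, Object>, Processor>> builtInFactories() {
        Map<String, Function<Map<String, Object>, Processor>> factories = new HashMap<>();
        factories.put("set", cfg -> {
            String field = (String) cfg.get("field");
            Object value = cfg.get("value");
            return source -> source.put(field, value);
        });
        return factories;
    }

    /** Builds a pipeline from its parsed body, rejecting processor types that are not registered. */
    @SuppressWarnings("unchecked")
    static Pipeline create(String id, Map<String, Object> config,
                           Map<String, Function<Map<String, Object>, Processor>> factories) {
        List<Processor> processors = new ArrayList<>();
        List<Map<String, Object>> entries = (List<Map<String, Object>>) config.get("processors");
        for (Map<String, Object> entry : entries) {
            // Each entry carries one key, the processor type, e.g. "set".
            for (Map.Entry<String, Object> e : entry.entrySet()) {
                Function<Map<String, Object>, Processor> factory = factories.get(e.getKey());
                if (factory == null) {
                    throw new IllegalArgumentException("unknown processor type: " + e.getKey());
                }
                processors.add(factory.apply((Map<String, Object>) e.getValue()));
            }
        }
        return new Pipeline(id, (String) config.get("description"), processors);
    }

    /** Runs every document through the pipeline and returns the transformed copies. */
    static List<Map<String, Object>> simulate(Pipeline pipeline, List<Map<String, Object>> docs) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            Map<String, Object> copy = new LinkedHashMap<>(doc); // leave the input untouched
            pipeline.execute(copy);
            results.add(copy);
        }
        return results;
    }

    /** Helper that builds {"set": {"field": ..., "value": ...}} as nested maps. */
    static Map<String, Object> setConfig(String field, Object value) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("field", field);
        body.put("value", value);
        Map<String, Object> wrapper = new LinkedHashMap<>();
        wrapper.put("set", body);
        return wrapper;
    }

    /** Mirrors the simulate request shown earlier: two "set" processors, one document. */
    static List<Map<String, Object>> runExample() {
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("description", "_description");
        List<Map<String, Object>> processors = new ArrayList<>();
        processors.add(setConfig("field2", "_value"));
        processors.add(setConfig("field3", "_value"));
        config.put("processors", processors);

        Pipeline pipeline = create("sketch-pipeline", config, builtInFactories());

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("foo", "bar");
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc);
        return simulate(pipeline, docs);
    }

    public static void main(String[] args) {
        System.out.println(runExample()); // prints [{foo=bar, field2=_value, field3=_value}]
    }
}
```

Keeping the factories in a map keyed by processor type is what lets a plugin contribute new processors without the pipeline-building code changing.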
Loading process, using a pipeline while indexing documents as the example