Document CRUD
Elasticsearch exposes document CRUD operations through its REST API.
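Operation | API example | Notes |
---|---|---|
Index | PUT my_index/_doc/1 {"user":"mike","comment":"a comment"} | The type name is `_doc` by convention. Index: if the ID does not exist, a new document is created; otherwise the existing document is deleted and a new one is indexed in its place, and the version number increases |
Create | PUT my_index/_create/1 {"user":"mike","comment":"a comment"} POST my_index/_doc (no ID given, one is generated automatically) {"user":"mike","comment":"a comment"} | Create: fails if the ID already exists |
Read | GET my_index/_doc/1 | Returns the document with the given ID |
Update | POST my_index/_update/1 {"doc":{"user":"mike","comment":"a comment"}} | Update: the document must already exist; only the fields supplied under `doc` are changed |
Delete | DELETE my_index/_doc/1 | Deletes the document with the given ID |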
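As a rough illustration of the table above, the following Python sketch maps each operation to the HTTP method, path, and JSON payload it would use. The function name `crud_request` and its parameters are hypothetical helpers made up for this post, not part of any Elasticsearch client, and the sketch only builds the request; it does not send it.

```python
import json


def crud_request(op, index, doc_id=None, body=None):
    """Return (method, path, payload) for one document CRUD operation.

    Hypothetical helper: it mirrors the table above and sends nothing.
    """
    if op != "create" and doc_id is None:
        raise ValueError(f"{op} requires a document ID")

    if op == "index":
        method, path = "PUT", f"/{index}/_doc/{doc_id}"
    elif op == "create":
        if doc_id is None:
            # No ID supplied: let Elasticsearch generate one.
            method, path = "POST", f"/{index}/_doc"
        else:
            method, path = "PUT", f"/{index}/_create/{doc_id}"
    elif op == "read":
        method, path = "GET", f"/{index}/_doc/{doc_id}"
    elif op == "update":
        method, path = "POST", f"/{index}/_update/{doc_id}"
        # Partial updates are wrapped in a "doc" object.
        body = {"doc": body}
    elif op == "delete":
        method, path = "DELETE", f"/{index}/_doc/{doc_id}"
    else:
        raise ValueError(f"unknown operation: {op}")

    payload = None if body is None else json.dumps(body, ensure_ascii=False)
    return method, path, payload
```

For example, `crud_request("update", "my_index", "1", {"comment": "x"})` yields a `POST` to `/my_index/_update/1` whose payload wraps the fields in `doc`.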
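Testing the API
Kibana's sidebar includes Dev Tools, whose Console makes it easy to test Elasticsearch APIs.
We imported a movies index earlier; the examples below use it to try out these APIs.
Index operation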
PUT /movies/_doc/1
{
"genre" : [
"Adventure",
"Animation",
"Children",
"Comedy",
"Fantasy"
],
"year" : 1995,
"id" : "1",
"@version" : "1",
"title" : "Toy Story"
}
The response is as follows:
{
"_index" : "movies",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 9743,
"_primary_term" : 1
}
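The response above reports `"result" : "updated"` with `_version` 2 because a document with ID 1 already existed. The toy in-memory model below sketches that bookkeeping, together with the stricter Create semantics. `TinyIndex` and `VersionConflict` are hypothetical names; this is a simplified illustration, not how Elasticsearch stores documents.

```python
class VersionConflict(Exception):
    """Raised when create is called for an ID that already exists."""


class TinyIndex:
    """Toy model of Index vs. Create semantics (illustration only)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source):
        """Create the document, or replace it and bump its version."""
        exists = doc_id in self._docs
        version = self._docs[doc_id][0] + 1 if exists else 1
        self._docs[doc_id] = (version, dict(source))
        return {
            "_id": doc_id,
            "_version": version,
            "result": "updated" if exists else "created",
        }

    def create(self, doc_id, source):
        """Create the document; refuse if the ID is already taken."""
        if doc_id in self._docs:
            current = self._docs[doc_id][0]
            raise VersionConflict(
                f"[{doc_id}]: document already exists (current version [{current}])"
            )
        return self.index(doc_id, source)
```

Indexing the same ID twice returns `created` with version 1 and then `updated` with version 2, while a subsequent `create` for that ID raises `VersionConflict`.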
Create operation
# Create the document; the request fails if the document already exists
PUT /movies/_create/1
{
"genre" : [
"Adventure",
"Animation",
"Children",
"Comedy",
"Fantasy"
],
"year" : 1995,
"id" : "1",
"@version" : "1",
"title" : "Toy Story"
}
The response is as follows:
{
"error" : {
"root_cause" : [
{
"type" : "version_conflict_engine_exception",
"reason" : "[1]: version conflict, document already exists (current version [2])",
"index_uuid" : "Km35tJDlTPGjIjnmNgPQ4A",
"shard" : "0",
"index" : "movies"
}
],
"type" : "version_conflict_engine_exception",
"reason" : "[1]: version conflict, document already exists (current version [2])",
"index_uuid" : "Km35tJDlTPGjIjnmNgPQ4A",
"shard" : "0",
"index" : "movies"
},
"status" : 409
}
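Client code usually wants to distinguish this "already exists" conflict from other failures. A minimal sketch, assuming the response body has the shape shown above (`conflict_reason` is a hypothetical helper name):

```python
import json


def conflict_reason(response_text):
    """Return the reason string for a 409 version conflict, else None."""
    response = json.loads(response_text)
    error = response.get("error")
    if response.get("status") == 409 and isinstance(error, dict):
        if error.get("type") == "version_conflict_engine_exception":
            return error.get("reason")
    return None
```

Passing the response above returns the `reason` string; a successful response returns `None`.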
Read operation
# Fetch the document with ID 1 from the movies index
GET movies/_doc/1
The response is as follows:
{
"_index" : "movies",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 128,
"_primary_term" : 1,
"found" : true,
"_source" : {
"genre" : [
"Adventure",
"Animation",
"Children",
"Comedy",
"Fantasy"
],
"year" : 1995,
"id" : "1",
"@version" : "1",
"title" : "Toy Story"
}
}
Update operation
# An update can change existing fields or add new ones; the partial document must be wrapped in a "doc" object
POST movies/_update/1
{
"doc": {
"genre" : [
"Adventure1",
"Animation1",
"Children1",
"Comedy1",
"Fantasy1"
],
"year" : 1995,
"id" : "1",
"@version" : "2",
"title" : "Toy Story"
}
}
The response is as follows:
{
"_index" : "movies",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 9744,
"_primary_term" : 1
}
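To make the partial-update behaviour concrete, here is a small local model of how a `doc` payload is merged into the stored source: supplied fields overwrite or extend the document, nested objects are merged recursively, and arrays are replaced as a whole. `merge_partial` is a hypothetical function and the `rating` field is invented for the example; this approximates the observable behaviour and is not Elasticsearch's implementation.

```python
import copy


def merge_partial(source, partial):
    """Return a new document with `partial` merged into `source`."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars and arrays replace the old value entirely.
            merged[key] = copy.deepcopy(value)
    return merged


stored = {"title": "Toy Story", "year": 1995, "genre": ["Adventure", "Comedy"]}
# "rating" is a made-up field used only to show that new fields are added.
updated = merge_partial(stored, {"genre": ["Adventure1"], "rating": 5})
```

Fields not mentioned in the partial document (`title`, `year`) are left untouched, which is the difference from an Index request that replaces the whole document.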
Delete operation
DELETE /movies/_doc/1
The response is as follows:
{
"_index" : "movies",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 9745,
"_primary_term" : 1
}
Bulk API
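Sending every operation as a separate REST request pays the network overhead once per request, which is expensive. The Bulk API addresses this by letting a single request perform different operations on one index, or even different operations on several indices.
The Bulk API supports four types of operation: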
1. Index
2. Create
3. Update
4. Delete
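Characteristics:
1. The index can be specified in the URL or in the request body
2. A failure of one operation does not affect the results of the others
3. The response contains the result of every operation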
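A bulk request body is newline-delimited JSON: one action line per operation, followed by a source line for index, create, and update (delete has none), with a trailing newline at the end. The sketch below builds such a body; `build_bulk_body` and its tuple format are hypothetical conventions chosen for this example, and nothing is sent to a cluster.

```python
import json

ACTIONS_WITH_SOURCE = {"index", "create", "update"}


def build_bulk_body(operations):
    """Build an NDJSON bulk body from (action, metadata, source) tuples."""
    lines = []
    for action, meta, source in operations:
        if action not in ACTIONS_WITH_SOURCE and action != "delete":
            raise ValueError(f"unsupported bulk action: {action}")
        # Action line, e.g. {"index":{"_index":"test","_id":"1"}}
        lines.append(json.dumps({action: meta}, separators=(",", ":")))
        if action in ACTIONS_WITH_SOURCE:
            if source is None:
                raise ValueError(f"{action} needs a source line")
            lines.append(json.dumps(source, separators=(",", ":")))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("index", {"_index": "test", "_id": "1"}, {"field1": "value1"}),
    ("delete", {"_index": "test", "_id": "2"}, None),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field1": "value2"}}),
])
```

The resulting string could be posted to `_bulk` with the `application/x-ndjson` content type by whatever HTTP client is in use.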
Example
POST _bulk
# Index a document with ID 1 and body {"field1":"value1"} into the test index
{"index":{"_index":"test","_id":"1"}}
{"field1":"value1"}
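# Delete the document with ID 2 from the test index (it does not exist yet, so this action returns not_found, but the other actions still succeed)
{"delete":{"_index":"test","_id":"2"}}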
# Create a document with ID 3 and body {"field1":"value3"} in the test2 index
{"create":{"_index":"test2","_id":"3"}}
{"field1":"value3"}
# Update the document with ID 1 in the test index with new data
{"update":{"_index":"test","_id":"1"}}
{"doc":{"field1":"value2"}}
The response is as follows:
Each operation gets its own entry in the result
{
"took" : 522,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 200
}
},
{
"delete" : {
"_index" : "test",
"_type" : "_doc",
"_id" : "2",
"_version" : 1,
"result" : "not_found",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 3,
"_primary_term" : 1,
"status" : 404
}
},
{
"create" : {
"_index" : "test2",
"_type" : "_doc",
"_id" : "3",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1,
"status" : 201
}
},
{
"update" : {
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 4,
"_primary_term" : 1,
"status" : 200
}
}
]
}
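Note that the top-level `errors` flag is `false` in the response above even though the delete action came back with status 404, so it can be useful to inspect each item's `status` as well. A minimal sketch, assuming the response has already been parsed into a dict (`non_success_items` is a hypothetical helper name):

```python
def non_success_items(bulk_response):
    """List (position, action, _id, status) for items with status >= 400."""
    problems = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item is a single-key dict such as {"delete": {...}}.
        action, result = next(iter(item.items()))
        if result["status"] >= 400:
            problems.append((position, action, result.get("_id"), result["status"]))
    return problems


sample = {
    "errors": False,
    "items": [
        {"index": {"_id": "1", "result": "updated", "status": 200}},
        {"delete": {"_id": "2", "result": "not_found", "status": 404}},
        {"create": {"_id": "3", "result": "created", "status": 201}},
        {"update": {"_id": "1", "result": "updated", "status": 200}},
    ],
}
```

Because the position in `items` matches the order of the actions in the request, the position is enough to find the request line that produced each result.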
Batch read: mget
# Batching reads reduces network overhead and improves performance
# Read the documents with ID 1 from both the test and test2 indices
GET _mget
{
"docs":[
{
"_index":"test",
"_id":"1"
},
{
"_index":"test2",
"_id":"1"
}
]
}
The response is as follows:
{
"docs" : [
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"_seq_no" : 4,
"_primary_term" : 1,
"found" : true,
"_source" : {
"field1" : "value2"
}
},
{
"_index" : "test2",
"_type" : "_doc",
"_id" : "1",
"found" : false
}
]
}
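As the response shows, mget returns one entry per requested document and marks missing ones with `"found" : false` instead of failing the whole request. A short sketch of separating the two cases, assuming the response has been parsed into a dict (`split_mget` is a hypothetical helper name):

```python
def split_mget(response):
    """Split an mget response into found sources and missing keys."""
    found, missing = {}, []
    for doc in response["docs"]:
        key = (doc["_index"], doc["_id"])
        if doc.get("found"):
            found[key] = doc["_source"]
        else:
            missing.append(key)
    return found, missing


sample = {
    "docs": [
        {"_index": "test", "_id": "1", "found": True, "_source": {"field1": "value2"}},
        {"_index": "test2", "_id": "1", "found": False},
    ]
}
found, missing = split_mget(sample)
```

Keying on the `(_index, _id)` pair keeps documents with the same ID in different indices apart, as in the example request.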
Search API
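Search is a large topic; for reasons of space, I will cover it in detail in the next chapter.
Common error responses
Problem | Cause |
---|---|
Cannot connect | Network failure or the cluster is down |
Connection cannot be closed | Network failure or a node error |
429 | The cluster is too busy |
4xx | Client-side request error, such as a malformed request body |
500 | Internal cluster error |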
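One way a client might act on these status codes is sketched below: treat 429 as a signal to retry later with exponential backoff, other 4xx codes as requests to fix, and 5xx codes as server-side problems. The category names, the function names, and the default delays are assumptions made for this illustration, not values recommended by Elasticsearch.

```python
def classify_status(status):
    """Map an HTTP status code to a coarse handling category."""
    if status == 429:
        return "retry_later"   # cluster is too busy
    if 400 <= status < 500:
        return "fix_request"   # problem with the request itself
    if status >= 500:
        return "server_error"  # problem inside the cluster
    return "ok"


def backoff_delays(attempts, base=0.5, cap=30.0):
    """Exponential backoff delays in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]
```

With the defaults, `backoff_delays(4)` gives waits of 0.5, 1, 2, and 4 seconds between retries of a request that was rejected with 429.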