Solr partial updates
This article is based on Solr 7.1.
1. Atomic updates
- set: Set or replace the field value(s) with the specified value(s), or remove the values if null or an empty list is specified as the new value. May be specified as a single value, or as a list for multiValued fields.
- add: Adds the specified values to a multiValued field. May be specified as a single value, or as a list.
- remove: Removes (all occurrences of) the specified values from a multiValued field. May be specified as a single value, or as a list.
- removeregex: Removes all occurrences of the specified regex from a multiValued field. May be specified as a single value, or as a list.
- inc: Increments a numeric value by a specific amount. Must be specified as a single numeric value.
Example:
Original document:
{"id":"mydoc",
"price":10,
"popularity":42,
"categories":["kids"],
"promo_ids":["a123x"],
"tags":["free_to_try","buy_now","clearance","on_sale"]
}
Update command:
{"id":"mydoc",
"price":{"set":99},
"popularity":{"inc":20},
"categories":{"add":["toys","games"]},
"promo_ids":{"remove":"a123x"},
"tags":{"remove":["free_to_try","on_sale"]}
}
Resulting document:
{"id":"mydoc",
"price":99,
"popularity":62,
"categories":["kids","toys","games"],
"tags":["buy_now","clearance"]
}
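To check one's reading of the modifiers, the example above can be replayed locally. The following is a rough sketch in Python, not Solr's implementation: `apply_atomic_update` is a hypothetical helper, and treating `removeregex` as a whole-value match is an assumption.

```python
import re
from copy import deepcopy


def _as_list(value):
    """Wrap a single value in a list; copy lists unchanged."""
    return list(value) if isinstance(value, list) else [value]


def apply_atomic_update(doc, update):
    """Return a copy of `doc` with the modifiers in `update` applied locally."""
    result = deepcopy(doc)
    for field, spec in update.items():
        if not isinstance(spec, dict):
            # Plain values (such as "id") identify the document; keep them.
            result[field] = spec
            continue
        for op, arg in spec.items():
            current = _as_list(result.get(field, []))
            if op == "set":
                if arg is None or arg == []:
                    result.pop(field, None)  # null or empty list removes the field
                else:
                    result[field] = arg
            elif op == "add":
                result[field] = current + _as_list(arg)
            elif op == "remove":
                drop = _as_list(arg)
                result[field] = [v for v in current if v not in drop]
            elif op == "removeregex":
                # Assumption: a value is removed when the whole value matches.
                patterns = [re.compile(p) for p in _as_list(arg)]
                result[field] = [
                    v for v in current
                    if not any(p.fullmatch(str(v)) for p in patterns)
                ]
            elif op == "inc":
                result[field] = result.get(field, 0) + arg
            else:
                raise ValueError(f"unsupported modifier: {op}")
            if result.get(field) == []:
                del result[field]  # an emptied multiValued field disappears
    return result
```

Applied to the original document and the update command above, this sketch yields the resulting document shown, including the disappearance of `promo_ids` once its only value is removed.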
This approach performs partial updates from the Solr admin console; how to do the same through SolrJ still needs to be worked out.
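Outside the admin console, the same JSON command can be posted to the `/update` handler by any HTTP client. Below is a minimal sketch using only Python's standard library; the helper name `build_atomic_update_request` is hypothetical, and the host and the `techproducts` collection are borrowed from the curl examples later in this article. In SolrJ the same shape is usually expressed by passing a `Map` such as `{"set": 99}` as a field value of a `SolrInputDocument`, but that variant has not been tried here.

```python
import json
import urllib.request


def build_atomic_update_request(base_url, collection, doc, commit=True):
    """Build (but do not send) a POST request carrying one atomic update."""
    url = f"{base_url}/solr/{collection}/update"
    if commit:
        url += "?commit=true"
    body = json.dumps([doc]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_atomic_update_request(
    "http://localhost:8983",   # assumed local Solr instance
    "techproducts",            # assumed collection name
    {"id": "mydoc", "price": {"set": 99}, "popularity": {"inc": 20}},
)

# To actually send it against a running Solr instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect before anything touches the index.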
2. In-place updates
The schema is defined as follows:
<field name="price" type="float" indexed="false" stored="false" docValues="true"/>
<field name="popularity" type="float" indexed="false" stored="false" docValues="true"/>
Original document:
{
"id":"mydoc2",
"price":10,
"popularity":42,
"categories":["kids"],
"promo_ids":["a123x"],
"tags":["free_to_try","buy_now","clearance","on_sale"]
}
Update command:
{
"id":"mydoc2",
"price":{"set":99},
"popularity":{"inc":20}
}
Resulting document:
{
"id":"mydoc2",
"price":99,
"popularity":62,
"categories":["kids"],
"promo_ids":["a123x"],
"tags":["free_to_try","buy_now","clearance","on_sale"]
}
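In practice, however, setting indexed and stored to true in the schema gave the same result after the update. A likely explanation is that Solr falls back to a regular atomic update when a field does not qualify for an in-place update, so the returned document looks identical; this still needs to be confirmed.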
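The reference guide restricts in-place updates to `set` and `inc` on single-valued, numeric, docValues fields that are neither indexed nor stored. A small sketch of that per-field check follows; `FieldDef` and `qualifies_for_in_place` are hypothetical names, and the sketch deliberately ignores the guide's further conditions on the `_version_` field and on copyField targets.

```python
from dataclasses import dataclass


@dataclass
class FieldDef:
    """A simplified view of one <field> definition in the schema."""
    name: str
    indexed: bool
    stored: bool
    doc_values: bool
    multi_valued: bool = False
    numeric: bool = True


def qualifies_for_in_place(field, operation):
    """Check the per-field conditions for an in-place update.

    Only the field's own attributes are checked here; the conditions on
    _version_ and on copyField targets are left out of this sketch.
    """
    return (
        operation in {"set", "inc"}
        and not field.indexed
        and not field.stored
        and field.doc_values
        and not field.multi_valued
        and field.numeric
    )


price_as_documented = FieldDef("price", indexed=False, stored=False, doc_values=True)
price_as_tested = FieldDef("price", indexed=True, stored=True, doc_values=True)

print(qualifies_for_in_place(price_as_documented, "set"))
print(qualifies_for_in_place(price_as_tested, "set"))
```

Under these assumptions the documented schema qualifies, while the variant with indexed and stored set to true does not.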
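### Optimistic Concurrency
Specify the _version_ field when updating:
1. If _version_ > 1, the version of the document being updated must match the version in the index;
2. If _version_ = 1, the document must already exist; if it does not, the update is rejected;
3. If _version_ < 0, the document must not exist; if it does, the update is rejected;
4. If _version_ = 0, no check is made: the document is updated if it exists and inserted if it does not;
5. If no _version_ is specified and the update is not an atomic update, the existing document is discarded and the new document is inserted.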
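The version rules can be condensed into one decision function. This is only an illustrative sketch of the rules as described in the Solr reference guide, not Solr's own code: `version_check_passes` is a hypothetical name, and `existing` stands for the `_version_` currently in the index, or `None` when the document does not exist.

```python
def version_check_passes(requested, existing):
    """Decide whether an update would pass the optimistic concurrency check.

    requested: the _version_ sent with the update, or None if omitted.
    existing:  the _version_ stored in the index, or None if no such document.
    """
    if requested is None or requested == 0:
        return True                      # no constraint: overwrite or insert
    if requested > 1:
        return existing == requested     # versions must match exactly
    if requested == 1:
        return existing is not None      # document must exist
    return existing is None              # requested < 0: document must not exist
```

For instance, with the index holding version 1498562471222312960 for a document, a request carrying 999999 fails the check, while one carrying the exact stored version passes.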
Example:
$ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/techproducts/update?versions=true' --data-binary '
[ { "id" : "aaa" },
{ "id" : "bbb" } ]'
{"responseHeader":{"status":0,"QTime":6},
"adds":["aaa",1498562471222312960,
"bbb",1498562471225458688]}
$ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/techproducts/update?_version_=999999&versions=true' --data-binary '
[{ "id" : "aaa",
"foo_s" : "update attempt with wrong existing version" }]'
{"responseHeader":{"status":409,"QTime":3},
"error":{"msg":"version conflict for aaa expected=999999 actual=1498562471222312960",
"code":409}}
$ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/techproducts/update?_version_=1498562471222312960&versions=true&commit=true' --data-binary '
[{ "id" : "aaa",
"foo_s" : "update attempt with correct existing version" }]'
{"responseHeader":{"status":0,"QTime":5},
"adds":["aaa",1498562624496861184]}
$ curl 'http://localhost:8983/solr/techproducts/query?q=*:*&fl=id,_version_'
{
"responseHeader":{
"status":0,
"QTime":5,
"params":{
"fl":"id,_version_",
"q":"*:*"}},
"response":{"numFound":2,"start":0,"docs":[
{
"id":"bbb",
"_version_":1498562471225458688},
{
"id":"aaa",
"_version_":1498562624496861184}]
}}
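A client that drives this exchange programmatically needs two pieces: the update URL carrying the expected `_version_`, and a way to tell a 409 conflict from a success. The sketch below covers both with the standard library only; the function names are hypothetical, and the response shapes are assumed to be exactly those shown in the curl output above.

```python
import json
from urllib.parse import urlencode


def build_versioned_update_url(base_url, collection, expected_version, commit=False):
    """Build the /update URL that asks Solr to enforce `expected_version`."""
    params = {"_version_": expected_version, "versions": "true"}
    if commit:
        params["commit"] = "true"
    return f"{base_url}/solr/{collection}/update?{urlencode(params)}"


def parse_update_response(text):
    """Summarise an update response as conflict flag, new versions and message."""
    payload = json.loads(text)
    status = payload["responseHeader"]["status"]
    if status == 409:
        return {"conflict": True, "versions": {}, "message": payload["error"]["msg"]}
    # With versions=true, "adds" is a flat list: id, version, id, version, ...
    adds = payload.get("adds", [])
    versions = dict(zip(adds[0::2], adds[1::2]))
    return {"conflict": False, "versions": versions, "message": None}
```

On a conflict, a caller would typically re-read the document to obtain its current `_version_` and retry the update with that value.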
Reference
https://lucene.apache.org/solr/guide/7_1/updating-parts-of-documents.html