This article was provided by my co-worker Miles Li.
ElasticSearch vs Solr in searching
1 In our tests, Solr is slower than ElasticSearch at searching. We try to find the root cause.
The following are the test results. If you are not interested, skip ahead to "searching policy".
ElasticSearch test results
[dynamic query]
$ curl -XPOST 'http://ip:9200/test/person/_search?search_type=query_then_fetch&size=10' -d '{
fields : ["oid","name"],query:{term:{name:"${ranStr}"}}
}'
search 50 thread in 970.7s = 7588.1/s Avg: 13 Min: 0 Max: 2862 Err: 0 (0.00%)
[static query]
$ curl -XPOST 'http://ip:9200/test/person/_search?search_type=query_then_fetch&size=10' -d '{
fields : ["oid","name"],query:{term:{name:"chibi"}}
}'
search 50 thread in 180.1s = 7670.4/s Avg: 6 Min: 1 Max: 106 Err: 0 (0.00%)
[static query no found]
$ curl -XPOST 'http://ip:9200/test/person/_search?search_type=query_then_fetch&size=10' -d '{
fields : ["oid","name"],query:{term:{name:"nofound"}}
}'
search 50 thread in 311.1s = 12239.1/s Avg: 4 Min: 0 Max: 232 Err: 0 (0.00%)
[server status]
Server 203
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26482 root 21 0 6867m 4.8g 10m S 512.1 30.6 83:20.51 java
Server 204
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14670 root 21 0 6903m 4.8g 10m S 522.4 30.6 92:07.23 java
Server 205
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5760 root 24 0 6826m 4.8g 10m S 546.2 30.9 101:12.04 java
[GC status (running time 3.5h)]
Server 203
S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 76.31 95.79 30.49 59.81 30234 1374.483 31 0.899 1375.383
75.02 0.00 84.31 30.61 59.81 30236 1374.639 31 0.899 1375.539
Solr test results
[dynamic query]
$ curl -XPOST 'http://ip:9200/solr/collection1/select?q=name:${name}&fl=name,id'
search 50 thread in 688.2s = 1402.4/s Avg: 35 Min: 3 Max: 1920 Err: 0 (0.00%)
[static query]
$ curl -XPOST 'http://ip:9200/solr/collection1/select?q=name:chibi&fl=name,id'
2774964 in 395.7s = 7012.5/s Avg: 7 Min: 1 Max: 233 Err: 0 (0.00%)
[static query no found]
$ curl -XPOST 'http://ip:9200/solr/collection1/select?q=name:nofound&fl=name,id'
3701085 in 528.6s = 7001.6/s Avg: 7 Min: 1 Max: 284 Err: 0 (0.00%)
[server status]
Server 203
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17328 root 17 0 10.9g 8.9g 2.7g S 522.1 56.9 1149:17 java
Server 204
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13654 root 15 0 10.9g 8.9g 2.7g S 504.1 56.8 1158:36 java
Server 205
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4195 root 17 0 14.7g 8.9g 2.7g S 386.4 56.7 771:31.87 java
[GC status (running time 45h)]
Server 205
S0 S1 E O P YGC YGCT FGC FGCT GCT
62.50 62.61 0.00 39.02 59.81 145913 18066.295 2417 655.989 18722.283
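To put the figures above side by side, here is a small Python sketch that computes the ES-to-Solr throughput ratio for each test and a rough GC overhead for each system. The numbers are copied from the results above. Treating GC overhead as the GCT column divided by the stated running time is an assumption about how to read those figures, and the two GC samples come from different servers (203 for ES, 205 for Solr) with very different running times, so the comparison is only indicative.

```python
# Throughput (requests/s) copied from the test results above.
throughput = {
    "dynamic query": {"es": 7588.1, "solr": 1402.4},
    "static query": {"es": 7670.4, "solr": 7012.5},
    "static query no found": {"es": 12239.1, "solr": 7001.6},
}


def speedup(es_rate, solr_rate):
    """How many times the ES rate exceeds the Solr rate for one test."""
    return es_rate / solr_rate


def gc_overhead(gct_seconds, running_hours):
    """Assumed reading: total GC time (GCT) divided by running time."""
    return gct_seconds / (running_hours * 3600.0)


ratios = {name: speedup(r["es"], r["solr"]) for name, r in throughput.items()}

# GCT taken from the last jstat sample shown for each system.
es_gc = gc_overhead(1375.539, 3.5)    # ES, server 203, 3.5h
solr_gc = gc_overhead(18722.283, 45)  # Solr, server 205, 45h

for name, ratio in ratios.items():
    print(f"{name}: ES throughput is {ratio:.2f}x Solr")
print(f"GC overhead: ES {es_gc:.1%}, Solr {solr_gc:.1%}")
```

On these numbers the gap is largest for the dynamic query (about 5.4x) and smallest for the static query (about 1.1x), while the GC overhead computed this way is close for both systems (roughly 11% and 12%).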
2 searching policy
According to the ES source code, the gap may come from a difference in how Solr and ES implement the searching policy.
ES supports several searching policies (search types), including dfs_query_then_fetch, dfs_query_and_fetch, query_then_fetch, and query_and_fetch. The default is query_then_fetch.
Both ElasticSearch and Solr use the "query_then_fetch" policy by default for multi-shard searches.
ES (ES creates 1 searcher per query)
query phase: returns docIds + contextId
fetch phase: returns docs
free phase: frees the context
Solr (Solr creates 2 searchers per query)
query phase: returns uuids
fetch phase: returns docs
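The phases listed above can be sketched as a small in-memory simulation. This only illustrates the query_then_fetch idea and is not the ES or Solr implementation: the `Shard` class, its methods, and the term-count scoring are hypothetical names and simplifications introduced for this example. It models the ES-style flow, in which the query phase opens one searcher per shard and hands back a context id that the fetch phase reuses before the free phase releases it.

```python
import heapq
import itertools


class Shard:
    """Hypothetical in-memory shard; not an ES or Solr class."""

    _ids = itertools.count(1)  # source of context ids

    def __init__(self, docs):
        self.docs = docs            # doc_id -> {"name": ...}
        self.contexts = {}          # context_id -> open "searcher" snapshot
        self.searchers_opened = 0

    def query(self, term, size):
        # Query phase: open one searcher, return (score, doc_id) hits + context id.
        searcher = dict(self.docs)
        self.searchers_opened += 1
        context_id = next(Shard._ids)
        self.contexts[context_id] = searcher
        hits = [(doc["name"].count(term), doc_id)
                for doc_id, doc in searcher.items() if term in doc["name"]]
        return heapq.nlargest(size, hits), context_id

    def fetch(self, context_id, doc_ids):
        # Fetch phase: reuse the searcher stored under context_id.
        searcher = self.contexts[context_id]
        return [dict(searcher[doc_id], oid=doc_id) for doc_id in doc_ids]

    def free(self, context_id):
        # Free phase: release the context.
        self.contexts.pop(context_id, None)


def query_then_fetch(shards, term, size=10):
    # 1. Query phase on every shard: local top hits plus a context id.
    per_shard = [(shard, *shard.query(term, size)) for shard in shards]
    # 2. Merge the local hits into the global top `size`.
    candidates = [(score, doc_id, idx)
                  for idx, (_, hits, _) in enumerate(per_shard)
                  for score, doc_id in hits]
    top = heapq.nlargest(size, candidates)
    # 3. Fetch phase: ask each shard only for its winning doc ids.
    wanted = {}
    for _, doc_id, idx in top:
        wanted.setdefault(idx, []).append(doc_id)
    fetched = {}
    for idx, doc_ids in wanted.items():
        shard, _, context_id = per_shard[idx]
        for doc in shard.fetch(context_id, doc_ids):
            fetched[(idx, doc["oid"])] = doc
    # 4. Free phase: release every context opened in step 1.
    for shard, _, context_id in per_shard:
        shard.free(context_id)
    return [fetched[(idx, doc_id)] for _, doc_id, idx in top]


shards = [
    Shard({1: {"name": "chibi"}, 2: {"name": "miles"}}),
    Shard({3: {"name": "chibi chibi"}}),
]
results = query_then_fetch(shards, "chibi", size=10)
print([doc["oid"] for doc in results])
```

In this sketch each shard opens exactly one searcher per query, because the fetch phase looks it up by context id instead of creating a new one. In the Solr flow described above, the query phase returns only uuids, so there is no such context for the fetch phase to reuse.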
ES (query_then_fetch) vs Solr (query_then_fetch)
(This site does not support embedding HTML video or iframes,
so we can only provide the video address and screenshots. Sorry for the blurry video; it was compressed by the video site.
If you cannot open the video, see the attachment "ES vs Solr in searching policy.ppt".
)
http://player.youku.com/player.php/sid/XNDMxNTU2OTI4/v.swf
We will apply the same approach as ES in Solr,
then test again.
Thanks,
Miles