SolrCloud performance test

License: CC 4.0 BY-SA

Tags: #solr4.0 #solr4.2 #solr4.3 #performance #performance-testing


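This article presents a performance test case for a SolrCloud cluster, covering environment configuration, index data preparation, test script design, and result analysis. The results show that running searches while indexing significantly degrades indexing performance.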

http://wikicentral.cisco.com/display/PROJECT/SolrCloud+Performance+Test

Environment

SolrCloud servers: X.X.X.71, X.X.X.72, X.X.X.73. Servers 72 and 73 each have 16 GB of memory and a 16-core 2.4 GHz CPU; server 71 has 8 GB of memory and a 4-core 2.27 GHz CPU.

ZooKeeper servers: X.X.X.22, X.X.X.23, X.X.X.24

OS: Linux x86_64 GNU/Linux

Tools: JMeter 2.6, YourKit 11.0.5

Config & start service

ZooKeeper runs with its default parameters and configuration (see: zookeeper start).

For the SolrCloud configuration, refer to (solrcloud start).

1. On X.X.X.71, edit $SOLRCLOUD_HOME/example/solr/conf/schema.xml so that it defines the following fields:

<fields>
   <!-- Valid attributes for fields:
     name: mandatory - the name for the field
     type: mandatory - the name of a previously defined type from the
       <types> section
     indexed: true if this field should be indexed (searchable or sortable)
     stored: true if this field should be retrievable
     multiValued: true if this field may contain multiple values per document
     omitNorms: (expert) set to true to omit the norms associated with
       this field (this disables length normalization and index-time
       boosting for the field, and saves some memory).  Only full-text
       fields or fields that need an index-time boost need norms.
       Norms are omitted for primitive (non-analyzed) types by default.
     termVectors: [false] set to true to store the term vector for a
       given field.
       When using MoreLikeThis, fields used for similarity should be
       stored for best performance.
     termPositions: Store position information with the term vector.
       This will increase storage costs.
     termOffsets: Store offset information with the term vector. This
       will increase storage costs.
     required: The field is required.  It will throw an error if the
       value does not exist
     default: a value that should be used if no value is specified
       when adding a document.
   -->

        <field name="id" type="string" indexed="true" stored="true" required="true" />
        <field name="ts" type="text_general" indexed="true" stored="true"/>
        <field name="name" type="text_general" indexed="true" stored="true"/>
        <field name="age" type="text_general" indexed="true" stored="true"/>
        <field name="company" type="text_general" indexed="true" stored="true"/>
        <field name="branch" type="text_general" indexed="true" stored="true"/>
        <field name="mail" type="text_general" indexed="true" stored="true"/>
        <field name="interest" type="text_general" indexed="true" stored="true"/>
        <field name="address" type="text_general" indexed="true" stored="true"/>
        <field name="text_general" type="text_general" indexed="true" stored="false" multiValued="true" />

 </fields>

Raise the Jetty thread limit:

Open $SOLR_HOME/example/etc/jetty.xml and change maxThreads to 10000:

<!-- =========================================================== -->
    <!-- Server Thread Pool                                          -->
    <!-- =========================================================== -->
    <Set name="ThreadPool">
      <!-- Default queued blocking threadpool -->
      <New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
        <Set name="minThreads">10</Set>
        <Set name="maxThreads">10000</Set>
        <Set name="detailedDump">false</Set>
      </New>
    </Set>

{note}

The thread limit affects the load test, so a large value is necessary.

{note}
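
As a quick sanity check before a run, the thread pool values can be read back from the configuration. The following is a minimal sketch in Python, assuming the snippet above is available as a string; the helper name `thread_pool_settings` and the comparison against 300 client threads (the largest thread count used in the tests) are illustrative choices, not part of the original setup.

```python
import xml.etree.ElementTree as ET

# Copy of the ThreadPool fragment from jetty.xml shown above.
JETTY_SNIPPET = """
<Set name="ThreadPool">
  <New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
    <Set name="minThreads">10</Set>
    <Set name="maxThreads">10000</Set>
    <Set name="detailedDump">false</Set>
  </New>
</Set>
"""


def thread_pool_settings(xml_text: str) -> dict:
    """Return {setting name: value} for every <Set> element that carries a text value."""
    root = ET.fromstring(xml_text)
    return {
        s.get("name"): s.text.strip()
        for s in root.iter("Set")
        if s.text and s.text.strip()
    }


settings = thread_pool_settings(JETTY_SNIPPET)
# Hypothetical check: the pool should be larger than the biggest client thread count (300).
print(settings["maxThreads"], int(settings["maxThreads"]) >= 300)
```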

2. cd to $SOLR_HOME/example

3. Start SolrCloud on X.X.X.71 with the following command:

java -Xmx6g -Xms6g -Dbootstrap_confdir=./solr/conf -Dcollection.configName=myconf -Djetty.port=8900 -DzkHost=X.X.X.22:2181,X.X.X.23:2181,X.X.X.24:2181 -DnumShards=1 -jar start.jar

4. Start SolrCloud on X.X.X.72 and X.X.X.73 with the following command:

java -Xmx6g -Xms6g -Djetty.port=8900 -DzkHost=X.X.X.22:2181,X.X.X.23:2181,X.X.X.24:2181 -jar start.jar

5. Open the Solr admin tool at http://X.X.X.71:8900/solr/ to view the cluster graph.

The cluster ends up with one shard and two replicas.

Prepare JMeter script & index data

1. The index request used to load index data into the SolrCloud cluster looks like this:

POST http://X.X.X.72:8900/solr/collection1/update

POST data:
<add><doc>
<field name="id">4al6q90c-8ouj-1255-Sind-201206282rmo</field>
<field name="ts">1340859312956</field>
<field name="name">byan</field>
<field name="age">21</field>
<field name="company">Ciscobyan Systemsbyan, Incbyan</field>
<field name="branch">Cloudbyan Applicationbyan Servicesbyan</field>
<field name="mail">byan@cisco.com</field>
<field name="interest">Have intensive interest in Internet-surfingbyan,singingbyan, writingbyan and readingbyan </field>
<field name="address">abyan, Gatebyan Buidingbyan Streetbyan Provincebyan Contrybyan</field>
</doc></add>

[no cookies]

Request Headers:
Content-Length: 598
Connection: keep-alive
Content-Type: application/xml

Document template used for indexing (JMeter substitutes ${id_counter} for each request):

<add><doc>
  <field name="id">c7${id_counter}</field>
  <field name="ts">1342463467567</field>
  <field name="name">person</field>
  <field name="age">10</field>
  <field name="company">Cisco Systems, Inc.</field>
  <field name="branch">Cloud Application Services</field>
  <field name="mail">person@cisco.com</field>
  <field name="interest">Have intensive interest in Internet-surfing,singing, writing and reading.</field>
  <field name="address">address,The Golden Gate Bridge,Wall Street.</field>
</doc></add>

Each document is about 300 bytes.

The details of this script can be found in solr_prepare_data_cluster.jmx.
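
For readers without the .jmx file, the request it sends can be approximated outside JMeter. The sketch below, in Python, builds the same `<add><doc>` payload from the template and wraps it in a POST request with the `Content-Type: application/xml` header shown in the sample request. The function names `build_add_doc` and `build_request` are hypothetical, and the request is only constructed, not sent; this is not the original JMeter script.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Update endpoint copied from the sample request above (host is masked in the source).
UPDATE_URL = "http://X.X.X.72:8900/solr/collection1/update"


def build_add_doc(id_counter: int) -> bytes:
    """Build one <add><doc>...</doc></add> payload, mirroring the JMeter template."""
    fields = {
        "id": f"c7{id_counter}",  # stands in for c7${id_counter}
        "ts": "1342463467567",
        "name": "person",
        "age": "10",
        "company": "Cisco Systems, Inc.",
        "branch": "Cloud Application Services",
        "mail": "person@cisco.com",
        "interest": "Have intensive interest in Internet-surfing,singing, writing and reading.",
        "address": "address,The Golden Gate Bridge,Wall Street.",
    }
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        ET.SubElement(doc, "field", name=name).text = value
    return ET.tostring(add, encoding="utf-8")


def build_request(id_counter: int) -> urllib.request.Request:
    """Wrap the payload in a POST request; sending it is left to the caller."""
    return urllib.request.Request(
        UPDATE_URL,
        data=build_add_doc(id_counter),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


req = build_request(1)
print(req.get_method(), req.full_url, len(req.data), "bytes")
```

Passing the request to `urllib.request.urlopen` would submit the document to a reachable Solr node; a real load test would still need concurrency and timing, which is what JMeter provides.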

2. Once the index holds 15 GB of data with more than 20 million documents, we are ready to test the performance of SolrCloud.

Test & Result

{note}

All tests were run with NRT (near-real-time search) enabled.

{note}

One or more clients send requests to X.X.X.71-73, choosing the target server at random. This test case uses 1 shard and two replicas.
1. Index performance test (without search): index JMeter script solr_index_cluster.jmx
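
Random target selection can be pictured with a few lines of Python. This is only an illustration of the idea, assuming every node exposes the same port and collection path as the sample request; the helper `pick_update_url` is hypothetical and does not reflect how the JMeter scripts implement it.

```python
import random

# The three SolrCloud nodes from the environment section (addresses masked in the source).
HOSTS = ["X.X.X.71", "X.X.X.72", "X.X.X.73"]
PORT = 8900


def pick_update_url(rng: random.Random) -> str:
    """Pick one node uniformly at random and return its update URL."""
    host = rng.choice(HOSTS)
    return f"http://{host}:{PORT}/solr/collection1/update"


rng = random.Random()
for _ in range(3):
    print(pick_update_url(rng))
```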

index result:

request thread nums	avg response time	throughput
50	10	4606
150	32	4612
300	65	4445

2. Search performance test (without indexing): search JMeter script solr_search_cluster.jmx

search result:

request thread nums	avg response time	throughput
50	25	1870
150	74	1923
300	156	1709
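
The two tables above can be cross-checked with Little's law: the number of requests in flight is roughly throughput multiplied by average response time. The sketch below assumes the response times are in milliseconds and the throughput is in requests per second; the source does not state the units, so both are assumptions. If the computed value lands close to the client thread count, the assumed units are plausible and the client threads are almost always waiting on the server.

```python
# Rows are (client threads, avg response time, throughput), copied from the tables above.
# Units are ASSUMED: milliseconds and requests per second.
INDEX_ONLY = [(50, 10, 4606), (150, 32, 4612), (300, 65, 4445)]
SEARCH_ONLY = [(50, 25, 1870), (150, 74, 1923), (300, 156, 1709)]


def in_flight(avg_ms: float, throughput_per_s: float) -> float:
    """Little's law: average concurrency = throughput * average latency."""
    return throughput_per_s * avg_ms / 1000.0


for label, rows in (("index", INDEX_ONLY), ("search", SEARCH_ONLY)):
    for threads, avg_ms, tput in rows:
        estimate = in_flight(avg_ms, tput)
        print(f"{label}: {threads} threads, about {estimate:.1f} requests in flight")
```

Under these assumptions every estimate comes out slightly below its thread count, while throughput stays nearly flat as threads increase. That pattern suggests the cluster is already saturated at 50 threads and that extra threads mostly add queueing time.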

3. index performance test (with search)

index result:

request thread nums	avg response time	throughput
50	26	1854
150	74	1920
300	151	1856

Search result measured at the same time:

request thread nums	avg response time	throughput
50	28	1481
150	76	1605
300	139	1470

Summary

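Indexing performance without concurrent searching is good: throughput stays between 4445 and 4612 across 50 to 300 threads. With searches running at the same time, indexing throughput falls to between 1854 and 1920, less than half of the index-only figure, while search throughput drops more moderately. Searching therefore appears to have a strong effect on indexing performance. We are investigating the cause.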
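
The size of the drop can be computed directly from the result tables. The following Python sketch does only that arithmetic; the dictionary and function names are illustrative.

```python
# Throughput per client thread count, copied from the four result tables.
INDEX_ONLY = {50: 4606, 150: 4612, 300: 4445}
INDEX_WITH_SEARCH = {50: 1854, 150: 1920, 300: 1856}
SEARCH_ONLY = {50: 1870, 150: 1923, 300: 1709}
SEARCH_WITH_INDEX = {50: 1481, 150: 1605, 300: 1470}


def drop_percent(before: float, after: float) -> float:
    """Relative throughput loss, in percent, when the other workload runs concurrently."""
    return (before - after) / before * 100.0


for threads in sorted(INDEX_ONLY):
    idx = drop_percent(INDEX_ONLY[threads], INDEX_WITH_SEARCH[threads])
    srch = drop_percent(SEARCH_ONLY[threads], SEARCH_WITH_INDEX[threads])
    print(f"{threads} threads: index throughput -{idx:.1f}%, search throughput -{srch:.1f}%")
```

Indexing loses roughly 58 to 60 percent of its throughput at every thread count, while searching loses roughly 14 to 21 percent, which matches the observation that the mixed workload hurts indexing far more than searching.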