[Seeing the Essence Through Configuration] Solr Query

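This article looks at the two main kinds of request handlers in Solr: search handlers and update handlers. It focuses on the parts of a search handler (defaults, invariants, appends, pre-processing components, and post-processing components) and how they are configured in solrconfig.xml. It also covers how Solr query performance relates to the searcher, including the cache settings, query performance parameters, and warming options such as maxWarmingSearchers, and how these settings affect system performance.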

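Solr has two main kinds of request handlers:

  1. Search handlers (SearchHandler), which process query requests
  2. Update handlers (UpdateHandler), which process indexing requests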

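A search handler typically consists of the following parts, each of which is defined in solrconfig.xml.

  1. Request parameter decoration, including
    defaults: values used when the request does not supply the parameter
    invariants: fixed values that the request cannot override
    appends: values added to whatever the request supplies
  2. Pre-processing components (first-components): an optional set of search components that run first and perform pre-processing tasks.
  3. Main search components (components): a chain of search components that includes at least the query component.
  4. Post-processing components (last-components): an optional chain of search components that perform post-processing tasks.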
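
The parts listed above can be sketched as a single handler definition. This is a minimal illustration, not a recommended configuration: the field names (`text`, `inStock`, `cat`) and the component name `myFirstComponent` are hypothetical, and both `myFirstComponent` and `spellcheck` would need to be registered as `<searchComponent>` elements elsewhere in solrconfig.xml.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- defaults: used only when the request omits the parameter -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>              <!-- hypothetical default field -->
  </lst>
  <!-- appends: added to the values the request already carries -->
  <lst name="appends">
    <str name="fq">inStock:true</str>      <!-- hypothetical filter -->
  </lst>
  <!-- invariants: always win, the request cannot override them -->
  <lst name="invariants">
    <str name="facet.field">cat</str>      <!-- hypothetical facet field -->
  </lst>
  <!-- runs before the default component chain -->
  <arr name="first-components">
    <str>myFirstComponent</str>            <!-- hypothetical component name -->
  </arr>
  <!-- runs after the default component chain -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

No `components` array is declared here, so the default chain (which includes the query component) stays in place and the first and last components wrap around it. Declaring `components` explicitly replaces the default chain instead.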

Solr's query performance is closely tied to the searcher. The relevant settings live in the `<query>` section of solrconfig.xml:

<!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       Query section - these settings control query time things like caches
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
  <query>
    <!-- Max Boolean Clauses

         Maximum number of clauses in each BooleanQuery,  an exception
         is thrown if exceeded.

         ** WARNING **
         
         This option actually modifies a global Lucene property that
         will affect all SolrCores.  If multiple solrconfig.xml files
         disagree on this property, the value at any given moment will
         be based on the last SolrCore to be initialized.
         
      -->
    <maxBooleanClauses>1024</maxBooleanClauses>

 
    <!-- Slow Query Threshold (in millis)
    
         At high request rates, logging all requests can become a bottleneck 
         and therefore INFO logging is often turned off. However, it is still
         useful to be able to set a latency threshold above which a request
         is considered "slow" and log that request at WARN level so we can
         easily identify slow queries.
    --> 
    <slowQueryThresholdMillis>-1</slowQueryThresholdMillis>


    <!-- Solr Internal Query Caches

         There are two implementations of cache available for Solr,
         LRUCache, based on a synchronized LinkedHashMap, and
         FastLRUCache, based on a ConcurrentHashMap.  

         FastLRUCache has faster gets and slower puts in single
         threaded operation and thus is generally faster than LRUCache
         when the hit ratio of the cache is high (> 75%), and may be
         faster under other scenarios on multi-cpu systems.
    -->

    <!-- Filter Cache

         Cache used by SolrIndexSearcher for filters (DocSets),
         unordered sets of *all* documents that match a query.  When a
         new searcher is opened, its caches may be prepopulated or
         "autowarmed" using data from caches in the old searcher.
         autowarmCount is the number of items to prepopulate.  For
         LRUCache, the autowarmed items will be the most recently
         accessed items.

         Parameters:
           class - the SolrCache implementation
               (LRUCache or FastLRUCache)
           size - the maximum number of entries in the cache
           initialSize - the initial capacity (number of entries) of
               the cache.  (see java.util.HashMap)
           autowarmCount - the number of entries to prepopulate from
               an old cache.
      -->
    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>

    <!-- Query Result Cache

        Caches results of searches - ordered lists of document ids
        (DocList) based on a query, a sort, and the range of documents requested.
        Additional supported parameter by LRUCache:
           maxRamMB - the maximum amount of RAM (in MB) that this cache is allowed
                      to occupy
     -->
    <queryResultCache class="solr.LRUCache"
                     size="512"
                     initialSize="512"
                     autowarmCount="0"/>
   
    <!-- Document Cache

         Caches Lucene Document objects (the stored fields for each
         document).  Since Lucene internal document ids are transient,
         this cache will not be autowarmed.  
      -->
    <documentCache class="solr.LRUCache"
                   size="512"
                   initialSize="512"
                   autowarmCount="0"/>
    
    <!-- custom cache currently used by block join --> 
    <cache name="perSegFilter"
      class="solr.search.LRUCache"
      size="10"
      initialSize="0"
      autowarmCount="10"
      regenerator="solr.NoOpRegenerator" />

    <!-- Field Value Cache
         
         Cache used to hold field values that are quickly accessible
         by document id.  The fieldValueCache is created by default
         even if not configured here.
      -->
    <!--
       <fieldValueCache class="solr.FastLRUCache"
                        size="512"
                        autowarmCount="128"
                        showItems="32" />
      -->

    <!-- Custom Cache

         Example of a generic cache.  These caches may be accessed by
         name through SolrIndexSearcher.getCache(),cacheLookup(), and
         cacheInsert().  The purpose is to enable easy caching of
         user/application level data.  The regenerator argument should
         be specified as an implementation of solr.CacheRegenerator 
         if autowarming is desired.  
      -->
    <!--
       <cache name="myUserCache"
              class="solr.LRUCache"
              size="4096"
              initialSize="1024"
              autowarmCount="1024"
              regenerator="com.mycompany.MyRegenerator"
              />
      -->


    <!-- Lazy Field Loading

         If true, stored fields that are not requested will be loaded
         lazily.  This can result in a significant speed improvement
         if the usual case is to not load all stored fields,
         especially if the skipped fields are large compressed text
         fields.
    -->
    <enableLazyFieldLoading>true</enableLazyFieldLoading>
    
    <!-- Use DocValue Storage To Get Field Information

         If true, Solr will use docValues to retrieve field values, which
         can result in a significant speed improvement. Otherwise, it will
         fall back to the original method. An optimization for Solr
         searching.
    -->    
    <useDocValueGetField>true</useDocValueGetField>
    
    <!-- Sort DocId Before Getting DocList

         If true, solr will sort DocId before getting the document list. 
         Or else, it will run according to the original method.
         SR from C70B005, optimization with solr searching.
    -->        
    <sortDocIdBeforeGetDoc>true</sortDocIdBeforeGetDoc>

    <!-- Ensure that results are consistent in shared filesystem mode.

         If true, query results will be consistent whether the request
         is sent to the leader or to another replica.
         If false, results may differ between queries, because the data on a
         non-leader replica may be incomplete, but this mode has the best performance.
    -->        
    <strictQueryOnShareFS>true</strictQueryOnShareFS>

   <!-- Use Filter For Sorted Query

        A possible optimization that attempts to use a filter to
        satisfy a search.  If the requested sort does not include
        score, then the filterCache will be checked for a filter
        matching the query. If found, the filter will be used as the
        source of document ids, and then the sort will be applied to
        that.

        For most situations, this will not be useful unless you
        frequently get the same search repeatedly with different sort
        options, and none of them ever use "score"
     -->
   <!--
      <useFilterForSortedQuery>true</useFilterForSortedQuery>
     -->

   <!-- Result Window Size

        An optimization for use with the queryResultCache.  When a search
        is requested, a superset of the requested number of document ids
        are collected.  For example, if a search for a particular query
        requests matching documents 10 through 19, and queryResultWindowSize is 50,
        then documents 0 through 49 will be collected and cached.  Any further
        requests in that range can be satisfied via the cache.  
     -->
   <queryResultWindowSize>20</queryResultWindowSize>

   <!-- Maximum number of documents to cache for any entry in the
        queryResultCache. 
     -->
   <queryResultMaxDocsCached>200</queryResultMaxDocsCached>

   <!-- Query Related Event Listeners

        Various IndexSearcher related events can trigger Listeners to
        take actions.

        newSearcher - fired whenever a new searcher is being prepared
        and there is a current searcher handling requests (aka
        registered).  It can be used to prime certain caches to
        prevent long request times for certain requests.

        firstSearcher - fired whenever a new searcher is being
        prepared but there is no current registered searcher to handle
        requests or to gain autowarming data from.

        
     -->
    <!-- QuerySenderListener takes an array of NamedList and executes a
         local query request for each NamedList in sequence. 
      -->
    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <!--
           <lst><str name="q">solr</str><str name="sort">price asc</str></lst>
           <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst>
          -->
      </arr>
    </listener>
    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">static firstSearcher warming in solrconfig.xml</str>
        </lst>
      </arr>
    </listener>

    <!-- Use Cold Searcher

         If a search request comes in and there is no current
         registered searcher, then immediately register the still
         warming searcher and use it.  If "false" then all requests
         will block until the first searcher is done warming.
      -->
    <useColdSearcher>false</useColdSearcher>

    <!-- Max Warming Searchers
         
         Maximum number of searchers that may be warming in the
         background concurrently.  An error is returned if this limit
         is exceeded.

         Recommend values of 1-2 for read-only slaves, higher for
         masters w/o cache warming.
      -->
    <maxWarmingSearchers>4</maxWarmingSearchers>

  </query>

maxBooleanClauses

slowQueryThresholdMillis

filterCache

queryResultCache

documentCache

cache

fieldValueCache

enableLazyFieldLoading

useDocValueGetField

sortDocIdBeforeGetDoc

useFilterForSortedQuery

queryResultWindowSize

queryResultMaxDocsCached

listener event="newSearcher" class="solr.QuerySenderListener"

listener event="firstSearcher" class="solr.QuerySenderListener"

useColdSearcher
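Applies when a search request arrives and Solr has no registered searcher yet.
If the value is false, Solr blocks until the warming searcher has finished running its warming queries against the index.
In track-and-field terms, false means the starter waits until the athletes have fully warmed up before giving the signal, however long that takes.

If the value is true, Solr immediately makes the still-warming searcher active, no matter how far along its warming is.
In the same analogy, true means the race starts right away, whether or not the athletes have finished warming up.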
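
As a rough sketch of how this setting pairs with warming queries, the fragment below defines a firstSearcher listener and leaves useColdSearcher at false, so the first requests wait until the listed query has run. The `price` field is taken from the commented-out example in the configuration above; the `*:*` query is only an assumption about what is worth warming.

```xml
<query>
  <!-- Runs once when the very first searcher is opened -->
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <!-- assumed warming query: sort on a field that real traffic sorts on -->
      <lst><str name="q">*:*</str><str name="sort">price asc</str></lst>
    </arr>
  </listener>

  <!-- false: block requests until the warming query above has finished -->
  <useColdSearcher>false</useColdSearcher>
</query>
```

Setting the value to true trades slower first requests for immediate availability, which may be preferable when warming takes long.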

maxWarmingSearchers
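This element lets you cap the number of searchers that may be warming concurrently in the background.
Once the limit is reached, new commit requests fail.
This is a useful safeguard, because too many searchers warming concurrently in the background can quickly exhaust the server's memory and CPU.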
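
A hedged sketch of a conservative setup: a low cap on warming searchers combined with a bounded autowarmCount, so that each new searcher warms quickly and overlapping warm-ups stay rare. The numbers are illustrative assumptions, not tuned recommendations.

```xml
<query>
  <!-- A bounded autowarmCount keeps the warm-up of each new searcher short -->
  <filterCache class="solr.FastLRUCache"
               size="512"
               initialSize="512"
               autowarmCount="64"/>

  <!-- At most two searchers may warm in the background at the same time -->
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
```

If the limit is hit regularly, commits are probably arriving faster than a searcher can warm, and committing less often is usually a better fix than raising the cap.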
