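SolrCloud/ZooKeeper Optimization

This article presents optimization measures for SolrCloud, covering CPU clock speed selection, ZooKeeper configuration tuning, and Solr parameter adjustment. For ZooKeeper, it stresses avoiding inconsistent server lists, placing the transaction log correctly, and sizing the Java heap sensibly; for Solr, it recommends tuning the maxBufferedDocs and mergeFactor parameters to optimize indexing, and explains how to use the Optimize feature judiciously.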


SolrCloud optimization:

 

1: CPU clock speed

2: ZooKeeper tuning items. Reference: http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html

Things to Avoid

Here are some common problems you can avoid by configuring ZooKeeper correctly:

inconsistent lists of servers

The list of ZooKeeper servers used by the clients must match the list of ZooKeeper servers that each ZooKeeper server has. Things work okay if the client list is a subset of the real list, but things will really act strange if clients have a list of ZooKeeper servers that are in different ZooKeeper clusters. Also, the server lists in each ZooKeeper server configuration file should be consistent with one another.
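The consistency rule above can be checked mechanically. The following is a minimal sketch, assuming each server's `zoo.cfg` uses the usual `server.N=host:port:port` lines and that clients use a comma-separated connect string with an optional chroot suffix; the function names and the host names in the usage note are hypothetical.

```python
import re

# Matches ensemble entries such as "server.1=zk1:2888:3888"
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):(\d+):(\d+)")

def ensemble_hosts(zoo_cfg_text):
    """Return the set of host names found in server.N lines of a zoo.cfg."""
    hosts = set()
    for line in zoo_cfg_text.splitlines():
        m = SERVER_LINE.match(line.strip())
        if m:
            hosts.add(m.group(2))
    return hosts

def client_hosts(connect_string):
    """Return host names from a connect string like 'zk1:2181,zk2:2181/solr'."""
    servers = connect_string.split("/", 1)[0]  # drop the optional chroot path
    return {part.rsplit(":", 1)[0] for part in servers.split(",") if part}

def check_lists(server_cfgs, connect_string):
    """server_cfgs maps a label to that server's zoo.cfg text.

    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []
    lists = {name: ensemble_hosts(text) for name, text in server_cfgs.items()}
    if not lists:
        return problems
    reference = next(iter(lists.values()))
    for name, hosts in lists.items():
        if hosts != reference:
            problems.append(f"{name}: server list differs from the others")
    # A client list that is a subset of the ensemble is acceptable.
    extra = client_hosts(connect_string) - reference
    if extra:
        problems.append(f"client lists unknown servers: {sorted(extra)}")
    return problems
```

For example, `check_lists({"zk1": cfg1, "zk2": cfg2}, "zk1:2181,zk2:2181/solr")` returns an empty list when both configuration files name the same ensemble and the client only refers to members of it.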

incorrect placement of transaction log

The most performance critical part of ZooKeeper is the transaction log. ZooKeeper syncs transactions to media before it returns a response. A dedicated transaction log device is key to consistent good performance. Putting the log on a busy device will adversely affect performance. If you only have one storage device, put trace files on NFS and increase the snapshotCount; it doesn't eliminate the problem, but it should mitigate it.
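In `zoo.cfg`, the transaction log location is controlled by `dataLogDir`, while snapshots go to `dataDir`. The sketch below flags configurations where the two share a device. It assumes a plain `key=value` file and uses `os.stat(...).st_dev` as the notion of "device", which is only an approximation: two partitions on one physical disk have different `st_dev` values but still compete for I/O. The function names are hypothetical.

```python
import os

def parse_zoo_cfg(text):
    """Parse key=value lines of a zoo.cfg, skipping comments and blank lines."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg

def log_placement_warnings(cfg, device_of=lambda p: os.stat(p).st_dev):
    """Return warnings about where the transaction log is placed.

    device_of is injectable so the check can be exercised without real paths.
    """
    warnings = []
    data_dir = cfg.get("dataDir")
    log_dir = cfg.get("dataLogDir")
    if log_dir is None:
        warnings.append(
            "dataLogDir is not set; transaction logs share dataDir with snapshots"
        )
    elif data_dir is not None and device_of(log_dir) == device_of(data_dir):
        warnings.append("dataLogDir and dataDir are on the same device")
    return warnings
```

Running `log_placement_warnings(parse_zoo_cfg(open(path).read()))` on a ZooKeeper host, where `path` points at that host's configuration file, gives a quick first indication; it does not measure how busy the device actually is.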

incorrect Java heap size

You should take special care to set your Java max heap size correctly. In particular, you should not create a situation in which ZooKeeper swaps to disk. The disk is death to ZooKeeper. Everything is ordered, so if processing one request swaps the disk, all other queued requests will probably do the same. DON'T SWAP.

Be conservative in your estimates: if you have 4G of RAM, do not set the Java max heap size to 6G or even 4G. For example, it is more likely you would use a 3G heap for a 4G machine, as the operating system and the cache also need memory. The best and only recommended practice for estimating the heap size your system needs is to run load tests, and then make sure you are well below the usage limit that would cause the system to swap.
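As a rough starting point before load testing, the "3G heap on a 4G machine" example can be turned into a small calculation. This is only a sketch: the 25% reserve fraction and the 1024 MB minimum reserve are assumptions chosen to reproduce that one example, not values from the ZooKeeper documentation, and load tests remain the way to find the real limit.

```python
def max_heap_mb(total_ram_mb, reserve_fraction=0.25, min_reserve_mb=1024):
    """Suggest an upper bound for the JVM max heap, in megabytes.

    A share of RAM is held back for the operating system and page cache.
    Both reserve parameters are illustrative assumptions.
    """
    reserve = max(int(total_ram_mb * reserve_fraction), min_reserve_mb)
    return max(total_ram_mb - reserve, 0)

def jvm_flag(total_ram_mb):
    """Format the suggestion as a -Xmx JVM option."""
    return f"-Xmx{max_heap_mb(total_ram_mb)}m"

print(jvm_flag(4096))  # prints "-Xmx3072m", matching the 3G-on-4G example
```

The result is a ceiling to test against, not a target: a heap well below it is fine if load tests show ZooKeeper does not need more.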

 

 

3: Every maxBufferedDocs buffered documents are flushed as one segment, and every mergeFactor segments are merged into a single index file. Tune maxBufferedDocs and mergeFactor appropriately to optimize indexing.
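To see how the two parameters in item 3 interact, the following sketch simulates a deliberately simplified model: a segment is flushed after every `max_buffered_docs` documents, and whenever the newest `merge_factor` segments have the same size they are merged into one. Real Lucene merge policies are more involved, so the numbers only illustrate the trade-off between segment count and merge work. The function name is hypothetical.

```python
def simulate_indexing(num_docs, max_buffered_docs, merge_factor):
    """Return (segment sizes in documents, number of merges) for a toy model."""
    if max_buffered_docs < 1 or merge_factor < 2:
        raise ValueError("need max_buffered_docs >= 1 and merge_factor >= 2")
    segments = []  # segment sizes, oldest first
    merges = 0
    buffered = 0
    for _ in range(num_docs):
        buffered += 1
        if buffered == max_buffered_docs:
            segments.append(buffered)  # flush the buffer as a new segment
            buffered = 0
            # Merge while the newest merge_factor segments are the same size.
            while (len(segments) >= merge_factor
                   and len(set(segments[-merge_factor:])) == 1):
                merged = sum(segments[-merge_factor:])
                del segments[-merge_factor:]
                segments.append(merged)
                merges += 1
    if buffered:
        segments.append(buffered)  # documents still in the buffer at the end
    return segments, merges

print(simulate_indexing(1000, 10, 10))   # ([1000], 11)
print(simulate_indexing(1000, 100, 10))  # ([1000], 1)
```

In this model a larger `max_buffered_docs` means fewer flushes and merges at the cost of more memory for the buffer, while a larger `merge_factor` delays merging and leaves more segments for searches to visit.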

 

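4: Clicking the Optimize button in the Solr admin UI merges the index files into a single index file. Optimize is a highly I/O-intensive task, and if the Solr data is updated frequently, the optimized index will not stay optimized for long before Optimize has to be run again.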
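The same operation can be triggered over HTTP through the update handler's `optimize=true` parameter, which makes it easier to schedule during quiet periods instead of clicking the button. The sketch below builds such a request; the base URL and the collection name `collection1` are placeholders, and the request is only sent if `run_optimize` is called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def optimize_url(base_url, collection, max_segments=1, wait_searcher=True):
    """Build an update-handler URL that asks Solr to optimize a collection."""
    params = urlencode({
        "optimize": "true",
        "maxSegments": max_segments,  # target number of segments after merging
        "waitSearcher": "true" if wait_searcher else "false",
    })
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"

def run_optimize(base_url, collection, **kwargs):
    """Send the optimize request and return the HTTP status code."""
    with urlopen(optimize_url(base_url, collection, **kwargs)) as resp:
        return resp.status

print(optimize_url("http://localhost:8983/solr", "collection1"))
```

Because the merge is I/O heavy, a call such as `run_optimize("http://localhost:8983/solr", "collection1")` is best reserved for indexes that change rarely.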

 

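5: Reference: http://www.solr.cc/blog/?p=788

1. Update frequency: how large is the daily data increment, and is data updated continuously or on a schedule?
2. Total data volume: how long must the data be retained?
3. Consistency requirements: how soon should updated data become visible, and what is the maximum acceptable delay?
4. Data characteristics: what are the data sources, and what is the average size of a single record?
5. Business characteristics: what sorting requirements and search conditions are there?
6. Resource reuse: what is the existing hardware configuration, and is an upgrade planned?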
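The answers to questions 1, 2, and 4 combine into a back-of-the-envelope figure for the amount of source data the cluster has to hold. The sketch below does only that arithmetic; it estimates raw source volume, not on-disk index size, which depends on the schema and on which fields are stored. The function name and the sample figures are hypothetical.

```python
def estimate_raw_volume(docs_per_day, retention_days, avg_record_bytes):
    """Return (total documents, raw data volume in GiB) over the retention period."""
    total_docs = docs_per_day * retention_days
    total_gib = total_docs * avg_record_bytes / 1024 ** 3
    return total_docs, total_gib

# Hypothetical workload: 1M records/day, kept for a year, 2 KiB each.
docs, gib = estimate_raw_volume(1_000_000, 365, 2048)
print(docs, round(gib, 1))
```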

 
