daily job tips (Cassandra)

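This post discusses strategy choices, hardware selection, cluster maintenance, and data balancing for multi-data-center Cassandra deployments, with optimization suggestions for different scenarios. It focuses on keeping a cluster stable by choosing a suitable partitioner, picking appropriate hardware, running scheduled backup jobs, and maintaining the cluster's configuration files.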


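1. Questions to consider for multi-data-center deployment
Token selection
Partitioner selection
Data balancing within a single data center
Scaling out across multiple data centers
Problems caused by node failures
2. Cluster maintenance
System log aggregation (todo)
Scheduled backup jobs (todo)
Choosing an approach for maintaining cluster configuration files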

Reference (Cassandra wiki)
http://wiki.apache.org/cassandra/Operations#Replication

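Related vocabulary:
consult: to consider, to take into account
vital: critical; a matter of life and death
corollary: an inference, a natural consequence
disproportionately: out of proportion, unevenly
consequence: result, outcome
wipe: to rub or clean off; to erase

hypothetical: assumed, supposed; not yet verified
depiction: description; portrayal
rigid: stiff; inflexible; unbending
symposium: a conference; a meeting for presentations on a topic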
Once data is placed on the cluster, the partitioner may not be changed without wiping and starting over.

1. Cassandra hardware selection tips for cloud deployments
Cloud
Several heavy users of Cassandra deploy in the cloud, e.g. CloudKick on Rackspace Cloud Servers and SimpleGeo on Amazon EC2.
On EC2, the best practice is to use L or XL instances with local storage. I/O performance is proportionately much worse on S and M sizes, and EBS is a bad fit for several reasons (see Erik Onnen's excellent explanation). Put the Cassandra commitlog on the root volume, and the data directory on the ephemeral disks striped as RAID 0.

2. Note on doubling a Cassandra cluster when expanding across data centers
The corollary to this is, if you want to start with a single DC and add another later, when you add the second DC you should add as many nodes as you have in the first rather than adding a node or two at a time gradually.
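
A minimal sketch of why adding a full-sized second DC keeps things simple, assuming RandomPartitioner (token range 0 to 2**127) and the common convention of offsetting the second DC's tokens by 1 to avoid collisions. The function name `balanced_tokens` and the node counts are hypothetical, chosen only for illustration:

```python
RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token space


def balanced_tokens(node_count, offset=0):
    """Return evenly spaced initial tokens for the nodes of one data center."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return [(i * RING_SIZE // node_count + offset) % RING_SIZE
            for i in range(node_count)]


# First DC with 4 nodes; the second DC is added with the same node count,
# shifted by 1 so that no two nodes share a token.
dc1_tokens = balanced_tokens(4)
dc2_tokens = balanced_tokens(4, offset=1)

for name, tokens in (("DC1", dc1_tokens), ("DC2", dc2_tokens)):
    for token in tokens:
        print(name, token)
```

With equal node counts, each DC is balanced on its own and the two token sets interleave evenly. Adding only one or two nodes to the new DC would leave those few nodes owning a disproportionate share of that DC's replicas.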

3. On repair frequency
Frequency of nodetool repair
Unless your application performs no deletes, it is vital that production clusters run nodetool repair periodically on all nodes in the cluster. The hard requirement for repair frequency is the value used for GCGraceSeconds (see DistributedDeletes). Running nodetool repair often enough to guarantee that every node has been repaired within any period of length GCGraceSeconds ensures that deletes are not "forgotten" in the cluster.
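
The scheduling constraint above can be sketched as follows. This is only an illustration: the function name `staggered_repair_offsets`, the node names, and the safety factor are hypothetical, and the GCGraceSeconds value of 864000 (10 days) is an assumed setting that should be checked against your own column families.

```python
GC_GRACE_SECONDS = 864000  # assumption: 10 days; check your own settings


def staggered_repair_offsets(nodes, gc_grace_seconds, safety_factor=0.5):
    """Map each node to a start offset (seconds) inside one repair cycle.

    The cycle length is gc_grace_seconds * safety_factor, so every node is
    repaired once per cycle and the cycle stays shorter than gc_grace_seconds.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    if not 0 < safety_factor < 1:
        raise ValueError("safety_factor must be between 0 and 1")
    cycle = int(gc_grace_seconds * safety_factor)
    step = cycle // len(nodes)
    return {node: index * step for index, node in enumerate(nodes)}


offsets = staggered_repair_offsets(["n1", "n2", "n3", "n4"], GC_GRACE_SECONDS)
for node, offset in offsets.items():
    print(f"{node}: run nodetool repair {offset} s after the cycle starts")
```

A scheduler such as cron would then run nodetool repair on each node at its offset and repeat every cycle. Staggering the start times keeps the nodes from all repairing at once.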

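4. // Be clear that endpoint_snitch exists so that Cassandra understands the network topology well enough to route read requests efficiently.
// Distinguish this option from the partitioner (ByteOrderedPartitioner, OrderPreservingPartitioner, CollatingOrderPreservingPartitioner) and from the replication strategy (SimpleStrategy, OldNetworkTopologyStrategy, NetworkTopologyStrategy).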
# endpoint_snitch -- Set this to a class that implements
# IEndpointSnitch, which will let Cassandra know enough
# about your network topology to route requests efficiently.

RackUnawareStrategy (renamed SimpleStrategy in 0.7): replicas are always placed on the next (in increasing Token order) N-1 nodes along the ring
RackAwareStrategy (renamed OldNetworkTopologyStrategy in 0.7): replica 2 is placed in the first node along the ring that belongs in another data center than the first; the remaining N-2 replicas, if any, are placed on the first nodes along the ring in the same rack as the first
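
The two placement rules for RackUnawareStrategy and RackAwareStrategy can be sketched as follows. This follows only the wording quoted above, not Cassandra's source code; the `Node` tuple, the function names, and the sample ring are hypothetical.

```python
from collections import namedtuple

# Hypothetical node description used only for this illustration.
Node = namedtuple("Node", "name token dc rack")


def walk_ring(ring, start):
    """Yield the other nodes in increasing token order, wrapping around."""
    for step in range(1, len(ring)):
        yield ring[(start + step) % len(ring)]


def rack_unaware(ring, primary_index, n):
    """Primary node plus the next N-1 nodes along the ring."""
    return [ring[primary_index]] + list(walk_ring(ring, primary_index))[:n - 1]


def rack_aware(ring, primary_index, n):
    """Replica 2 in another data center, the rest in the primary's rack."""
    primary = ring[primary_index]
    replicas = [primary]
    if n < 2:
        return replicas
    for node in walk_ring(ring, primary_index):
        if node.dc != primary.dc:
            replicas.append(node)
            break
    for node in walk_ring(ring, primary_index):
        if len(replicas) >= n:
            break
        same_rack = node.dc == primary.dc and node.rack == primary.rack
        if same_rack and node not in replicas:
            replicas.append(node)
    # May return fewer than n replicas if the ring lacks matching nodes;
    # this sketch has no fallback for that case.
    return replicas


ring = sorted([
    Node("a", 0, "dc1", "r1"), Node("b", 10, "dc1", "r1"),
    Node("c", 20, "dc2", "r1"), Node("d", 30, "dc1", "r2"),
    Node("e", 40, "dc1", "r1"),
], key=lambda node: node.token)

print([node.name for node in rack_unaware(ring, 0, 3)])
print([node.name for node in rack_aware(ring, 0, 3)])
```

In this sample ring the rack-unaware rule simply takes the two successors of the primary, while the rack-aware rule first jumps to the only node in the other data center and then returns to the primary's rack.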

5. On choosing R for consistent reads and when read repair happens
So we'd like to make two changes:
only send read requests to the closest R live nodes
if read repair is enabled, also compare results from the other nodes in the background
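
A minimal sketch of the two changes listed above, under the assumption that some proximity measure (for example from the snitch) is available for each replica. The function `plan_read`, its parameters, and the sample distances are hypothetical and do not reflect Cassandra's internal API:

```python
def plan_read(replicas, is_alive, distance, r, read_repair=True):
    """Split replicas into foreground reads and background comparisons.

    foreground: the closest r live replicas, which serve the read request.
    background: the remaining live replicas, compared later for read repair.
    """
    live = sorted((node for node in replicas if is_alive(node)), key=distance)
    if len(live) < r:
        raise RuntimeError("not enough live replicas to satisfy R")
    foreground = live[:r]
    background = live[r:] if read_repair else []
    return foreground, background


# Hypothetical proximity scores: a lower value means a closer node.
distances = {"n1": 3, "n2": 1, "n3": 2}
foreground, background = plan_read(
    replicas=["n1", "n2", "n3"],
    is_alive=lambda node: True,
    distance=distances.get,
    r=1,
)
print("read from:", foreground)
print("compare in background:", background)
```

The client waits only for the foreground nodes, so latency depends on the closest R replicas, while consistency checks on the remaining replicas happen off the request path.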


From "ITPUB Blog", link: http://blog.itpub.net/23937368/viewspace-1054533/. If you repost this, please credit the source; otherwise legal action may be taken.
