Preface
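Production clusters often run into DataNodes going down. By default, once a DataNode has been unresponsive for more than about 10 minutes, the NameNode marks it as a dead node and starts re-replicating its blocks. Watching the HDFS web UI, we found that this replication was very slow.
A second scenario: when we raise the replication factor of important data, for example with ./bin/hadoop fs -setrep -R -w 3 /home/main/, replication is again slow even though the cluster is not heavily loaded. This post describes how to speed up replication.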
Increasing replication speed
The rate of replication work is throttled by HDFS to not interfere with cluster traffic when failures happen during regular cluster load.
Some properties controlling this are dfs.namenode.replication.work.multiplier.per.iteration, dfs.namenode.replication.max-streams and dfs.namenode.replication.max-streams-hard-limit. The first controls how much replication work is scheduled to DataNodes at each heartbeat, and the other two further limit the maximum number of parallel threaded network transfers a DataNode performs at a time. Some description of these is available at https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
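As a rough illustration of how the three properties named above could be raised, here is a sketch of an hdfs-site.xml fragment for the NameNode. The values 10, 20 and 40 are illustrative assumptions, not recommendations from the original post; the stock defaults in hdfs-default.xml are 2, 2 and 4 respectively.

```xml
<!-- hdfs-site.xml on the NameNode: illustrative values only (assumed, not from the original post) -->
<configuration>
  <!-- Replication work scheduled per iteration scales with this value
       times the number of live DataNodes. Default: 2 -->
  <property>
    <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
    <value>10</value>
  </property>

  <!-- Soft limit on concurrent replication streams per DataNode
       (highest-priority blocks may exceed it). Default: 2 -->
  <property>
    <name>dfs.namenode.replication.max-streams</name>
    <value>20</value>
  </property>

  <!-- Hard limit on concurrent replication streams per DataNode,
       applied to all priorities. Default: 4 -->
  <property>
    <name>dfs.namenode.replication.max-streams-hard-limit</name>
    <value>40</value>
  </property>
</configuration>
```

Higher values trade regular cluster traffic for faster re-replication, so it is probably safest to raise them gradually while watching network and disk load. The effective setting can be checked with hdfs getconf -confKey followed by the property name.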

This post examines slow replication in HDFS, analyzes the role of key configuration parameters such as dfs.namenode.replication.work.multiplier.per.iteration, and proposes raising replication speed by tuning them.





