Abstract:
The Hadoop Distributed File System (HDFS) is an open-source system currently being used in situations where massive amounts of data need to be processed. Based on experience with the largest deployment of HDFS, I provide an analysis of how the amount of RAM of a single namespace server correlates with the storage capacity of Hadoop clusters, outline the advantages of the single-node namespace server architecture for linear performance scaling, and establish practical limits of growth for this architecture. This study may be applicable to issues with other distributed file systems.

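Drawing on the largest deployment of the Hadoop Distributed File System (HDFS), this article analyzes how the RAM capacity of a single namespace server relates to the storage capacity of a Hadoop cluster, outlines the advantages of the single-node namespace server architecture for linear performance scaling, and establishes the practical limits of growth for this architecture.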




