Check cluster status:
hadoop dfsadmin -report
Show block information, including replica count, the nodes that hold each block, and health status
hadoop fsck /user/ak47/tmp_0.gz -files -racks -blocks
Check HDFS health:
hadoop fsck {path}
Example:
hadoop fsck /user
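To use the fsck result in a script, the summary can be checked for its health verdict. A minimal sketch, assuming the fsck summary contains a line ending in "is HEALTHY" (or "is CORRUPT"); `fsck_is_healthy` is a hypothetical helper, not a Hadoop command:

```shell
# Hypothetical helper: succeed if fsck output read from stdin reports a
# healthy filesystem. Assumes the summary contains "is HEALTHY"
# (a damaged filesystem is assumed to be reported as "is CORRUPT").
fsck_is_healthy() {
  grep -q 'is HEALTHY'
}

# Usage on a live cluster (not executed here):
#   hadoop fsck /user | fsck_is_healthy && echo "OK" || echo "needs attention"
```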
Use distcp to copy files between clusters (run on the destination machine)
hadoop distcp hdfs://{src_host_ip}:9000/log/src/FCACCESS/20110105/00 hdfs://{dist_hostname}:9000/log/src/FCACCESS/20110105/00
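The distcp command repeats the same path on both sides, which is easy to mistype. A small sketch that builds the command from its parts; `distcp_cmd` and the host names in the usage line are hypothetical, and port 9000 is simply taken from the command above:

```shell
# Hypothetical helper, not part of Hadoop: print a distcp command that
# copies the same HDFS path from a source NameNode to a destination NameNode.
# The port defaults to 9000 as in the command above; pass a 4th argument
# to override it.
distcp_cmd() {
  src_host=$1
  dst_host=$2
  path=$3
  port=${4:-9000}
  printf 'hadoop distcp hdfs://%s:%s%s hdfs://%s:%s%s\n' \
    "$src_host" "$port" "$path" "$dst_host" "$port" "$path"
}

# Example with made-up hosts; this only prints the command.
distcp_cmd 10.0.0.1 nn2.example.com /log/src/FCACCESS/20110105/00
```

Review the printed command first, then run it on the destination cluster.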
Manually start/stop services
hadoop-daemon.sh {start|stop} {jobtracker|namenode|datanode}
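The line above is shorthand for one action and one daemon per call, and hadoop-daemon.sh acts only on the machine where it is run. A sketch that expands the shorthand into concrete commands; `daemon_cmds` is a hypothetical wrapper and only prints what it would run:

```shell
# Hypothetical wrapper: expand "action daemon..." into one
# hadoop-daemon.sh command per daemon. It only prints the commands;
# pipe the output to sh to actually run them on this machine.
daemon_cmds() {
  action=$1
  shift
  case "$action" in
    start|stop) ;;
    *) echo "usage: daemon_cmds start|stop daemon..." >&2; return 1 ;;
  esac
  for daemon in "$@"; do
    echo "hadoop-daemon.sh $action $daemon"
  done
}

daemon_cmds stop datanode
daemon_cmds start namenode datanode
```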
Leave safe mode:
hadoop dfsadmin -safemode leave
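Before forcing the NameNode out of safe mode it can help to check the current state with `hadoop dfsadmin -safemode get`. A minimal sketch, assuming the status text contains "Safe mode is ON" or "Safe mode is OFF"; `needs_safemode_leave` is a hypothetical helper:

```shell
# Hypothetical helper: decide from the text printed by
# `hadoop dfsadmin -safemode get` whether safe mode is still on.
# Assumes the status contains "Safe mode is ON" or "Safe mode is OFF".
needs_safemode_leave() {
  case "$1" in
    *"Safe mode is ON"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a live cluster (not executed here):
#   status=$(hadoop dfsadmin -safemode get)
#   if needs_safemode_leave "$status"; then
#     hadoop dfsadmin -safemode leave
#   fi
```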
Quickly run a test MapReduce job
hadoop-0.20 fs -mkdir input
hadoop-0.20 fs -put /etc/hadoop-0.20/conf/*.xml input
hadoop-0.20 fs -ls input
hadoop-0.20 jar /usr/lib/hadoop-0.20/hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
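The grep example job counts the strings in the input files that match the regular expression 'dfs[a-z.]+'. To see what that pattern picks up, it can be tried locally on one line; the sample line is made up for illustration and `match_dfs` is a hypothetical helper (it relies on the widely available `grep -o` option):

```shell
# Local illustration only: apply the job's regex to text on stdin and
# print each match on its own line.
match_dfs() {
  grep -oE 'dfs[a-z.]+'
}

# Made-up sample line in the style of a Hadoop *.xml config file.
printf '<name>dfs.replication</name>\n' | match_dfs
# prints: dfs.replication
```

On the cluster, the job's results land in the `output` directory and can be read with `hadoop-0.20 fs -cat output/part-*`. The job refuses to start if `output` already exists, so remove it before rerunning.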
Smoke tests
Test HDFS performance
hadoop jar hadoop-test*.jar TestDFSIO -write -nrFiles 10 -fileSize 1000 # test write
hadoop jar hadoop-test*.jar TestDFSIO -read -nrFiles 100 -fileSize 100 # test read
hadoop jar hadoop-test*.jar TestDFSIO -clean # remove the generated data
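To size a TestDFSIO run against the free space on the cluster, multiply the file count by the file size. A quick sketch, assuming -fileSize is given in MB; `dfsio_total_mb` is a hypothetical helper:

```shell
# Hypothetical helper: total data volume of one TestDFSIO run in MB,
# assuming -fileSize is specified in MB.
dfsio_total_mb() {
  nr_files=$1
  file_size_mb=$2
  echo $((nr_files * file_size_mb))
}

dfsio_total_mb 10 1000    # write test above: prints 10000
dfsio_total_mb 100 100    # read test above:  prints 10000
```

Both runs above move about 10 GB in total, before replication is taken into account.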
Sort test
hadoop jar hadoop-examples-*.jar teragen 10000000 /user/hadoop/input_dir
hadoop jar hadoop-examples-*.jar terasort /user/hadoop/input_dir /user/hadoop/output_dir
# 4 machines with 32 GB RAM, 8 cores, 6 x 1 TB disks, 32 map slots, 16 reduce slots: took 37 s
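TeraGen writes 100-byte rows, so the row count fixes the data size. A sketch of the arithmetic, assuming the 37 s figure refers to sorting the 10000000 rows generated above; both function names are hypothetical:

```shell
# TeraGen rows are 100 bytes each, so size in bytes = rows * 100.
teragen_bytes() {
  echo $(( $1 * 100 ))
}

# Average throughput in whole MB/s (1 MB = 1000000 bytes, integer division).
throughput_mb_s() {
  bytes=$1
  seconds=$2
  echo $(( bytes / seconds / 1000000 ))
}

teragen_bytes 10000000                      # prints 1000000000 (about 1 GB)
throughput_mb_s "$(teragen_bytes 10000000)" 37   # prints 27
```

With a data set this small, job startup overhead probably dominates the 37 s, so the figure says little about sustained sort throughput.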
Compute the value of pi
hadoop jar hadoop-examples-*.jar pi 10 100
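The two arguments are the number of map tasks and the number of samples per map, so this run uses 10 x 100 = 1000 points. The idea can be sketched locally with awk. Note that this is a plain pseudo-random Monte Carlo estimate written for illustration, not the sampling method used inside the example jar, and `estimate_pi` is a hypothetical helper:

```shell
# Hypothetical helper, not part of Hadoop: estimate pi from random points
# in the unit square, using maps * samples points in total.
estimate_pi() {
  maps=$1
  samples=$2
  awk -v n="$((maps * samples))" 'BEGIN {
    srand(42)                             # fixed seed so reruns agree
    inside = 0
    for (i = 0; i < n; i++) {
      x = rand(); y = rand()
      if (x * x + y * y <= 1) inside++    # point lies inside the quarter circle
    }
    printf "%.4f\n", 4 * inside / n       # area ratio times 4 approximates pi
  }'
}

estimate_pi 10 100
```

With only 1000 points the estimate is rough; raising either argument tightens it at the cost of a longer run.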