0. Environment overview
1. Preparation
2. set-SSH
3. set-hadoop
4. run-hadoop
[root@centos1 hadoop]# bin/hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
12/07/31 02:13:49 INFO namenode.NameNode: STARTUP_MSG:
12/07/31 02:13:49 WARN common.Util: Path /usr/local/hadoop/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/07/31 02:13:49 WARN common.Util: Path /usr/local/hadoop/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/07/31 02:13:49 INFO namenode.FSNamesystem: defaultReplication = 3
12/07/31 02:13:49 INFO namenode.FSNamesystem: maxReplication = 512
12/07/31 02:13:49 INFO namenode.FSNamesystem: minReplication = 1
12/07/31 02:13:49 INFO namenode.FSNamesystem: maxReplicationStreams = 2
12/07/31 02:13:49 INFO namenode.FSNamesystem: shouldCheckForEnoughRack
12/07/31 02:13:49 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapp
12/07/31 02:13:49 INFO namenode.FSNamesystem: fsOwner=root
12/07/31 02:13:49 INFO namenode.FSNamesystem: supergroup=supergroup
12/07/31 02:13:49 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/07/31 02:13:49 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/07/31 02:13:50 INFO common.Storage: Image file of size 110 saved in 0 seconds.
12/07/31 02:13:50 INFO common.Storage: Storage directory /usr/local/hadoop/hdfs/name has been successfully formatted.
12/07/31 02:13:50 INFO namenode.NameNode: SHUTDOWN_MSG:
[root@centos1 hadoop-0.21.0]# bin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-mapred.sh
starting namenode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-namenode-centos1.out
192.168.77.90: starting datanode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-datanode-centos2.out
192.168.77.92: starting datanode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-datanode-centos4.out
192.168.77.91: starting datanode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-datanode-centos3.out
The authenticity of host '192.168.77.89 (192.168.77.89)' can't be established.
RSA key fingerprint is 2b:e9:15:76:32:35:6b:d5:c4:29:2c:40:6f:5b:30:25.
Are you sure you want to continue connecting (yes/no)? yes
192.168.77.89: Warning: Permanently added '192.168.77.89' (RSA) to the list of known hosts.
192.168.77.89: starting secondarynamenode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-secondarynamenode-centos1.out
starting jobtracker, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-jobtracker-centos1.out
192.168.77.90: starting tasktracker, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-tasktracker-centos2.out
192.168.77.92: starting tasktracker, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-tasktracker-centos4.out
192.168.77.91: starting tasktracker, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-tasktracker-centos3.out
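The deprecation notice above names start-dfs.sh and start-mapred.sh as the replacements for start-all.sh. A minimal sketch of a wrapper that calls the two in that order follows. The function name start_cluster, the DRY_RUN switch, and the default HADOOP_HOME (taken from the log paths in this transcript) are assumptions for illustration, not part of Hadoop.

```shell
# start_cluster: start HDFS first, then MapReduce, using the two scripts
# that the deprecation notice recommends in place of start-all.sh.
# HADOOP_HOME is an assumption; the log paths above use /usr/hadoop-0.21.0.
# Set DRY_RUN=1 to print the commands instead of running them.
start_cluster() {
    hadoop_home=${HADOOP_HOME:-/usr/hadoop-0.21.0}
    for script in start-dfs.sh start-mapred.sh; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$hadoop_home/bin/$script"
        else
            # Stop at the first script that fails.
            "$hadoop_home/bin/$script" || return 1
        fi
    done
}
```

Running it once with DRY_RUN=1 shows which scripts would be called before anything is started on the cluster.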
[root@centos1 hadoop]# jps
1869 NameNode
2058 SecondaryNameNode
2154 JobTracker
2258 Jps
[root@centos2 hadoop]# jps
1806 DataNode
1933 Jps
1892 TaskTracker
centos3 and centos4 show the same jps processes as centos2.
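Checking jps by eye on every node gets tedious. The sketch below reads jps output on stdin and reports any expected daemon that is missing. The helper name check_daemons is hypothetical; the daemon names are the ones shown in the jps listings above.

```shell
# check_daemons: read `jps` output on stdin and verify that every daemon
# named in the arguments appears in the second column.
# Prints each missing daemon and returns non-zero if any is absent.
check_daemons() {
    jps_output=$(cat)
    missing=0
    for daemon in "$@"; do
        # jps prints "<pid> <name>"; compare the name column exactly so that
        # "NameNode" does not match "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}
```

On centos1 this could be used as `jps | check_daemons NameNode SecondaryNameNode JobTracker`, and for a worker as `ssh centos2 jps | check_daemons DataNode TaskTracker`, assuming jps is on the PATH of the non-interactive SSH session.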
At this point a browser on the physical host can already show the overall state of the cluster:
centos1 Hadoop Map/Reduce Administration:
http://192.168.77.89:50030/jobtracker.jsp
OR: http://centos1:50030/jobtracker.jsp
NameNode 'centos1:9000':
http://192.168.77.89:50070/dfshealth.jsp
OR: http://centos1:50070/dfshealth.jsp
List the files in the cluster:
[root@centos1 hadoop]# bin/hadoop fs -ls
12/08/01 10:08:56 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapp
12/08/01 10:08:56 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
Prepare the input files:
[root@centos1 hadoop]# mkdir in
[root@centos1 hadoop]# cp conf/*xml in/
[root@centos1 hadoop]# ls in/
capacity-scheduler.xml
core-site.xml
Upload the files:
[root@centos1 hadoop]# bin/hadoop fs -put in in
12/08/01 10:10:50 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapp
12/08/01 10:10:50 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
[root@centos1 hadoop]# bin/hadoop fs -ls
12/08/01 10:10:57 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapp
12/08/01 10:10:57 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
Found 1 items
drwxr-xr-x
Count the frequency of each word in the files:
[root@centos1 hadoop]# bin/hadoop jar hadoop-mapred-examples-0.21.0.jar wordcount in out
12/08/01 10:11:41 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapp
12/08/01 10:11:41 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
12/08/01 10:11:41 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/08/01 10:11:42 INFO input.FileInputFormat: Total input paths to process : 7
12/08/01 10:11:42 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
12/08/01 10:11:42 INFO mapreduce.JobSubmitter: number of splits:7
12/08/01 10:11:42 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
12/08/01 10:11:42 INFO mapreduce.Job: Running job: job_201208010835_0001
12/08/01 10:11:43 INFO mapreduce.Job:
12/08/01 10:11:53 INFO mapreduce.Job:
12/08/01 10:11:54 INFO mapreduce.Job:
12/08/01 10:11:59 INFO mapreduce.Job:
12/08/01 10:12:05 INFO mapreduce.Job:
12/08/01 10:12:07 INFO mapreduce.Job: Job complete: job_201208010835_0001
12/08/01 10:12:07 INFO mapreduce.Job: Counters: 33
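To sanity-check the job output, the same count can be approximated locally with standard tools. This is a minimal sketch, not the example jar's implementation: it assumes tokens are split on whitespace only (consistent with tokens such as `"License");` in the job output) and that keys are sorted in byte order. The function name wordcount_local is hypothetical.

```shell
# wordcount_local: count whitespace-separated tokens on stdin and print
# "<word><TAB><count>" lines sorted by word in byte order.
wordcount_local() {
    # Put one token per line, count each distinct token, then sort.
    tr -s '[:space:]' '\n' |
        awk 'NF { count[$0]++ }
             END { for (w in count) printf "%s\t%d\n", w, count[w] }' |
        LC_ALL=C sort
}
```

For example, `cat in/*.xml | wordcount_local | head` gives a local listing to compare against the first lines of the cluster result.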
View the results:
[root@centos1 hadoop]# bin/hadoop fs -ls
12/08/01 10:13:07 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapp
12/08/01 10:13:07 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
Found 2 items
drwxr-xr-x
drwxr-xr-x
[root@centos1 hadoop]# bin/hadoop fs -cat out/*
12/08/01 10:13:16 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapp
12/08/01 10:13:16 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
"*"
"AS
"License");
"alice,bob
'*',
':'
'aclsEnabled'
The rest of the output is omitted here.
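The full listing is long, so it can help to look only at the most frequent words. The sketch below assumes each result line has the form word, TAB, count; the function name top_words is hypothetical.

```shell
# top_words N: read "<word><TAB><count>" lines on stdin and print the N
# most frequent words, highest count first (ties broken by word).
# N defaults to 10.
top_words() {
    LC_ALL=C sort -t "$(printf '\t')" -k2,2nr -k1,1 | head -n "${1:-10}"
}
```

A possible use is `bin/hadoop fs -cat 'out/*' | top_words 5`. If the INFO and WARN log lines end up on stdout in your setup, filter them out before piping.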
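This article walks through configuring virtual machines with Xen on CentOS and building a fully distributed Hadoop environment, covering environment preparation, SSH setup, Hadoop configuration, and the run procedure, and verifies the cluster with the WordCount example.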