Notes:
Hadoop version: 0.21.0
One namenode and two datanodes
References:
http://www.360doc.com/content/10/0727/10/2159920_41738746.shtml
http://yymmiinngg.iteye.com/blog/706699
http://wenku.baidu.com/view/b3a1f5d2240c844769eaeec3.html
I. Change the hostnames of the namenode and the two datanodes
1. Edit /etc/hosts:
127.0.0.1 localhost.localdomain localhost
192.168.118.129 hadoop-namenode.localdomain hadoop-namenode
192.168.118.139 hadoop-datanode1.localdomain hadoop-datanode1
192.168.118.149 hadoop-datanode2.localdomain hadoop-datanode2
2. Edit /etc/sysconfig/network (on each node, set HOSTNAME to that node's own name):
HOSTNAME=hadoop-namenode
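To make the new hostname take effect without a reboot, it can also be set directly (shown for the namenode; use the corresponding name on each datanode):
# hostname hadoop-namenode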
II. Add a hadoop user on the namenode and both datanodes
#useradd hadoop
#passwd hadoop
III. Configure passwordless SSH login
1. Configure hadoop-namenode so that the hadoop user can log in to both datanodes without a password
On hadoop-namenode:
#su hadoop
$ssh-keygen -t rsa
$cp /home/hadoop/.ssh/id_rsa.pub /home/hadoop/.ssh/authorized_keys
Copy the id_rsa.pub generated on hadoop-namenode to both datanodes (enter the hadoop user's password when prompted):
$scp /home/hadoop/.ssh/id_rsa.pub hadoop@192.168.118.139:/home/hadoop/authorized_keys
$scp /home/hadoop/.ssh/id_rsa.pub hadoop@192.168.118.149:/home/hadoop/authorized_keys
Then, on each datanode:
If /home/hadoop/.ssh does not already contain an authorized_keys file, copy the uploaded file into /home/hadoop/.ssh:
$cp /home/hadoop/authorized_keys /home/hadoop/.ssh
If /home/hadoop/.ssh already contains authorized_keys, append the uploaded file to the existing authorized_keys:
$cat /home/hadoop/authorized_keys >> /home/hadoop/.ssh/authorized_keys
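If passwordless login still prompts for a password later on, file permissions are a common cause: OpenSSH normally refuses keys when .ssh or authorized_keys is group- or world-writable, so it may help to tighten them:
$chmod 700 /home/hadoop/.ssh
$chmod 600 /home/hadoop/.ssh/authorized_keys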
Configure sshd on both datanodes: open /etc/ssh/sshd_config and set the following:
RSAAuthentication yes
PubkeyAuthentication yes
UsePAM no
Restart the sshd service on both datanodes:
# service sshd restart
On hadoop-namenode, test that you can log in to both datanodes without a password:
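For example, as the hadoop user on hadoop-namenode (using the hostnames defined in /etc/hosts above):
$ssh hadoop-datanode1
$ssh hadoop-datanode2
Each command should open a shell without a password prompt; type exit to come back.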
2. Configure both datanodes so that the hadoop user can log in to hadoop-namenode without a password
On each datanode:
#su hadoop
$ssh-keygen -t rsa
Copy the id_rsa.pub generated on each datanode to hadoop-namenode:
On hadoop-datanode1:
$scp /home/hadoop/.ssh/id_rsa.pub hadoop@192.168.118.129:/home/hadoop/.ssh/192.168.118.139.authorized_keys
On hadoop-datanode2:
$scp /home/hadoop/.ssh/id_rsa.pub hadoop@192.168.118.129:/home/hadoop/.ssh/192.168.118.149.authorized_keys
On hadoop-namenode, append the keys uploaded from the two datanodes to the existing authorized_keys:
$cat /home/hadoop/.ssh/192.168.118.139.authorized_keys >> /home/hadoop/.ssh/authorized_keys
$cat /home/hadoop/.ssh/192.168.118.149.authorized_keys >> /home/hadoop/.ssh/authorized_keys
Configure sshd on hadoop-namenode: open /etc/ssh/sshd_config and set the following:
RSAAuthentication yes
PubkeyAuthentication yes
UsePAM no
Restart the sshd service on hadoop-namenode:
# service sshd restart
On each datanode, test that you can log in to hadoop-namenode without a password:
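For example, as the hadoop user on each datanode:
$ssh hadoop-namenode
This should log in without a password prompt.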
IV. Configure Hadoop
1. On the namenode and both datanodes, create the directory /usr/local/hadoop and extract the downloaded hadoop-0.21.0.tar.gz into it.
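A minimal sketch of this step, run as root and assuming the tarball was downloaded to /root (adjust the path as needed):
# mkdir -p /usr/local/hadoop
# tar -zxvf /root/hadoop-0.21.0.tar.gz -C /usr/local/hadoop
This leaves the distribution in /usr/local/hadoop/hadoop-0.21.0, and the files edited below live in its conf directory.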
2. Configure core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.118.129:9001</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
</configuration>
3. Configure hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
4. Configure mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>192.168.118.129:8021</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.system.dir</name>
    <value>/usr/local/hadoop/mapred/system</value>
  </property>
  <property>
    <name>mapreduce.cluster.local.dir</name>
    <value>/usr/local/hadoop/mapred/local</value>
  </property>
</configuration>
5. Configure masters:
192.168.118.129
6. Configure slaves:
192.168.118.139
192.168.118.149
7. Copy the edited configuration files (from the conf directory on hadoop-namenode) to the conf directory on both datanodes with scp:
$ scp core-site.xml hdfs-site.xml mapred-site.xml masters slaves hadoop@192.168.118.139:/usr/local/hadoop/hadoop-0.21.0/conf
$ scp core-site.xml hdfs-site.xml mapred-site.xml masters slaves hadoop@192.168.118.149:/usr/local/hadoop/hadoop-0.21.0/conf
8. Set ownership of /usr/local/hadoop on the namenode and both datanodes:
# chown -R hadoop:hadoop /usr/local/hadoop
9. Start Hadoop
On the namenode, switch to the hadoop user:
# su hadoop
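The commands below assume the Hadoop bin directory is on the PATH; if it is not, either run them from /usr/local/hadoop/hadoop-0.21.0/bin or export the path first:
$ export PATH=/usr/local/hadoop/hadoop-0.21.0/bin:$PATH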
Format the HDFS namenode:
$ hdfs namenode -format
Start all the daemons:
$ start-all.sh
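To check that the daemons started, jps (from the JDK) can be run on each node; with this layout you would expect NameNode, SecondaryNameNode and JobTracker on the namenode, and DataNode and TaskTracker on each datanode:
$ jps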
V. Run the example program wordcount
On the namenode:
$ cd /home/hadoop
$ mkdir workspace
$ cd workspace
$ mkdir wordcount
$ cd wordcount
$ mkdir input
$ echo "Hello world Bye world">input/f1
$ echo "hello hadoop bye hadoop">input/f2
Create an input directory on HDFS:
$ hadoop fs -mkdir /tmp/input
Copy f1 and f2 to the input directory on HDFS:
$ hadoop fs -put /home/hadoop/workspace/wordcount/input/f1 /tmp/input
$ hadoop fs -put /home/hadoop/workspace/wordcount/input/f2 /tmp/input
Check that f1 and f2 are on HDFS:
$ hadoop fs -ls /tmp/input
Run wordcount (make sure the /output directory does not already exist on HDFS):
$ cd /usr/local/hadoop/hadoop-0.21.0
$ hadoop jar hadoop-mapred-examples-0.21.0.jar wordcount /tmp/input /output
View the results:
$ hadoop fs -cat /output/part-r-00000
Bye 1
Hello 1
bye 1
hadoop 2
hello 1
world 2