I. Configure Environment Variables
1. Java environment
sudo vi /etc/profile
# set Java environment
export JAVA_HOME=/usr/local/jdk
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"
source /etc/profile
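A quick sanity check that the variables took effect (assuming the JDK really is unpacked at /usr/local/jdk):
echo $JAVA_HOME    # should print /usr/local/jdk
java -version      # should print the JDK version
which java         # should resolve to /usr/local/jdk/bin/java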
2. Hostname and name-resolution configuration (three or more machines)
sudo vi /etc/hosts
127.0.0.1 localhost
172.16.10.213 master
172.16.10.214 slave1
172.16.10.215 slave2
sudo vi /etc/sysconfig/network
HOSTNAME=master
Note: the above is for the 213 machine (master); on the other two machines set HOSTNAME to slave1 and slave2 respectively.
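To confirm the hostname and name resolution on each machine, a quick check from master:
hostname           # should print master (slave1/slave2 on the other machines)
ping -c 1 slave1   # both slaves should answer by name
ping -c 1 slave2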
II. Configure Passwordless SSH Login (SSH installation omitted)
1. Generate the key pair (keep pressing Enter to accept the defaults)
[dev@master ~]$ ssh-keygen -t rsa
2. Append the public key to authorized_keys (run on each slave once the key has been copied over in step 3)
[dev@slave1 ~]$ cd /home/dev/.ssh/
[dev@slave1 .ssh]$ cat id_rsa.pub >> authorized_keys
[dev@slave1 .ssh]$ chmod 600 authorized_keys
3. Log in to the two slaves, create the .ssh directory, and copy the public key over
[dev@slave1 ~]$ mkdir /home/dev/.ssh
[dev@slave2 ~]$ mkdir /home/dev/.ssh
[dev@master .ssh]$ scp id_rsa.pub dev@slave1:/home/dev/.ssh/
[dev@master .ssh]$ scp id_rsa.pub dev@slave2:/home/dev/.ssh/
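On systems that ship ssh-copy-id, steps 2 and 3 collapse into one command per slave; it creates .ssh, appends the key, and fixes permissions in one go (a sketch, prompting for the password once per slave):
[dev@master ~]$ ssh-copy-id dev@slave1
[dev@master ~]$ ssh-copy-id dev@slave2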
4. Enable RSA public-key authentication
vi /etc/ssh/sshd_config
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
[root@slave1 ~]# service sshd restart
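If login still asks for a password after this, directory permissions are the usual culprit: sshd ignores authorized_keys when .ssh or the home directory is too open. On each slave:
chmod 700 /home/dev/.ssh
chmod 600 /home/dev/.ssh/authorized_keys
chmod go-w /home/dev    # home dir must not be group/world writable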
5. Test
[dev@master ~]$ ssh slave1
[dev@master ~]$ ssh slave2
III. Install Hadoop
1. Download and unpack (omitted)
2. Fix the JAVA_HOME-not-found problem (add the export before the point where it is referenced)
vi libexec/hadoop-config.sh
export JAVA_HOME=/usr/local/jdk
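The same export can also go into etc/hadoop/hadoop-env.sh, which Hadoop sources on startup; either location works:
vi etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk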
3. Configuration files:
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
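To verify that Hadoop picks the value up, hdfs getconf can echo it back (run from the Hadoop install directory):
./bin/hdfs getconf -confKey fs.defaultFS    # should print hdfs://master:9000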
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/dev/hadoop/data/hadoop-nn/master</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>
      file:/home/dev/hadoop/data/hadoop-dn/slave1,
      file:/home/dev/hadoop/data/hadoop-dn/slave2
    </value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
Note: in the hdfs-site.xml configuration, the namenode only needs the first property (the datanode part below it can be dropped), and the datanodes only need the second property (the namenode part above can be dropped).
4. Add the datanode nodes
vi etc/hadoop/slaves
slave1
slave2
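All three machines need the same configuration; once the files are edited on master they can be pushed out with scp (a sketch, assuming Hadoop is unpacked at /home/dev/hadoop on every machine):
scp etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml dev@slave1:/home/dev/hadoop/etc/hadoop/
scp etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml dev@slave2:/home/dev/hadoop/etc/hadoop/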
IV. Start Hadoop
1. The namenode storage directory must be formatted; the datanode storage directories need no formatting, as they are created automatically at startup
./bin/hdfs namenode -format
./sbin/start-all.sh
Then run jps to see all the running processes.
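With this layout, jps on master should list NameNode, SecondaryNameNode, and ResourceManager, while each slave should show DataNode and NodeManager:
[dev@master ~]$ jps     # NameNode, SecondaryNameNode, ResourceManager
[dev@slave1 ~]$ jps     # DataNode, NodeManager
[dev@slave2 ~]$ jps     # DataNode, NodeManager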
2. Check the current nodes
./bin/hdfs dfsadmin -report
Datanodes available: 2 (2 total, 0 dead) // the count matches the two datanodes configured above
V. Test Run
1. Import files
./bin/hadoop fs -mkdir /data
./bin/hadoop fs -put -f example/file1.txt example/file2.txt /data
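To confirm the upload before running the job:
./bin/hadoop fs -ls /data               # both files should be listed
./bin/hadoop fs -cat /data/file1.txt    # spot-check the contents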
2. Run the WordCount example (Java version)
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar wordcount /data /output
3. Check the results
./bin/hadoop fs -ls /output/
./bin/hadoop fs -cat /output/part-r-00000
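Note that MapReduce refuses to start if the output directory already exists, so remove it before re-running the job:
./bin/hadoop fs -rm -r /output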
VI. Other Common Commands
export HADOOP_ROOT_LOGGER=INFO,console    // send log output to the console, handy for debugging errors
./bin/hdfs dfsadmin -report    // show the nodes currently connected to the cluster
./bin/hdfs dfsadmin -refreshNodes    // force the configuration to be reloaded
./bin/hdfs namenode -format    // format the namenode
./bin/hadoop fs -ls /
./sbin/hadoop-daemon.sh --script hdfs start namenode
./sbin/hadoop-daemon.sh --script hdfs start datanode
./sbin/start-dfs.sh
./sbin/stop-dfs.sh
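The web UIs are also handy for checking cluster state; with Hadoop 2.x default ports:
http://master:50070    # NameNode UI: HDFS status and live datanodes
http://master:8088     # ResourceManager UI: YARN applications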