Hadoop Installation
Installing SSH
Installing OpenSSH requires both the client and the server. Install the server with:
sudo apt-get install openssh-server
During installation you may hit an error saying the client was not found, for example:
openssh-server : Depends: openssh-client (= 1:6.6p1-2ubuntu1)
In that case, install the matching client version explicitly:
sudo apt-get install openssh-client=1:6.6p1-2ubuntu1
Then install ssh:
sudo apt-get install ssh
Configuring passwordless login
Go into the .ssh directory (/home/××/.ssh); if it does not exist, create it manually.
Run the following and press Enter at every prompt:
ssh-keygen -t rsa
Append id_rsa.pub to the authorized_keys file:
cat id_rsa.pub >> authorized_keys
Restart SSH so the configuration takes effect:
service ssh restart
Verify that it works:
ssh localhost; exit
The ssh to localhost in the previous step should not prompt for a password; if it still does, fix the permissions on .ssh:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
Disabling the firewall
Check the firewall status, then disable it:
sudo ufw status
sudo ufw disable   # takes effect after a reboot
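After disabling, running the status command again should report the firewall as inactive (ufw prints "Status: inactive"):
sudo ufw status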
Configuring root login over SecureCRT
vi /etc/ssh/sshd_config                             # edit this file
cat /etc/ssh/sshd_config | grep PermitRootLogin     # check whether root login is allowed
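If root login is not allowed, a minimal sketch of the change (PermitRootLogin is a standard sshd_config directive):
# in /etc/ssh/sshd_config
PermitRootLogin yes
# restart SSH so the change takes effect
service ssh restart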
Setting the hosts file and hostname on all three machines
Edit /etc/hosts and /etc/hostname on each machine; an example follows below.
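A sketch of the two files, assuming the hostnames master, slave01 and slave02 used throughout this guide, and example IP addresses that should be replaced with your own:
# /etc/hostname on the master (slaves use slave01 / slave02 respectively)
master
# /etc/hosts, identical on all three machines (IPs below are placeholders)
127.0.0.1    localhost
192.168.1.10 master
192.168.1.11 slave01
192.168.1.12 slave02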
Create the namenode and datanode directories:
mkdir -p ~/dfs/name
mkdir -p ~/dfs/data
mkdir -p ~/tmp   # must match hadoop.tmp.dir in core-site.xml below
Extracting the Hadoop archive
Configure the environment in hadoop-2.2.0/etc/hadoop/hadoop-env.sh:
change: export JAVA_HOME=/home/vmworker01/software/jdk1.7.0_79
Configure the YARN environment in hadoop-2.2.0/etc/hadoop/yarn-env.sh:
change: export JAVA_HOME=/home/vmworker01/software/jdk1.7.0_79
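To confirm that the JAVA_HOME path above really points at a JDK, you can run the bundled java binary directly:
/home/vmworker01/software/jdk1.7.0_79/bin/java -version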
Configure hadoop-2.2.0/etc/hadoop/slaves (this file lists all slave nodes):
slave01
slave02
Configure hadoop-2.2.0/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/vmworker01/tmp</value>
<description>Abase for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.vmworker01.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.vmworker01.groups</name>
<value>*</value>
</property>
</configuration>
Configure ~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/vmworker01/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/vmworker01/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
Configure ~/hadoop-2.2.0/etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
</configuration>
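Note that start-all.sh does not launch the JobHistory Server referenced by the two jobhistory addresses above; in Hadoop 2.2 it is started separately from the sbin directory:
./sbin/mr-jobhistory-daemon.sh start historyserver
# the history web UI then listens on master:19888, per the config above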
Configure ~/hadoop-2.2.0/etc/hadoop/yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
</configuration>
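All of the configuration files above must be present on every node; a sketch of copying the configured Hadoop directory from the master to the slaves, assuming the same vmworker01 user and home-directory layout on each machine:
scp -r ~/hadoop-2.2.0 vmworker01@slave01:~/
scp -r ~/hadoop-2.2.0 vmworker01@slave02:~/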
Startup and verification
Format HDFS (run from the Hadoop directory; if the command is not present, skip this step):
./bin/hdfs namenode -format
From the Hadoop folder, enter the sbin directory and run start-all.sh:
./start-all.sh
Check Hadoop's status with jps. At this point the master node should show: NameNode, SecondaryNameNode, ResourceManager,
and each slave node should show: DataNode, NodeManager.
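Beyond jps, a few quick checks can confirm the cluster is healthy. The 8088 port comes from yarn.resourcemanager.webapp.address above; 50070 is the Hadoop 2.x default for the NameNode web UI, and the example jar path assumes a standard hadoop-2.2.0 layout:
./bin/hdfs dfsadmin -report        # both DataNodes should be listed as live
# NameNode web UI:        http://master:50070
# ResourceManager web UI: http://master:8088
# optional smoke test: run the bundled pi example on YARN
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 10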