Original post: http://www.haoplay.top/tiezi/3_1_1.html
Reference article: http://www.linuxidc.com/Linux/2016-02/128149.htm
1. Install multiple Ubuntu systems on VMware
Log in to each Ubuntu system and change its hostname:
Simply edit the value in /etc/hostname; after the change, use the hostname command to check whether the new name is in effect (a reboot is required for it to take effect).
I named the two Linux systems master and Slave1 (the hostname should not contain special characters such as '_' or '/').
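For example, on the machine that will become the master, the change could look like this (a minimal sketch; repeat on the other machine with Slave1):
echo master | sudo tee /etc/hostname
sudo reboot
hostname        # after the reboot this should print: master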
Edit /etc/hosts to map the hostnames to IP addresses (DNS mapping).
The hosts file must be configured on both Linux systems.
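For example, assuming the IP addresses used later in this article (192.168.178.131 for master, 192.168.178.132 for Slave1), both machines would get entries like:
192.168.178.131 master
192.168.178.132 Slave1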
2. Install SSH on Linux (the SSH configuration files are in /etc/ssh/)
sudo apt install openssh-server
1) Run ssh-keygen -t rsa to generate a key pair. By default the keys are saved in the .ssh folder of the user's home directory, e.g. /home/admin1/.ssh
2) Append the public key id_rsa.pub to the authorized keys: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
3) Copy the public key to the remote Linux machine: ssh-copy-id user@hostname (e.g. admin1@Slave1_hadoop)
(If the user here is root, e.g. root@Slave1_hadoop, you must first edit the SSH configuration, because the root user cannot log in remotely by default.)
Explanation: vim /etc/ssh/sshd_config
Find the line
PermitRootLogin prohibit-password
and change it to:
PermitRootLogin yes
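After editing sshd_config, restart the SSH service so the change takes effect, and then verify that passwordless login works. A rough sketch (the service is usually called ssh on Ubuntu, sshd on some other distributions; user and host names follow the earlier examples):
sudo systemctl restart ssh
ssh admin1@Slave1        # should log in without prompting for a password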
3. Install the JDK
JDK download page: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Edit the configuration file /etc/profile and add the JDK environment variables (adjust JAVA_HOME to the version and path you actually installed):
export JAVA_HOME=/usr/java/jdk1.7.0_25/
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
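After saving /etc/profile, reload it and check that the JDK is found (a quick sanity check; the version printed depends on the JDK you actually installed):
source /etc/profile
echo $JAVA_HOME        # should print your JDK directory
java -version          # should print the installed JDK version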
4. Install Hadoop
Reference configuration: http://blog.youkuaiyun.com/stark_summer/article/details/42424279
Hadoop download page: http://hadoop.apache.org/releases.html
If you downloaded the archive on Windows, you can use Xftp 5 to transfer it to the desired path on the Linux machine.
1) Extract the xxx.tar.gz file: tar -xzvf hadoop-2.6.5.tar.gz, which produces the hadoop-2.6.5 directory
2) Inside hadoop-2.6.5, create the directories tmp, dfs/data and dfs/name (see the sketch after the file list below)
3) There are 7 main configuration files to edit, all under hadoop-2.6.5/etc/hadoop:
~/hadoop-2.6.5/etc/hadoop/hadoop-env.sh
~/hadoop-2.6.5/etc/hadoop/yarn-env.sh
~/hadoop-2.6.5/etc/hadoop/slaves
~/hadoop-2.6.5/etc/hadoop/core-site.xml
~/hadoop-2.6.5/etc/hadoop/hdfs-site.xml
~/hadoop-2.6.5/etc/hadoop/mapred-site.xml
~/hadoop-2.6.5/etc/hadoop/yarn-site.xml
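A minimal sketch of steps 1) and 2) above, assuming the archive was uploaded to the current user's home directory:
cd ~
tar -xzvf hadoop-2.6.5.tar.gz
cd hadoop-2.6.5
mkdir -p tmp dfs/name dfs/data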
The detailed configuration is as follows (the example values below come from the reference article and use /home/spark/opt/hadoop-2.6.0 and an older JDK; adjust JAVA_HOME and the file: paths to match your own installation, e.g. the tmp and dfs directories created under hadoop-2.6.5 above):
4.1 Configure hadoop-env.sh: set JAVA_HOME
export JAVA_HOME=/home/spark/opt/java/jdk1.6.0_37
4.2 Configure yarn-env.sh: set JAVA_HOME
export JAVA_HOME=/home/spark/opt/java/jdk1.6.0_37
4.3 Configure the slaves file: add the slave nodes
#preferably use the slave node's IP address here:
192.168.178.132
4.4 Configure core-site.xml: add the core Hadoop settings (the HDFS port is 9000; hadoop.tmp.dir points to the tmp directory created above, e.g. file:~/hadoop-2.6.5/tmp)
<configuration>
<property>
<name>fs.defaultFS</name>
<!-- preferably use the master's IP address here (same for the other addresses below) -->
<value>hdfs://192.168.178.131:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/spark/opt/hadoop-2.6.0/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.spark.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.spark.groups</name>
<value>*</value>
</property>
</configuration>
4.5 Configure hdfs-site.xml: add the HDFS settings (namenode and datanode addresses and directory locations)
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>192.168.178.131:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/spark/opt/hadoop-2.6.0/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/spark/opt/hadoop-2.6.0/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
4.6 Configure mapred-site.xml: add the MapReduce settings (use the YARN framework; jobhistory address and web UI address)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>192.168.178.131:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>192.168.178.131:19888</value>
</property>
</configuration>
4.7 Configure yarn-site.xml: add the YARN settings
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>192.168.178.131:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>192.168.178.131:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>192.168.178.131:8035</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>192.168.178.131:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>192.168.178.131:8088</value>
</property>
</configuration>
5. Copy the configured Hadoop directory to the other machine (the slave):
[spark@master opt]$ scp -r hadoop-2.6.5 spark@192.168.178.132:~/opt/
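If ~/opt does not yet exist on the slave, create it first (a sketch, keeping the spark user and paths of the reference article; substitute your own):
ssh spark@192.168.178.132 "mkdir -p ~/opt"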
6. Format the namenode (the article runs this on every Linux node, though strictly only the namenode, i.e. the master, needs it):
./hadoop-2.6.5/bin/hdfs namenode -format
7. Start HDFS (the scripts below live in the sbin directory of hadoop-2.6.5):
start-dfs.sh
Stop HDFS:
stop-dfs.sh
Start YARN:
start-yarn.sh
Stop YARN:
stop-yarn.sh
View HDFS: http://10.58.44.47:50070/ (use your master's IP here, e.g. 192.168.178.131)
View YARN: http://10.58.44.47:8088/
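To confirm that the daemons actually started, you can also run jps (shipped with the JDK) on each node; roughly, you should see something like the following (the exact list depends on your configuration):
jps        # on master, expect roughly: NameNode, SecondaryNameNode, ResourceManager
jps        # on Slave1, expect roughly: DataNode, NodeManager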
Note:
If these pages cannot be reached, check whether the Linux firewall has been stopped:
Check firewall status: systemctl status firewalld
Stop the firewall: systemctl stop firewalld
Start the firewall: systemctl start firewalld
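Note that on Ubuntu the default firewall front end is usually ufw rather than firewalld; if the systemctl commands above report that firewalld is not installed, the rough equivalent is:
sudo ufw status        # check firewall status
sudo ufw disable       # turn the firewall off
sudo ufw enable        # turn it back on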
If you see the error:
nodemanager did not stop gracefully after 5 seconds: killing with kill -9
it may be because the firewall was not stopped.