Deploying Hadoop 2.4 + ZooKeeper on CentOS 6.5

This article walks through building a 5-node Hadoop cluster on virtual machines, covering system configuration, passwordless SSH login, JDK installation, Hadoop environment variables, and the core configuration files.


Cluster layout of the 5 virtual machines:
  • 192.168.7.2 hadoop-server1 (NameNode)
  • 192.168.7.3 hadoop-server2 (NameNode)
  • 192.168.7.100 hadoop-server3 (DataNode)
  • 192.168.7.101 hadoop-server4 (DataNode)
  • 192.168.7.102 hadoop-server5 (DataNode)
1. In VMware, create five virtual machines and install CentOS 6.5 on each, then switch each VM's network connection mode to bridged mode.


2. Configure the IP address
  • vi /etc/sysconfig/network-scripts/ifcfg-eth0
  • Add the following lines (a complete example file is sketched at the end of this step)
IPADDR=192.168.6.194
NETMASK=255.255.255.0
GATEWAY=192.168.6.1
  • Also change BOOTPROTO=static

  • Run service network restart to apply the new IP

  • Run ifconfig to confirm the IP was changed successfully
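For reference, a minimal ifcfg-eth0 for hadoop-server1 might look like the following; the DEVICE and ONBOOT lines, the 192.168.7.2 address, and the gateway are assumptions based on the cluster table above, so substitute your own values:
DEVICE=eth0
# use a fixed address instead of DHCP
BOOTPROTO=static
ONBOOT=yes
# assumed address of hadoop-server1 from the cluster table
IPADDR=192.168.7.2
NETMASK=255.255.255.0
# assumed gateway for the 192.168.7.0/24 network
GATEWAY=192.168.7.1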

3. Disable the firewall
  • Stop the firewall: service iptables stop
  • Disable it permanently: chkconfig iptables off
  • Check its status: service iptables status (the message "Firewall is not running." confirms it is stopped)
4. Set the hostname
  • vi /etc/sysconfig/network and set HOSTNAME to this node's name
  • vi /etc/hosts and map each node's IP address to its hostname (example contents for both files are sketched below)
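As a sketch of what the original screenshots presumably showed, assuming the hostnames and addresses from the cluster table above, /etc/sysconfig/network on hadoop-server1 would contain:
NETWORKING=yes
HOSTNAME=hadoop-server1
and /etc/hosts on every node would list all five machines:
127.0.0.1 localhost
192.168.7.2 hadoop-server1
192.168.7.3 hadoop-server2
192.168.7.100 hadoop-server3
192.168.7.101 hadoop-server4
192.168.7.102 hadoop-server5
Reboot (or run hostname hadoop-server1) so the new hostname takes effect.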
5. Install the JDK and configure environment variables
  • Upload the archive: put c:/Users/Administrator/Desktop/linux/jdk-7u76-linux-x64.tar.gz
  • Unpack it: tar -zxvf jdk-7u76-linux-x64.tar.gz (recommended under /usr/lib)
  • Configure environment variables: vi /etc/profile (append the following)
export JAVA_HOME=/usr/lib/jdk1.7.0_76
export JRE_HOME=/usr/lib/jdk1.7.0_76/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
  • Apply the changes: source /etc/profile
  • Verify the JDK installation: java -version
6. Configure passwordless (key-based) SSH login between all nodes
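The original post leaves this step without details; a minimal sketch of the usual approach, assuming a hadoop user on every node and that ssh-copy-id is available, is:
ssh-keygen -t rsa
# press Enter through the prompts to accept the default key path and an empty passphrase
ssh-copy-id hadoop@hadoop-server1
# repeat ssh-copy-id for hadoop-server2 ... hadoop-server5, and run these commands on every node
ssh hadoop-server2
# verify that login now succeeds without a password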
7. Install and configure ZooKeeper
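This step is also not detailed in the original; a minimal sketch, assuming ZooKeeper is unpacked to /home/hadoop/zookeeper on all five nodes (the install path and data directory are assumptions), is a conf/zoo.cfg such as:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/zookeeper/data
clientPort=2181
server.1=hadoop-server1:2888:3888
server.2=hadoop-server2:2888:3888
server.3=hadoop-server3:2888:3888
server.4=hadoop-server4:2888:3888
server.5=hadoop-server5:2888:3888
Each node also needs a myid file in dataDir containing its own server number, e.g. echo 1 > /home/hadoop/zookeeper/data/myid on hadoop-server1.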
8. Install Hadoop and configure environment variables
  • tar -zxvf hadoop-2.4.0.tar.gz (recommended under the hadoop user's home directory, /home/hadoop)
  • Configure environment variables: vi /etc/profile (append the following)
export HADOOP_HOME=/home/hadoop/hadoop-2.4.0
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_PREFIX/lib/native
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export LD_LIBRARY_PATH=$HADOOP_PREFIX/lib/native
export PATH=$PATH:$HADOOP_HOME/bin
  • Apply the changes: source /etc/profile
  • Verify the Hadoop environment variables: hadoop version
9. Configure Hadoop (after editing, copy the finished configuration to every node; see the note at the end of this step)
  • vi $HADOOP_HOME/etc/hadoop/slaves
hadoop-server3
hadoop-server4
hadoop-server5
  • vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh (set the JAVA_HOME variable)
export JAVA_HOME=/usr/lib/jdk1.7.0_76
  • vi $HADOOP_HOME/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-server1:2181,hadoop-server2:2181,hadoop-server3:2181,hadoop-server4:2181,hadoop-server5:2181</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-2.4.0/tmp</value>
<description>A base for other temporary directories.</description>
</property>
</configuration>
  • vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop-server1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop-server2:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.mycluster.nn1</name>
<value>hadoop-server1:53310</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.mycluster.nn2</name>
<value>hadoop-server2:53310</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop-server1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop-server2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-server1:8485;hadoop-server2:8485;hadoop-server3:8485;hadoop-server4:8485;hadoop-server5:8485/mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/hadoop-2.4.0/journal</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/hadoop-2.4.0/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/hadoop-2.4.0/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.image.transfer.bandwidthPerSec</name>
<value>1048576</value>
</property>
</configuration>
  • vi $HADOOP_HOME/etc/hadoop/mapred-site.xml (if the file does not exist, copy it from mapred-site.xml.template first)
<configuration>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-server1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-server1:19888</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
  • vi $HADOOP_HOME/etc/hadoop/yarn-site.xml (note: yarn.resourcemanager.ha.id must be changed to rm2 in the copy on hadoop-server2)
<configuration>
<property>
<name>yarn.resourcemanager.connect.retry-interval.ms</name>
<value>60000</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>rm-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop-server1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop-server2</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop-server1:2181,hadoop-server2:2181,hadoop-server3:2181,hadoop-server4:2181,hadoop-server5:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
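The post does not show how the finished configuration reaches the other machines; one common way, assuming the same /home/hadoop layout everywhere, is simply to copy the whole directory:
scp -r /home/hadoop/hadoop-2.4.0 hadoop@hadoop-server2:/home/hadoop/
# repeat for hadoop-server3 ... hadoop-server5; afterwards edit yarn.resourcemanager.ha.id to rm2 on hadoop-server2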
10. Start ZooKeeper and the JournalNodes, then format and start the Hadoop cluster
  • Start ZooKeeper (on all 5 nodes)
[hadoop@hadoop-server1 bin]$ ./zkServer.sh start
  • Then, on one of the NameNode hosts, create the HA namespace in ZooKeeper:
[hadoop@hadoop-server1 bin]$ hdfs zkfc -formatZK
  • Start the JournalNode process (on all 5 nodes)
[hadoop@hadoop-server1 sbin]$ ./hadoop-daemon.sh start journalnode
  • Format HDFS and start the first NameNode
[hadoop@hadoop-server1 bin]$ hdfs namenode -format mycluster
[hadoop@hadoop-server1 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
    On the second NameNode (hadoop-server2), run hdfs namenode -bootstrapStandby to copy the formatted metadata, then start its namenode with the same hadoop-daemon.sh command.
  • On one of the NameNode hosts, run $HADOOP_HOME/sbin/start-all.sh to start the DataNodes and the YARN processes (see the finishing checks below).
  • The role of each NameNode service can be checked with haadmin:
$HADOOP_HOME/bin/hdfs haadmin -getServiceState nn1
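A few finishing steps and checks, sketched under the configuration above (start-all.sh does not start the standby ResourceManager or the JobHistory server):
[hadoop@hadoop-server2 sbin]$ ./yarn-daemon.sh start resourcemanager
[hadoop@hadoop-server1 sbin]$ ./mr-jobhistory-daemon.sh start historyserver
Running jps on hadoop-server1 should then show roughly NameNode, DFSZKFailoverController, JournalNode, QuorumPeerMain and ResourceManager, and the NameNode web UIs at hadoop-server1:50070 and hadoop-server2:50070 should report one active and one standby NameNode.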