Hadoop Environment Setup and Configuration

This article walks through deploying a Hadoop cluster on Alibaba Cloud, covering environment configuration, software installation, passwordless SSH setup, and Hadoop configuration, and then verifies that the cluster runs correctly with the WordCount program.

1. Introduction

Hadoop is an open-source distributed computing framework that users can easily set up and work with, and that processes data in a reliable, efficient, and scalable way. The main goal of this article is to show how to deploy a Hadoop cluster on Alibaba Cloud.

2. Environment Overview

Ubuntu 14.04 LTS (1 Master and 4 Slaves)

Hadoop 2.6.4

Java 1.8.0

MobaXterm_Personal (SSH/SFTP client for connecting to the Ubuntu servers)

All of the software can be downloaded from my cloud drive. Link: https://pan.baidu.com/s/1i5mTBEH Password: xbf4

 

Server layout:

Name     Internal IP      EIP               External SSH port
Master   192.168.77.4     139.224.10.176    50022
Slave1   192.168.77.1     -                 50122
Slave2   192.168.77.2     -                 50222
Slave3   192.168.77.3     -                 50322
Slave4   192.168.77.5     -                 50522

3. Edit the Configuration Files

(1). Set the hostname on the Master and each Slave

On the Master and each Slave, make the following change:

sudo vi /etc/hostname

Set the file's contents to that node's own name, one of:

Master
Slave1
Slave2
Slave3
Slave4

Then add the name-to-IP mappings in /etc/hosts on every node:

sudo vi /etc/hosts

192.168.77.4 Master

192.168.77.1 Slave1

192.168.77.2 Slave2

192.168.77.3 Slave3

192.168.77.5 Slave4
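A quick way to confirm that the names resolve (my own sanity check, not part of the original post):

ping -c 1 Slave1    # repeat for Slave2 through Slave4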

(2). Install Java 8

Create a java folder under /opt: sudo mkdir /opt/java

Extract the installer into it: sudo tar -xvf jdk-8u91-linux-x64.tar.gz -C /opt/java

Edit /etc/profile (the lines to append are sketched below): sudo vi /etc/profile

Apply the changes (source is a shell builtin, so no sudo): source /etc/profile

Test that Java works: java -version
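The original post does not show the /etc/profile edits in text form. A typical set of lines to append, assuming the tarball extracted to /opt/java/jdk1.8.0_91, would be:

export JAVA_HOME=/opt/java/jdk1.8.0_91    # assumed JDK directory from the 8u91 tarball
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$JAVA_HOME/bin:$PATH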

 

(3). Set up passwordless SSH on each node

Generate an RSA key pair (in ~):

ssh-keygen -t rsa -P ""

Press Enter at each prompt; the key files are generated in /home/hadoop/.ssh

 

Add id_rsa.pub to authorized_keys: cat .ssh/id_rsa.pub >> .ssh/authorized_keys

 

Generate a key pair on each Slave: ssh-keygen -t rsa -P ""

Send the Master's authorized_keys to each Slave:

scp ~/.ssh/authorized_keys hadoop@Slave1:~/.ssh/

scp ~/.ssh/authorized_keys hadoop@Slave2:~/.ssh/

scp ~/.ssh/authorized_keys hadoop@Slave3:~/.ssh/

scp ~/.ssh/authorized_keys hadoop@Slave4:~/.ssh/

Test the SSH trust: ssh hadoop@Slave1

It works if no password is required anymore.
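To verify all four slaves in one pass, a small shell loop works (my own convenience addition, not from the original post):

# each iteration should print the slave's hostname without asking for a password
for h in Slave1 Slave2 Slave3 Slave4; do ssh hadoop@$h hostname; done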

 

(4). Install and configure Hadoop 2.6.4

Install Hadoop

Create a hadoop folder under /opt: sudo mkdir /opt/hadoop

Extract the installer into it: sudo tar -xvf hadoop-2.6.4.tar.gz -C /opt/hadoop

 

Configuring etc/hadoop/hadoop-env.sh:

sudo vi /opt/hadoop/hadoop-2.6.4/etc/hadoop/hadoop-env.sh
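The essential edit in hadoop-env.sh is pointing JAVA_HOME at the JDK installed in step (2); the exact path is my assumption based on the tarball name:

export JAVA_HOME=/opt/java/jdk1.8.0_91    # assumed JDK path from step (2)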

Configuring etc/hadoop/core-site.xml:

sudo vi /opt/hadoop/hadoop-2.6.4/etc/hadoop/core-site.xml
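The original post does not show this file's contents in text form. A minimal core-site.xml consistent with this cluster would look like the following; the port 9000 and the tmp directory are my assumptions, not the author's confirmed values:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
  </property>
</configuration>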

Configuring etc/hadoop/mapred-site.xml (if it does not exist, copy it from mapred-site.xml.template):

sudo vi /opt/hadoop/hadoop-2.6.4/etc/hadoop/mapred-site.xml
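Again, the contents are not shown in the original. A common minimal mapred-site.xml for Hadoop 2.x runs MapReduce on YARN; whether the author used exactly this setup is an assumption (if YARN is used, yarn-site.xml usually also needs yarn.resourcemanager.hostname set to Master):

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>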

Configuring etc/hadoop/hdfs-site.xml

sudo vi /opt/hadoop/hadoop-2.6.4/etc/hadoop/hdfs-site.xml
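A minimal hdfs-site.xml sketch; the replication factor and storage paths are my assumptions, not the author's exact values:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hadoop/dfs/data</value>
  </property>
</configuration>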

Add the Slave hostnames (the datanodes) to the slaves file:

sudo vi /opt/hadoop/hadoop-2.6.4/etc/hadoop/slaves
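Based on the server layout in section 2, the slaves file should list one worker hostname per line:

Slave1
Slave2
Slave3
Slave4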

 

Send hadoop to each Slave:

scp -r /opt/hadoop hadoop@Slave1:/home/hadoop

scp -r /opt/hadoop hadoop@Slave2:/home/hadoop

scp -r /opt/hadoop hadoop@Slave3:/home/hadoop

scp -r /opt/hadoop hadoop@Slave4:/home/hadoop

In each Slave, move it to the same location as on the Master and change the owner (mv has no -r flag; the scp above placed the directory at /home/hadoop/hadoop):

sudo mv /home/hadoop/hadoop /opt/

sudo chown -R hadoop:hadoop /opt/hadoop

 

(5). Add the Hadoop environment variables to /etc/profile

sudo vi /etc/profile
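The post does not show the exact lines; the usual additions, given the install path used above, would be (the HADOOP_HOME naming is my assumption):

export HADOOP_HOME=/opt/hadoop/hadoop-2.6.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Putting $HADOOP_HOME/bin and $HADOOP_HOME/sbin on PATH is what lets the later hadoop and hdfs commands run without full paths.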

 

source /etc/profile

 

(6). Verify the setup with the WordCount demo

cd /opt/hadoop/hadoop-2.6.4/bin

./hdfs namenode -format    # format the HDFS namenode

cd /opt/hadoop/hadoop-2.6.4/sbin

./start-all.sh

 

 

Check the datanode connection status from the namenode:

cd /opt/hadoop/hadoop-2.6.4/bin

./hdfs dfsadmin -report

 

Create the folder /input: hdfs dfs -mkdir /input

Upload the test file to HDFS:

hdfs dfs -put people.txt /input/
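The post assumes a local file people.txt already exists; for a self-contained test it can be created beforehand with any sample text (the contents here are my own example):

echo "hello hadoop hello world" > people.txt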

 

Run the wordcount demo (note the jar lives under the /opt install path used above, not /usr/local):

hadoop jar /opt/hadoop/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount /input /output

Check the result:

hdfs dfs -cat /output/part-r-00000
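With the sample people.txt suggested above, wordcount would emit one word per line with its count, sorted alphabetically:

hadoop	1
hello	2
world	1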

 

Reposted from: https://www.cnblogs.com/kinginme/p/7212117.html
