1. Prepare three Linux machines with a working Java environment; ideally, set JAVA_HOME to the same value on all of them.
2. Extract the Hadoop installation package on each of the three machines, using the same path everywhere, then set up the environment variables:
vi /etc/profile
export JAVA_HOME=/opt/jdk1.7.0_55
export HADOOP_HOME=/usr/local/hadoop-2.5.1
export CLASSPATH=.:$JAVA_HOME/lib
# include $HADOOP_HOME/sbin so the start-*.sh / stop-*.sh scripts are on the PATH
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
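The profile must be identical on all three machines. One way to push it out and check that it took effect (a sketch; the slave1/slave2 names assume the hosts entries added in step 5, otherwise use the raw IPs):

scp /etc/profile slave1:/etc/profile
scp /etc/profile slave2:/etc/profile
# then on each machine:
java -version
hadoop version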
3. Create a hadoop user on each of the three machines
sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop
ls -l /etc/sudoers
chmod u+w /etc/sudoers
# edit /etc/sudoers and add the line:  hadoop ALL=(ALL) ALL
chmod u-w /etc/sudoers
sudo chown -R hadoop:hadoop /usr/local/hadoop-2.5.1
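A quick sanity check that the account and ownership are right (not in the original, purely illustrative):

su - hadoop
sudo -l                            # should list (ALL) ALL for this user
ls -ld /usr/local/hadoop-2.5.1     # should be owned by hadoop:hadoop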
4. Enable passwordless SSH among the three machines under the hadoop user (run the following as hadoop on each machine; the keys then still have to be exchanged, as shown below)
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
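These two lines only authorize logins to the machine itself; each machine's public key also has to end up in the other machines' authorized_keys. A sketch using ssh-copy-id (host names as defined in the next step):

ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@slave1
ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@slave2
ssh slave1 date    # should print the date with no password prompt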
5. vi /etc/hosts — add the host names and IPs of the three machines
192.168.0.2 master
192.168.0.3 slave1
192.168.0.4 slave2
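A quick connectivity check once the entries are in place (illustrative):

for h in master slave1 slave2; do ping -c 1 $h; done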
6. Synchronize the clocks of the three machines
crontab -e
# add this line so ntpdate runs every 5 minutes:
*/5 * * * * /usr/sbin/ntpdate ntp.api.bz &> /dev/null
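To confirm the three clocks actually agree (a sketch relying on the passwordless SSH from step 4):

for h in master slave1 slave2; do ssh $h date; done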
7. Edit the Hadoop configuration files (all of the files below live in $HADOOP_HOME/etc/hadoop)
=======================================================
slaves
192.168.0.3
192.168.0.4
==================================================
core-site.xml
<configuration>
  <!-- file system properties; fs.defaultFS is the 2.x name for the deprecated fs.default.name -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <!-- global properties -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-2.5.1/tmp</value>
  </property>
</configuration>
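One file this guide would otherwise skip: daemons launched over SSH often do not source /etc/profile, so it is safer to also hard-code the JDK path in $HADOOP_HOME/etc/hadoop/hadoop-env.sh (same path as step 1):

# in hadoop-env.sh
export JAVA_HOME=/opt/jdk1.7.0_55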
====================================================
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
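By default HDFS stores everything under hadoop.tmp.dir; if you want explicit NameNode/DataNode directories, the standard 2.x properties look like this (illustrative paths, not part of the original setup):

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///usr/local/hadoop-2.5.1/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///usr/local/hadoop-2.5.1/dfs/data</value>
</property>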
=====================================================
mapred-site.xml
<configuration>
  <!-- Hadoop 2.x runs MapReduce on YARN; mapred.job.tracker is the obsolete 1.x JobTracker setting -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
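The YARN daemons started below also need at least the ResourceManager address and the shuffle service configured. A minimal yarn-site.xml sketch, assuming the master host name:

yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>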
===============================
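The edited configuration directory must be identical on all three machines before anything starts; one way to push it out (assuming the passwordless SSH from step 4):

scp -r /usr/local/hadoop-2.5.1/etc/hadoop slave1:/usr/local/hadoop-2.5.1/etc/
scp -r /usr/local/hadoop-2.5.1/etc/hadoop slave2:/usr/local/hadoop-2.5.1/etc/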
# format the NameNode once, on the master only ('hdfs namenode -format' is the non-deprecated 2.x form)
hadoop namenode -format
start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/hadoop-2.5.1/logs/hadoop-hadoop-namenode-master.out
slave1: starting datanode, logging to /usr/local/hadoop-2.5.1/logs/hadoop-hadoop-datanode-slave1.out
slave2: starting datanode, logging to /usr/local/hadoop-2.5.1/logs/hadoop-hadoop-datanode-slave2.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.5.1/logs/yarn-hadoop-resourcemanager-master.out
slave1: starting nodemanager, logging to /usr/local/hadoop-2.5.1/logs/yarn-hadoop-nodemanager-slave1.out
slave2: starting nodemanager, logging to /usr/local/hadoop-2.5.1/logs/yarn-hadoop-nodemanager-slave2.out
Verify the startup:
jps                                   # each node should list its expected daemons
hdfs dfsadmin -report                 # 'hadoop dfsadmin' is deprecated in 2.x
hdfs dfs -mkdir -p testIn
hdfs dfs -ls
hdfs://master:9000/user/hadoop/testIn
hdfs dfs -copyFromLocal /tmp/test.* testIn
hadoop jar hadooptest.jar wordcount testIn testOut
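Two more checks worth doing once the job finishes (ports are the Hadoop 2.x defaults):

hdfs dfs -cat testOut/part-r-00000      # wordcount results
# NameNode web UI:        http://master:50070
# ResourceManager web UI: http://master:8088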
This article walks through building a Hadoop cluster from three Linux machines, covering environment configuration, user setup, passwordless SSH, the hosts file, time synchronization, and the changes to the core configuration files.