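Before installing, prepare four machines: bluejoe0, bluejoe4, bluejoe5 and bluejoe9.
bluejoe0 acts as the master; bluejoe4, bluejoe5 and bluejoe9 act as slaves.
bluejoe0 is the namenode.
bluejoe9 is the secondary namenode.
bluejoe4, bluejoe5 and bluejoe9 are the datanodes.
Installing Hadoop
First, download Hadoop on bluejoe0: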
wget http://mirrors.cnnic.cn/apache/hadoop/common/stable2/hadoop-2.5.2.tar.gz
Save the tarball to /usr/local/ and extract it with tar;
then use ln to create a symlink to the extracted directory at /usr/local/hadoop;
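The extract and symlink steps can be sketched as a small shell function. This is only a sketch: the helper name install_hadoop is hypothetical, and it assumes the tarball unpacks into a directory named after the file (hadoop-2.5.2 here, which matches the log paths shown later).

```shell
# Hypothetical helper: unpack a Hadoop tarball under a prefix and point
# <prefix>/hadoop at the versioned directory.
install_hadoop() {
    tarball=$1                                   # e.g. /usr/local/hadoop-2.5.2.tar.gz
    prefix=$2                                    # e.g. /usr/local
    version_dir=$(basename "$tarball" .tar.gz)   # e.g. hadoop-2.5.2

    tar -xzf "$tarball" -C "$prefix" || return 1
    # -n replaces an existing link instead of creating a link inside its target
    ln -sfn "$prefix/$version_dir" "$prefix/hadoop"
}

# Usage on bluejoe0, after the wget above:
#   install_hadoop /usr/local/hadoop-2.5.2.tar.gz /usr/local
```

Keeping the versioned directory and linking to it means a later upgrade only has to repoint the symlink.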
Configuring HDFS
Configure core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://bluejoe0:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hdfs/tmp</value>
  </property>
</configuration>
Configure hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>file:/data/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:/data/hdfs/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address</name>
    <value>bluejoe0:9000</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>bluejoe9:50090</value>
  </property>
</configuration>
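Note that dfs.namenode.rpc-address must match the host and port given in fs.default.name (a name that Hadoop 2.x still accepts but has deprecated in favor of fs.defaultFS).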
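This consistency rule is easy to check mechanically. The sketch below is one rough way to do it, not a general XML parser: the helper names get_prop and check_namenode_address are hypothetical, and it assumes each <value> sits on the line right after its <name>, as in the files above.

```shell
# Hypothetical helper: print the <value> on the line following <name>KEY</name>.
get_prop() {
    grep -F -A1 "<name>$2</name>" "$1" |
        sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Succeeds when the filesystem URI in core-site.xml, minus its hdfs:// scheme,
# equals dfs.namenode.rpc-address in hdfs-site.xml.
check_namenode_address() {
    conf_dir=$1                    # e.g. /usr/local/hadoop/etc/hadoop
    fs_uri=$(get_prop "$conf_dir/core-site.xml" fs.default.name)
    rpc=$(get_prop "$conf_dir/hdfs-site.xml" dfs.namenode.rpc-address)
    [ -n "$rpc" ] && [ "${fs_uri#hdfs://}" = "$rpc" ]
}

# Usage:
#   check_namenode_address /usr/local/hadoop/etc/hadoop || echo "address mismatch"
```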
Set JAVA_HOME in /usr/local/hadoop/etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
Use scp to copy the hadoop directory to the other machines;
Configure the slaves file (etc/hadoop/slaves), one hostname per line:
bluejoe4
bluejoe5
bluejoe9
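Since the slaves file already lists every other machine, the copy step can be driven from it. The following is a minimal sketch under assumptions not stated in the original: the helper name copy_to_slaves and the DRY_RUN switch are hypothetical, and it assumes the current user can scp to each host and that the target path is the same on every machine.

```shell
# Hypothetical helper: copy a Hadoop directory to every host named in a
# slaves file. Set DRY_RUN=1 to print the scp commands instead of running them.
copy_to_slaves() {
    slaves_file=$1                 # e.g. /usr/local/hadoop/etc/hadoop/slaves
    src_dir=$2                     # e.g. /usr/local/hadoop-2.5.2
    dest_parent=$(dirname "$src_dir")

    while read -r host; do
        [ -n "$host" ] || continue             # skip blank lines
        if [ -n "${DRY_RUN:-}" ]; then
            echo "scp -r $src_dir $host:$dest_parent/"
        else
            # stdin is redirected so scp cannot swallow the rest of the host list
            scp -r "$src_dir" "$host:$dest_parent/" < /dev/null || return 1
        fi
    done < "$slaves_file"
}

# Usage:
#   copy_to_slaves /usr/local/hadoop/etc/hadoop/slaves /usr/local/hadoop-2.5.2
```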
Format the namenode:
hdfs namenode -format
Start HDFS:
./sbin/start-dfs.sh
You should see output like this:
Starting namenodes on [bluejoe0]
bluejoe0: starting namenode, logging to /usr/local/hadoop-2.5.2/logs/hadoop-root-namenode-bluejoe0.out
bluejoe9: starting datanode, logging to /usr/local/hadoop-2.5.2/logs/hadoop-root-datanode-bluejoe9.out
bluejoe4: starting datanode, logging to /usr/local/hadoop-2.5.2/logs/hadoop-root-datanode-bluejoe4.out
bluejoe5: starting datanode, logging to /usr/local/hadoop-2.5.2/logs/hadoop-root-datanode-bluejoe5.out
Starting secondary namenodes [bluejoe9]
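Output like the above can be scanned for hosts that never reported a datanode start. This is a rough sketch: the helper name missing_datanodes is hypothetical, and it relies on the "host: starting datanode, ..." line format shown here. A "starting" line only shows that the daemon was launched, so the web UI check is still worth doing.

```shell
# Hypothetical helper: read start-dfs.sh output on stdin and print every
# expected datanode host that has no "starting datanode" line.
missing_datanodes() {
    started=$(sed -n 's/^\([^:]*\): starting datanode,.*/\1/p')
    for host in "$@"; do
        echo "$started" | grep -qxF "$host" || echo "$host"
    done
}

# Usage:
#   ./sbin/start-dfs.sh 2>&1 | missing_datanodes bluejoe4 bluejoe5 bluejoe9
```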
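Next, you can check the web UI (http://bluejoe0:50070/); its Datanodes page should list bluejoe4, bluejoe5 and bluejoe9.
At this point, the HDFS installation is complete!
Configuring MapReduce
Edit yarn-site.xml: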
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>bluejoe0:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>bluejoe0:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>bluejoe0:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>bluejoe0:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>bluejoe0:8088</value>
  </property>
</configuration>
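Only yarn-site.xml is shown here. In Hadoop 2.x, mapreduce.framework.name defaults to local, so MapReduce jobs run in the local job runner unless it is set to yarn, which is usually done in mapred-site.xml. A minimal sketch, assuming that property is not already set elsewhere in your configuration; the helper name write_mapred_site is hypothetical:

```shell
# Hypothetical helper: write a minimal mapred-site.xml that sends
# MapReduce jobs to YARN. Overwrites any existing file of that name.
write_mapred_site() {
    conf_dir=$1                    # e.g. /usr/local/hadoop/etc/hadoop
    cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
}

# Usage:
#   write_mapred_site /usr/local/hadoop/etc/hadoop
```

With this in place, submitted jobs should appear in the ResourceManager web UI on port 8088.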
Copy the configuration above to the other nodes with scp;
Start the MapReduce framework (YARN):
/usr/local/hadoop-2.5.2/sbin/start-yarn.sh
Open a browser and visit http://bluejoe0:8088 to see the ResourceManager web UI.
Run a test program:
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar pi 100 1000
Job Finished in 12.885 seconds
Estimated value of Pi is 3.14120000000000000000
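In the command above, the two arguments to pi are the number of map tasks (100) and the number of sample points per map (1000). The idea behind the estimate can be sketched locally: scatter points over a unit square and count how many fall inside the inscribed circle, whose area is pi/4. Note that this sketch uses plain pseudo-random points from awk, whereas the Hadoop example generates its points with a quasi-Monte Carlo (Halton) sequence, so the digits will differ. The helper name estimate_pi is hypothetical.

```shell
# Hypothetical helper: estimate pi from N pseudo-random points in the unit
# square, counting those inside the inscribed circle of radius 0.5.
estimate_pi() {
    awk -v n="$1" -v seed="${2:-1}" 'BEGIN {
        srand(seed)
        inside = 0
        for (i = 0; i < n; i++) {
            x = rand() - 0.5       # shift so the circle is centred at the origin
            y = rand() - 0.5
            if (x * x + y * y <= 0.25) inside++
        }
        printf "%.4f\n", 4 * inside / n
    }'
}

# Usage: the same total sample count as the run above (100 maps x 1000 samples)
#   estimate_pi 100000
```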