Officially starting Hadoop. Excited!
Since this is all installed on my own laptop and the specs are limited, it's a small three-node setup (virtual machines on one machine), which is plenty for learning.
Cluster plan:
node01: NameNode, SecondaryNameNode, DataNode, ResourceManager, NodeManager, JobHistoryServer
node02: DataNode, NodeManager
node03: DataNode, NodeManager
First, upload the compiled Hadoop package to the node and unpack it.
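A minimal sketch, assuming the package is named hadoop-2.8.5.tar.gz and sits in the current directory (substitute your actual file name):
scp hadoop-2.8.5.tar.gz node01:/opt/client/
ssh node01
mkdir -p /opt/client/servers
tar -zxvf /opt/client/hadoop-2.8.5.tar.gz -C /opt/client/servers/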
Install Notepad++; its NppFTP plugin lets you edit files on the Linux machines remotely.
Now for the Hadoop configuration:
core-site.xml
<property>
    <!-- default filesystem URI; fs.default.name is the deprecated alias of fs.defaultFS and still works in 2.x -->
    <name>fs.default.name</name>
    <value>hdfs://node01:8020</value>
</property>
<property>
    <!-- base directory for Hadoop's temporary files -->
    <name>hadoop.tmp.dir</name>
    <value>/opt/client/servers/hadoop-2.8.5/hadoopDatas/tempDatas</value>
</property>
<property>
    <!-- I/O buffer size in bytes -->
    <name>io.file.buffer.size</name>
    <value>4096</value>
</property>
<property>
    <!-- keep deleted files in the trash for 10080 minutes (7 days) -->
    <name>fs.trash.interval</name>
    <value>10080</value>
</property>
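With fs.trash.interval set, hdfs dfs -rm moves files into the per-user trash instead of deleting them outright, and they are purged after the 7 days configured above. For example (the path here is only illustrative):
hdfs dfs -rm /test/old.txt              # lands in /user/<user>/.Trash
hdfs dfs -rm -skipTrash /test/old.txt   # bypasses the trash entirely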
hdfs-site.xml
<property>
    <!-- HTTP address of the SecondaryNameNode -->
    <name>dfs.namenode.secondary.http-address</name>
    <value>node01:50090</value>
</property>
<property>
    <!-- HTTP address of the NameNode web UI -->
    <name>dfs.namenode.http-address</name>
    <value>node01:50070</value>
</property>
<property>
    <!-- where the NameNode keeps its fsimage metadata (two redundant local copies) -->
    <name>dfs.namenode.name.dir</name>
    <value>file:///opt/client/servers/hadoop-2.8.5/hadoopDatas/namenodeDatas,file:///opt/client/servers/hadoop-2.8.5/hadoopDatas/namenodeDatas2</value>
</property>
<property>
    <!-- local directories where DataNodes store blocks (blocks are spread across them) -->
    <name>dfs.datanode.data.dir</name>
    <value>file:///opt/client/servers/hadoop-2.8.5/hadoopDatas/datanodeDatas,file:///opt/client/servers/hadoop-2.8.5/hadoopDatas/datanodeDatas2</value>
</property>
<property>
    <!-- NameNode edit log directory -->
    <name>dfs.namenode.edits.dir</name>
    <value>file:///opt/client/servers/hadoop-2.8.5/hadoopDatas/nn/edits</value>
</property>
<property>
    <!-- SecondaryNameNode checkpoint image directory -->
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///opt/client/servers/hadoop-2.8.5/hadoopDatas/snn/name</value>
</property>
<property>
    <!-- SecondaryNameNode checkpoint edits directory -->
    <name>dfs.namenode.checkpoint.edits.dir</name>
    <value>file:///opt/client/servers/hadoop-2.8.5/hadoopDatas/dfs/snn/edits</value>
</property>
<property>
    <!-- one replica per DataNode across the three nodes -->
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <!-- disable HDFS permission checks; acceptable for a learning cluster -->
    <name>dfs.permissions</name>
    <value>false</value>
</property>
<property>
    <!-- block size: 134217728 bytes = 128 MB -->
    <name>dfs.blocksize</name>
    <value>134217728</value>
</property>
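Once the cluster is running, hdfs getconf can confirm these values took effect:
hdfs getconf -confKey dfs.blocksize      # 134217728
hdfs getconf -confKey dfs.replication    # 3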
Edit hadoop-env.sh and change the JAVA_HOME path to an absolute path.
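For example (the JDK location below is a placeholder; use your actual install path, e.g. the output of echo $JAVA_HOME):
vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_141    # replace with your own JDK path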
mapred-site.xml
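In a stock Hadoop 2.8.5 tarball this file ships only as a template, so create it first:
cd /opt/client/servers/hadoop-2.8.5/etc/hadoop
cp mapred-site.xml.template mapred-site.xml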
<property>
    <!-- run MapReduce on YARN; without this, jobs fall back to the local runner -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <!-- let small jobs run in "uber" mode inside the ApplicationMaster JVM -->
    <name>mapreduce.job.ubertask.enable</name>
    <value>true</value>
</property>
<property>
    <!-- JobHistory server RPC address -->
    <name>mapreduce.jobhistory.address</name>
    <value>node01:10020</value>
</property>
<property>
    <!-- JobHistory server web UI -->
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node01:19888</value>
</property>
yarn-site.xml
<property>
    <!-- host running the ResourceManager -->
    <name>yarn.resourcemanager.hostname</name>
    <value>node01</value>
</property>
<property>
    <!-- shuffle service required by MapReduce -->
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <!-- aggregate container logs into HDFS -->
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<property>
    <!-- keep aggregated logs for 604800 seconds (7 days) -->
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
<property>
    <!-- memory each NodeManager offers to containers, in MB -->
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>20480</value>
</property>
<property>
    <!-- smallest container the scheduler will allocate, in MB -->
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
</property>
<property>
    <!-- allow virtual memory up to 2.1x the requested physical memory -->
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
</property>
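A quick sanity check on these numbers: each NodeManager advertises 20480 MB and the scheduler allocates in multiples of 2048 MB, so one node can run at most 20480 / 2048 = 10 minimum-size containers; with the 2.1 vmem-pmem ratio, a 2048 MB container may touch up to 2048 * 2.1 ≈ 4300 MB of virtual memory before the NodeManager kills it.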
mapred-env.sh: set JAVA_HOME to the same absolute path here as well.
slaves (the hosts that will run DataNode and NodeManager daemons):
node01
node02
node03
Create whichever directories from the configs don't exist yet:
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/tempDatas
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/namenodeDatas
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/namenodeDatas2
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/datanodeDatas
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/datanodeDatas2
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/nn/edits
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/snn/name
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/dfs/snn/edits
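These eight commands can also be collapsed into one with bash brace expansion:
mkdir -p /opt/client/servers/hadoop-2.8.5/hadoopDatas/{tempDatas,namenodeDatas,namenodeDatas2,datanodeDatas,datanodeDatas2,nn/edits,snn/name,dfs/snn/edits}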
That completes the configuration on the first machine. Send the files over to the other two:
cd /opt/client/servers
scp -r hadoop-2.8.5 node02:$PWD
scp -r hadoop-2.8.5 node03:$PWD
vim /etc/profile    (do this on each of the three nodes)
export HADOOP_HOME=/opt/client/servers/hadoop-2.8.5
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
source /etc/profile
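A quick check that the environment variables took effect:
hadoop version    # should report Hadoop 2.8.5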
Once everything is configured, start the cluster:
cd /opt/client/servers/hadoop-2.8.5/
bin/hdfs namenode -format    # format only once, before the very first start
sbin/start-dfs.sh
sbin/start-yarn.sh
sbin/mr-jobhistory-daemon.sh start historyserver
Once they're up, check the processes on each node; everything started as planned.
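jps is the quickest check; given the cluster plan above, the expected daemons are:
jps    # node01: NameNode, SecondaryNameNode, DataNode, ResourceManager, NodeManager, JobHistoryServer
jps    # node02 / node03: DataNode, NodeManager
yarn node -list    # should list three RUNNING NodeManagers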
Open the HDFS web UI in a browser at http://node01:50070 (the dfs.namenode.http-address set above).
The file browser lives under Utilities > Browse the file system.
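To have something to browse, push a file into HDFS first (the paths here are only examples):
hdfs dfs -mkdir -p /test
hdfs dfs -put /etc/profile /test/
hdfs dfs -ls /test    # the same listing shows up in the web UI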
All done!