1. Source download location:
http://git.apache.org/
2. Required software: Maven
Note: do not download the latest 3.1.1; use 3.0.5 instead, because 3.1.1 contains a bug that causes trouble during the build.
This is presumably also why Red Hat and IBM avoid the newest releases: a version labeled "stable" can still carry a serious bug.
http://jira.codehaus.org/browse/MSITE-683
I unpacked Maven into /usr/local; using that path as the example, add the following to /etc/profile:
export M2_HOME=/usr/local/apache-maven-3.0.5
export PATH=$PATH:$M2_HOME/bin
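A quick sanity check of the variables above can save a confusing build failure later. This is just a sketch, assuming the /usr/local/apache-maven-3.0.5 path used in this guide:

```shell
# Sanity-check the Maven environment set up above (paths are the ones
# assumed in this guide; adjust if you unpacked Maven elsewhere).
export M2_HOME=/usr/local/apache-maven-3.0.5
export PATH=$PATH:$M2_HOME/bin

# The PATH should now contain Maven's bin directory.
case ":$PATH:" in
  *":$M2_HOME/bin:"*) echo "M2_HOME on PATH" ;;
  *)                  echo "M2_HOME missing from PATH" ;;
esac
# On a real install, `mvn -version` should then report Apache Maven 3.0.5.
```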
3. Required software: Protocol Buffers, version 2.5.0
http://code.google.com/p/protobuf/
After unpacking:
./configure && make && sudo make install
Straightforward; the install step is what puts protoc on the PATH, which the Hadoop build needs.
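The Hadoop 2.2.0 build expects protoc 2.5.0 exactly; a mismatched version aborts the Maven build. A small hedged check (the check_protoc helper is my own sketch, not part of protobuf):

```shell
# check_protoc: verify that the output of `protoc --version`
# mentions 2.5.0, the exact version Hadoop 2.2.0's build expects.
check_protoc() {
  case "$1" in
    *"2.5.0"*) echo "protoc ok" ;;
    *)         echo "protoc mismatch: $1" ;;
  esac
}

check_protoc "libprotoc 2.5.0"   # sample of the expected output; prints "protoc ok"
# After `sudo make install`, run: check_protoc "$(protoc --version)"
```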
4. Required software: FindBugs; the site may be unreachable from some regions, so use a proxy or mirror if necessary.
http://findbugs.sourceforge.net/
Unpack it into /usr/local/ and add to /etc/profile:
export FINDBUGS_HOME=/usr/local/findbugs-2.0.2
export PATH=$PATH:$FINDBUGS_HOME/bin
5. Run source /etc/profile to load the new variables into the current shell.
6. Required software: cmake; virtually every Linux distribution packages it, so just install it with your package manager.
7. Enter the downloaded Hadoop source directory and run:
mvn package -DskipTests -Pdist,native,docs -Dtar
I am not deeply familiar with Maven myself; consult its documentation for details.
8. When the build finishes, the assembled package is under hadoop-dist/target; from here on, the steps are the same as installing from a downloaded binary.
I. Install dependencies
Install OpenJDK and openssh-server:
sudo aptitude update
sudo aptitude install openjdk-7-jdk
cd /usr/lib/jvm/
sudo ln -s java-7-openjdk-amd64/ jdk
sudo aptitude install openssh-server
II. Create the hadoop group and user
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
sudo adduser hduser sudo
After creating hduser, log back in as hduser.
III. Set up SSH keys
ssh-keygen -t rsa -P ''
...
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
...
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost
If the final ssh command logs in without error, SSH is configured correctly.
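The idea behind this step can be checked mechanically: passwordless login works only when the public key appears verbatim in authorized_keys. The key_authorized helper below is my own sketch, demonstrated on throwaway files rather than the real ~/.ssh contents:

```shell
# key_authorized: succeed if the public key in file $1 appears as a
# complete line in the authorized_keys file $2.
key_authorized() {
  grep -qxF "$(cat "$1")" "$2"
}

# Demonstration with temp files standing in for ~/.ssh contents:
tmp=$(mktemp -d)
echo "ssh-rsa AAAAB3...fake hduser@host" > "$tmp/id_rsa.pub"
cat "$tmp/id_rsa.pub" >> "$tmp/authorized_keys"
key_authorized "$tmp/id_rsa.pub" "$tmp/authorized_keys" && echo "key installed"
rm -rf "$tmp"
# On the real machine: key_authorized ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```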
IV. Download Hadoop 2.2.0
cd ~
wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
tar zxf hadoop-2.2.0.tar.gz
sudo mv hadoop-2.2.0 /usr/local
cd /usr/local
sudo mv hadoop-2.2.0 hadoop
sudo chown -R hduser:hadoop hadoop
V. Configure environment variables for hduser
vi ~/.bashrc
Append the following at the end of the file:
# Hadoop variables start
export JAVA_HOME=/usr/lib/jvm/jdk/
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
# Hadoop variables end
Then reload the file and edit hadoop-env.sh:
source ~/.bashrc
cd /usr/local/hadoop/etc/hadoop
vi hadoop-env.sh
Change the JAVA_HOME line to:
export JAVA_HOME=/usr/lib/jvm/jdk/
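The same edit can be made non-interactively, which is handy when scripting the install. A sketch (set_java_home is my own helper, demonstrated on a temp file; the real target path is the hadoop-env.sh assumed above):

```shell
# set_java_home: rewrite the JAVA_HOME line of a hadoop-env.sh-style
# file in place (GNU sed).
set_java_home() {
  sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/jdk/|' "$1"
}

# Demonstration on a throwaway file:
f=$(mktemp)
echo 'export JAVA_HOME=${JAVA_HOME}' > "$f"
set_java_home "$f"
grep '^export JAVA_HOME=' "$f"   # prints: export JAVA_HOME=/usr/lib/jvm/jdk/
rm -f "$f"
# On the real machine: set_java_home /usr/local/hadoop/etc/hadoop/hadoop-env.sh
```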
Run hadoop version to check the installed version:
hadoop version
Hadoop 2.2.0
Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
Compiled by hortonmu on 2013-10-07T06:28Z
Compiled with protoc 2.5.0
From source with checksum 79e53ce7994d1628b240f09af91e1af4
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.2.0.jar
VI. Configure Hadoop
cd /usr/local/hadoop/etc/hadoop
vi core-site.xml
Add the following inside the <configuration> section (fs.default.name is the deprecated 1.x-era name; in 2.x the same setting is called fs.defaultFS, and either works here):
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
vi yarn-site.xml
Add the following inside the <configuration> section. Note the underscore in mapreduce_shuffle: Hadoop 2.2.0 rejects aux-service names containing dots, so the mapreduce.shuffle spelling found in older guides makes the NodeManager fail to start:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
mv mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
Add the following inside the <configuration> section:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
cd ~
mkdir -p mydata/hdfs/namenode
mkdir -p mydata/hdfs/datanode
cd /usr/local/hadoop/etc/hadoop
vi hdfs-site.xml
Add the following inside the <configuration> section:
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/hduser/mydata/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/hduser/mydata/hdfs/datanode</value>
</property>
VII. Format the NameNode
hdfs namenode -format
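Be aware that hdfs namenode -format wipes any existing HDFS metadata, so it should only be run once, on a fresh directory. A cautious wrapper, as a sketch (safe_format_check is my own helper, not part of Hadoop):

```shell
# safe_format_check: report whether a namenode directory is safe to
# format; formatting a non-empty one destroys existing HDFS metadata.
safe_format_check() {
  if [ -z "$(ls -A "$1" 2>/dev/null)" ]; then
    echo "ok to format: $1"
  else
    echo "refusing: $1 is not empty"
  fi
}

safe_format_check "$HOME/mydata/hdfs/namenode"
# Only when it prints "ok to format" should you run: hdfs namenode -format
```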
VIII. Start the Hadoop services
start-dfs.sh
start-yarn.sh
If everything above is configured correctly, running jps should show output like this (the PIDs will differ):
11898 NameNode
12020 DataNode
12178 SecondaryNameNode
12429 ResourceManager
17802 Jps
12530 NodeManager
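That expected jps output can also be checked mechanically. A hedged sketch (check_daemons is my own helper, not a Hadoop tool), demonstrated on the sample output above; on a real machine, pipe the live jps output into it:

```shell
# check_daemons: read jps output on stdin and report any of the five
# expected Hadoop daemons that are missing.
check_daemons() {
  out="$(cat)"
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    printf '%s\n' "$out" | grep -qw "$d" || echo "missing: $d"
  done
}

sample="11898 NameNode
12020 DataNode
12178 SecondaryNameNode
12429 ResourceManager
17802 Jps
12530 NodeManager"

printf '%s\n' "$sample" | check_daemons   # prints nothing: all five present
# On the real machine: jps | check_daemons
```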