Installing Hadoop 2.6.0 on Ubuntu 14.04
1. First, create the hadoop user:
Check that the new user now has a folder under /home/:
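The commands for this step are not shown in the text. What follows is a minimal sketch, assuming the standard Ubuntu useradd, passwd and adduser tools. The root-only commands are left as comments, and user_status is a hypothetical helper added here only to make the check explicit.

```shell
# Creating the account needs root, so these are shown as comments:
#   sudo useradd -m hadoop -s /bin/bash   # -m creates /home/hadoop
#   sudo passwd hadoop                    # set a login password
#   sudo adduser hadoop sudo              # optional: let hadoop use sudo

# Hypothetical helper: report whether a user and its /home folder exist.
user_status() {
    if id "$1" >/dev/null 2>&1 && [ -d "/home/$1" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

user_status hadoop
```

Adding the user to the sudo group is an assumption; it matches the later steps, which run sudo as the hadoop user.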
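2. Install ssh:
3. Switch to the hadoop user, start the ssh service, and verify that you can log in to localhost with a password
4. Log out, then configure passwordless ssh login to localhost
Append ~/.ssh/id_rsa.pub from the hadoop user's home directory to ~/.ssh/authorized_keys
Tighten the permissions on .ssh and authorized_keys so that access is not refused when Hadoop runs.
Verify that you can log in to localhost without a password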
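The key setup described above could be sketched as follows. This is an illustration under assumptions, not the author's exact commands: SSH_DIR is a hypothetical variable that defaults to ~/.ssh, the key is RSA with an empty passphrase, and the modes 700/600 are the usual choice rather than values given in the text. Installing the server itself (for example sudo apt-get install openssh-server) needs root and is left out.

```shell
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
mkdir -p "$SSH_DIR"

# Generate an RSA key pair with an empty passphrase, unless one already exists.
if [ ! -f "$SSH_DIR/id_rsa" ] && command -v ssh-keygen >/dev/null 2>&1; then
    ssh-keygen -q -t rsa -N "" -f "$SSH_DIR/id_rsa"
fi

# Append the public key to authorized_keys (skipped if there is no public key).
touch "$SSH_DIR/authorized_keys"
if [ -f "$SSH_DIR/id_rsa.pub" ]; then
    cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
fi

# Only the owner may access the directory and the key list.
chmod 700 "$SSH_DIR"
chmod 600 "$SSH_DIR/authorized_keys"

# Then check the login by hand:  ssh localhost
```

Running the block twice appends the same key again, which is harmless but untidy.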
1. Extract Hadoop into /usr/local:
- sudo tar -zxvf hadoop-2.6.0.tar.gz
- sudo mv hadoop-2.6.0 /usr/local/hadoop
- sudo chmod -R 775 /usr/local/hadoop
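- sudo chown -R hadoop:hadoop /usr/local/hadoop //otherwise access will be denied
2. Edit the bashrc configuration: sudo gedit ~/.bashrc
3. Run source ~/.bashrc to apply the changes
4. Edit hadoop-env.sh: sudo gedit /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Find JAVA_HOME and set it to the same value as in ~/.bashrc.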
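The lines added to ~/.bashrc are not reproduced in the text. A minimal sketch of what they might look like is shown here. The JDK path is an assumption (the usual OpenJDK 7 location on Ubuntu 14.04), so check your own installation before copying it.

```shell
# Assumed JDK location; keep an already-set JAVA_HOME if there is one.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/java-7-openjdk-amd64}"

# Hadoop was moved to /usr/local/hadoop in step 1.
export HADOOP_HOME=/usr/local/hadoop
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"

# hadoop-env.sh then gets the same JAVA_HOME, written out literally, e.g.:
#   export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
echo "JAVA_HOME=$JAVA_HOME"
```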
5. Test:
- Run the WordCount example that ships with Hadoop to check that the installation works
- Create an input folder under /usr/local/hadoop
- mkdir input
- cp README.txt input
Run the following command from the hadoop directory:
- bin/hadoop jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.6.0-sources.jar org.apache.hadoop.examples.WordCount input output
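To judge whether the job's output is plausible, it can help to count the same words with ordinary command-line tools. The sketch below is not part of Hadoop; count_words is a hypothetical helper that splits on whitespace, which is roughly what the WordCount example does, and prints word<TAB>count lines.

```shell
# Hypothetical helper: print "word<TAB>count" for each whitespace-separated
# token in the file given as $1, sorted by word in byte order.
count_words() {
    tr -s '[:space:]' '\n' < "$1" \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Example (from /usr/local/hadoop):
#   count_words input/README.txt | head
#   cat output/part-r-00000 | head
```

The two listings should agree line for line if the job ran correctly on the same file.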
Pseudo-distributed configuration (everything above is standalone mode):
sudo gedit /usr/local/hadoop/etc/hadoop/core-site.xml
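The contents of core-site.xml are not given in the text. As a rough sketch, a pseudo-distributed setup typically sets fs.defaultFS and hadoop.tmp.dir. The values below are assumptions: localhost:9000 is only a common convention, and the tmp path matches the tmp directory created later under /usr/local/hadoop. The other configuration files in this post use the hostname master; if that name points at this machine in /etc/hosts, you may prefer it here. The file is written to a scratch directory (CONF_OUT, a hypothetical variable) so it can be reviewed before being copied into /usr/local/hadoop/etc/hadoop.

```shell
# Write the sketch to a scratch directory, not straight into the Hadoop tree.
conf_out="${CONF_OUT:-$(mktemp -d)}"

cat > "$conf_out/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Assumed NameNode address; host and port are not from the original text -->
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <!-- Assumed to be the tmp directory created under /usr/local/hadoop -->
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
EOF

echo "wrote $conf_out/core-site.xml"
```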
sudo gedit /usr/local/hadoop/etc/hadoop/yarn-site.xml
- <property>
- <name>yarn.resourcemanager.hostname</name>
- <value>master</value>
- </property>
- <property>
- <description>The address of the applications manager interface in the RM.</description>
- <name>yarn.resourcemanager.address</name>
- <value>${yarn.resourcemanager.hostname}:8032</value>
- </property>
- <property>
- <description>The address of the scheduler interface.</description>
- <name>yarn.resourcemanager.scheduler.address</name>
- <value>${yarn.resourcemanager.hostname}:8030</value>
- </property>
- <property>
- <description>The http address of the RM web application.</description>
- <name>yarn.resourcemanager.webapp.address</name>
- <value>${yarn.resourcemanager.hostname}:8088</value>
- </property>
- <property>
- <description>The https address of the RM web application.</description>
- <name>yarn.resourcemanager.webapp.https.address</name>
- <value>${yarn.resourcemanager.hostname}:8090</value>
- </property>
- <property>
- <name>yarn.resourcemanager.resource-tracker.address</name>
- <value>${yarn.resourcemanager.hostname}:8031</value>
- </property>
- <property>
- <description>The address of the RM admin interface.</description>
- <name>yarn.resourcemanager.admin.address</name>
- <value>${yarn.resourcemanager.hostname}:8033</value>
- </property>
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
sudo gedit /usr/local/hadoop/etc/hadoop/mapred-site.xml (if the file does not exist, copy it from mapred-site.xml.template first)
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.address</name>
- <value>master:10020</value>
- <description>MapReduce JobHistory Server IPC host:port</description>
- </property>
- <property>
- <name>mapreduce.jobhistory.webapp.address</name>
- <value>master:19888</value>
- <description>MapReduce JobHistory Server Web UI host:port</description>
- </property>
sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml
- <configuration>
- <property>
- <name>dfs.replication</name>
- <value>1</value>
- </property>
- <property>
- <name>dfs.namenode.name.dir</name>
- <value>file:/usr/local/hadoop/dfs/name</value>
- </property>
- <property>
- <name>dfs.datanode.data.dir</name>
- <value>file:/usr/local/hadoop/dfs/data</value>
- </property>
- <property> <!-- set so that Eclipse is not refused read/write access later on -->
- <name>dfs.permissions</name>
- <value>false</value>
- </property>
- </configuration>
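sudo gedit /usr/local/hadoop/etc/hadoop/masters and add: localhost
sudo gedit /usr/local/hadoop/etc/hadoop/slaves and add: localhost
Once the configuration is done, first create the required temporary directories under the Hadoop directory (be sure not to use sudo when creating them):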
- cd /usr/local/hadoop
- mkdir tmp dfs dfs/name dfs/data
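6. Next, initialize the HDFS file system.
- bin/hdfs namenode -format //clear the files in dfs/data/ each time before running this command
If the output ends with "Exiting with status 0", the format succeeded;
"Exiting with status 1" means an error occurred.
Start hadoop: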
- sbin/start-dfs.sh
- sbin/start-yarn.sh
Start the JobHistory server:
sbin/mr-jobhistory-daemon.sh start historyserver
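A quick way to confirm that everything started is to look at the jps listing from the JDK. The sketch below assumes jps is on the PATH. missing_daemons is a hypothetical helper that reads a jps-style listing on standard input and prints the name of every expected daemon that does not appear in it.

```shell
# Hypothetical helper: read a jps-style listing ("<pid> <name>" per line)
# from stdin and print each expected daemon that is not in it.
missing_daemons() {
    listing=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode \
                  ResourceManager NodeManager JobHistoryServer; do
        # -w matches whole words, so SecondaryNameNode does not count as NameNode
        printf '%s\n' "$listing" | grep -qw "$daemon" || echo "$daemon"
    done
}

# Usage:  jps | missing_daemons
# No output means all six daemons are running.
```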
Run the example:
1. First create the folders on HDFS: bin/hdfs dfs -mkdir -p /user/hadoop/input
bin/hdfs dfs -mkdir -p /user/hadoop/output
Upload a file:
Check that the file was uploaded:
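The upload and listing commands are not shown in the text. One way to do both steps, assuming the standard hdfs dfs -put and -ls subcommands, is sketched here. upload_input and the HDFS_CMD override are hypothetical names introduced for this example.

```shell
# Hypothetical helper: copy a local file ($1) into an HDFS directory ($2),
# then list that directory to confirm the file arrived.
# HDFS_CMD can be overridden; it defaults to bin/hdfs relative to /usr/local/hadoop.
upload_input() {
    "${HDFS_CMD:-bin/hdfs}" dfs -put "$1" "$2" \
        && "${HDFS_CMD:-bin/hdfs}" dfs -ls "$2"
}

# Usage (from /usr/local/hadoop):
#   upload_input README.txt /user/hadoop/input
```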
Run the job with the following command:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hadoop/input /user/hadoop/output/wordcount3
View the result:
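The command for viewing the result is not shown in the text. A small sketch, assuming the reducer output files follow the usual part-r-* naming: show_wordcount_result is a hypothetical helper that takes the output directory passed to the job.

```shell
# Hypothetical helper: print the reducer output files in the HDFS
# directory given as $1. The glob is quoted so that HDFS expands it,
# not the local shell.
show_wordcount_result() {
    "${HDFS_CMD:-bin/hdfs}" dfs -cat "$1/part-r-*"
}

# Usage (from /usr/local/hadoop), with the output directory used for the job:
#   show_wordcount_result <output-dir>
```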
That completes the Hadoop pseudo-distributed installation!
Hadoop installation reference: http://blog.youkuaiyun.com/ggz631047367/article/details/42426391
Running wordcount in pseudo-distributed mode: http://www.linuxidc.com/Linux/2015-01/112029p2.htm