On 64-bit Windows there is no need to fiddle with Cygwin to install Hadoop: unpack the official release locally -> write a minimal configuration into 4 files -> run 1 startup command -> done. The one prerequisite is that the JDK is already installed on your machine and the Java environment variables are set. The steps are spelled out below, using Hadoop 2.7.2 as the example.
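Before going further, it is worth a quick check that the JDK prerequisite is met (a minimal sketch, not from the original article; JDK paths vary by machine):

C:\>java -version
C:\>echo %JAVA_HOME%

java -version should print the JDK version, and JAVA_HOME should point at the JDK install directory itself, not its bin subdirectory.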
1. Downloading the Hadoop package needs little explanation: go to http://hadoop.apache.org/ -> click Releases on the left -> click mirror site -> click http://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common -> download hadoop-2.7.2.tar.gz.
2. Unpacking is just as simple: copy the archive to the root of drive D and extract it there, producing the directory D:\hadoop-2.7.2. Set that path as the HADOOP_HOME environment variable and append %HADOOP_HOME%\bin to PATH. Then download the helper tools from http://download.youkuaiyun.com/detail/wuxun1997/9841472, extract them, drop the files into D:\hadoop-2.7.2\bin, and copy hadoop.dll into C:\Windows\System32 as well.
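To confirm the environment variables and helper tools are in place, open a new command prompt (so the updated variables take effect) and try the following (a sketch, not part of the original steps):

D:\>hadoop version
D:\>dir D:\hadoop-2.7.2\bin\winutils.exe

hadoop version should print the build information for 2.7.2; an error about JAVA_HOME usually means the Java variables from the prerequisite were not picked up.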
3. In D:\hadoop-2.7.2\etc\hadoop, find the following 4 files and paste in the minimal configuration below:
core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/hadoop/data/dfs/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/hadoop/data/dfs/datanode</value>
    </property>
</configuration>
mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
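One Windows-specific pitfall worth mentioning here (an addition, not part of the original steps): if the JDK sits under a path containing spaces, such as C:\Program Files\Java, the startup scripts can fail to parse JAVA_HOME. A common workaround is to set the 8.3 short path in D:\hadoop-2.7.2\etc\hadoop\hadoop-env.cmd:

@rem in hadoop-env.cmd -- PROGRA~1 is the 8.3 short name for "Program Files";
@rem the JDK folder name below is only an example, adjust it to your install
set JAVA_HOME=C:\PROGRA~1\Java\jdk1.8.0_92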
4. Format the namenode (needed only on the very first start), then launch everything with one command:

D:\hadoop-2.7.2\bin>hadoop namenode -format
...
D:\hadoop-2.7.2\bin>cd ..\sbin
D:\hadoop-2.7.2\sbin>start-all.cmd
This script is Deprecated. Instead use start-dfs.cmd and start-yarn.cmd
starting yarn daemons
D:\hadoop-2.7.2\sbin>jps
4944 DataNode
5860 NodeManager
3532 Jps
7852 NameNode
7932 ResourceManager
D:\hadoop-2.7.2\sbin>
The jps command shows that all four processes are up, so the Hadoop installation and startup are done. We can now open a browser to localhost:8088 to watch MapReduce jobs, and to localhost:50070 -> Utilities -> Browse the file system to inspect HDFS files. When restarting Hadoop there is no need to format the namenode again; just run stop-all.cmd followed by start-all.cmd.
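In other words, a routine restart is just:

D:\hadoop-2.7.2\sbin>stop-all.cmd
D:\hadoop-2.7.2\sbin>start-all.cmd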
Launching the daemons pops up four console windows, one per process, whose logs show what each daemon did during startup. With everything running, let's try a few basic HDFS operations:
1. Create the input directory
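The commands for this step did not survive on the original page; a minimal sketch, assuming the /user/wcinput directory that step 3 below lists (the parent directory is created first, since plain -mkdir requires it to exist):

D:\hadoop-2.7.2\bin>hadoop fs -mkdir hdfs://localhost:9000/user/
D:\hadoop-2.7.2\bin>hadoop fs -mkdir hdfs://localhost:9000/user/wcinput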
2. Upload data into the directory
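Again the original commands are missing; a plausible sketch, assuming a local text file named file1.txt (a hypothetical name) in the current directory:

D:\hadoop-2.7.2\bin>hadoop fs -put file1.txt hdfs://localhost:9000/user/wcinput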
3. View the files
D:\hadoop-2.7.2\bin>hadoop fs -ls hdfs://localhost:9000/user/wcinput