1. Environment
macOS High Sierra, version 10.13.3
Goal: install Hadoop-3.1.0 in pseudo-distributed mode
2. Install the JDK
Install a Java environment and configure the JAVA_HOME and PATH variables.
My Java version is "1.8.0_162".
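On macOS, the JDK install path you will need for JAVA_HOME in a later step can be printed with the system's java_home helper (a quick sanity check; /usr/libexec/java_home exists only on macOS):

```shell
# Show the active Java version.
java -version

# macOS-only: print the JDK home path to use for JAVA_HOME.
/usr/libexec/java_home
```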
3. SSH setup
Type ssh localhost in a terminal. You may hit the following error:
ssh: connect to host localhost port 22: Connection refused
This means Remote Login is disabled. Open System Preferences -> Sharing -> Remote Login, enable it, then try ssh localhost again.
If you have never generated an SSH key pair, create one now. (Check whether ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub exist to see if a key pair was generated before, or simply run ssh localhost and see whether it succeeds without a password.)
If no key pair exists, run:
$ ssh-keygen -t rsa -P ""
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The second command appends your own public key to the authorized_keys file, so that ssh localhost no longer prompts for a password.
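The check-then-generate steps above can be sketched as one script (the chmod on authorized_keys is an addition: sshd may silently ignore the file if its permissions are too open):

```shell
#!/bin/sh
# Generate an RSA key pair only if one does not already exist.
if [ ! -f "$HOME/.ssh/id_rsa" ]; then
    ssh-keygen -t rsa -P "" -f "$HOME/.ssh/id_rsa"
fi

# Authorize our own key so that `ssh localhost` needs no password.
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"   # sshd rejects overly permissive files
```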
4. Hadoop-3.1.0
Download hadoop-3.1.0.tar.gz.
After the download, unpack it (I unpacked mine to /Users/kang/hadoop-3.1.0).
5. Set environment variables
In a terminal, run vim ~/.bash_profile.
(vim may show a security prompt asking whether you really want to edit the file; press E to edit.)
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_162.jdk/Contents/Home
export CLASSPATH=$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:.
export PATH=$JAVA_HOME/bin:$PATH:.
export HADOOP_HOME=/Users/kang/hadoop-3.1.0 # path where Hadoop was unpacked
export HADOOP_HOME_WARN_SUPPRESS=1 # suppresses "Warning: $HADOOP_HOME is deprecated"
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
After adding the variables above, return to the terminal and run
source ~/.bash_profile # make the new variables take effect
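To confirm the variables took effect, echo them back and run the hadoop launcher (a quick check; `hadoop version` only works once $HADOOP_HOME/bin is actually on the PATH):

```shell
# Confirm the variables are visible in the current shell.
echo "$JAVA_HOME"
echo "$HADOOP_HOME"

# Confirm the hadoop launcher is found on the PATH and runs.
hadoop version
```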
6. Configure Hadoop's five configuration files
Enter the etc/hadoop directory under the Hadoop installation.
core-site.xml specifies the NameNode's host name and port:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/Users/kang/hadoop-3.1.0/tmp/hadoop-${user.name}</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<!-- fs.default.name is deprecated in Hadoop 3; fs.defaultFS is the current name -->
<name>fs.defaultFS</name>
<value>hdfs://localhost:8000</value>
</property>
</configuration>
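The hadoop.tmp.dir path above must be writable by your user; creating the base directory up front avoids permission surprises later (Hadoop will otherwise try to create it itself):

```shell
# Create the base temporary directory referenced by hadoop.tmp.dir.
mkdir -p /Users/kang/hadoop-3.1.0/tmp
ls -ld /Users/kang/hadoop-3.1.0/tmp
```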
hadoop-env.sh sets the JDK location:
export JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_162.jdk/Contents/Home"
hdfs-site.xml sets the HDFS block replication factor; since everything runs on a single node, the replication factor is 1:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
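Once the file is edited, the effective value can be read back with `hdfs getconf` (works after HADOOP_HOME is on the PATH, as configured above):

```shell
# Print the replication factor as Hadoop actually sees it.
hdfs getconf -confKey dfs.replication
```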
mapred-site.xml
In Hadoop 3 there is no JobTracker anymore (the original mapred.job.tracker and mapred.tasktracker.* properties date from Hadoop 1.x and are ignored); MapReduce jobs run as YARN applications, selected with mapreduce.framework.name:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml (note the name/value pair must be wrapped in a <property> element):
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
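Hadoop ships a validator for these XML files (it appears as `conftest` in the usage output below); running it after editing catches malformed configuration before any daemon starts:

```shell
# Validate the *-site.xml files in the active configuration directory.
hadoop conftest
```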
7. Start Hadoop
At this point, typing hadoop in a terminal prints its usage:
Usage: hadoop [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]
or hadoop [OPTIONS] CLASSNAME [CLASSNAME OPTIONS]
where CLASSNAME is a user-provided Java class
OPTIONS is none or any of:
--config dir Hadoop config directory
--debug turn on shell script debug mode
--help usage information
buildpaths attempt to add class files from build tree
hostnames list[,of,host,names] hosts to use in slave mode
hosts filename list of hosts to use in slave mode
loglevel level set the log4j level for this command
workers turn on worker mode
SUBCOMMAND is one of:
Admin Commands:
daemonlog get/set the log level for each daemon
Client Commands:
archive create a Hadoop archive
checknative check native Hadoop and compression libraries availability
classpath prints the class path needed to get the Hadoop jar and the
required libraries
conftest validate configuration XML files
credential interact with credential providers
distch distributed metadata changer
distcp copy file or directories recursively
dtutil operations related to delegation tokens
envvars display computed Hadoop environment variables
fs run a generic filesystem user client
gridmix submit a mix of synthetic job, modeling a profiled from
production load
jar <jar> run a jar file. NOTE: please use "yarn jar" to launch YARN
applications, not this command.
jnipath prints the java.library.path
kdiag Diagnose Kerberos Problems
kerbname show auth_to_local principal conversion
key manage keys via the KeyProvider
rumenfolder scale a rumen input trace
rumentrace convert logs into a rumen trace
s3guard manage metadata on S3
trace view and modify Hadoop tracing settings
version print the version
Daemon Commands:
kms run KMS, the Key Management Server
SUBCOMMAND may print help when invoked w/o parameters or with -h.
Before running anything, format the NameNode with hadoop namenode -format (in Hadoop 3 the preferred form is hdfs namenode -format). The output should end like this:
......(output omitted)
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ...
************************************************************/
A clean SHUTDOWN_MSG like the above means the NameNode was formatted successfully.
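A formatted NameNode writes its metadata under dfs/name inside hadoop.tmp.dir (the default dfs.namenode.name.dir is ${hadoop.tmp.dir}/dfs/name); checking for the VERSION file is a quick way to confirm the format took. The exact hadoop-* directory name depends on your user name:

```shell
# After formatting, the NameNode metadata directory should exist
# under hadoop.tmp.dir (configured above as .../tmp/hadoop-${user.name}).
ls /Users/kang/hadoop-3.1.0/tmp/hadoop-*/dfs/name/current/VERSION
```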
Run start-all.sh to start all the daemons:
kangdeMacBook-Pro:~ kang$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as kang in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [kangdeMacBook-Pro.local]
2018-04-16 22:35:33,561 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers
Use jps to check the running processes:
kangdeMacBook-Pro:~ kang$ jps
5792 Jps
5410 SecondaryNameNode
5604 ResourceManager
5705 NodeManager
5167 NameNode
Note that no DataNode process appears in the listing above. A pseudo-distributed setup normally shows NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager; if the DataNode is missing, check its log under $HADOOP_HOME/logs (a common cause is re-formatting the NameNode while an old DataNode data directory is still present).
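With the daemons up, a short HDFS round trip confirms the cluster actually works (the file name and paths here are illustrative):

```shell
# Create a home directory in HDFS and round-trip a small file.
hdfs dfs -mkdir -p /user/$(whoami)
echo "hello hadoop" > /tmp/hello.txt
hdfs dfs -put /tmp/hello.txt /user/$(whoami)/
hdfs dfs -cat /user/$(whoami)/hello.txt
```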
Open http://localhost:9870 in a browser to see the HDFS NameNode web UI (the port moved from 50070 to 9870 in Hadoop 3).
Open http://localhost:8088 to see the YARN ResourceManager web UI.
To stop the Hadoop services, run stop-all.sh in a terminal.