Download hadoop-1.0.3
Verify that ssh is installed:
>which ssh
>which sshd
>which ssh-keygen
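If any of these are missing, install OpenSSH first (the package names below are the standard Ubuntu ones):
>sudo apt-get install openssh-client openssh-server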
Generate an ssh key pair:
>ssh-keygen -t rsa
View the ssh public key:
>more /home/hadoopadmin/.ssh/id_rsa.pub (hadoopadmin is the Ubuntu username)
Copy the public key to the master and to each slave node:
>scp ~/.ssh/id_rsa.pub hadoopadmin@target:~/master_key (target is the slave's IP; repeat for every slave)
On each slave, manually install it as authorized_keys:
~>mkdir .ssh
~>mv master_key .ssh/authorized_keys
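If ssh still prompts for a password after this, check the permissions; sshd silently ignores these files when they are group- or world-writable:
~>chmod 700 .ssh
~>chmod 600 .ssh/authorized_keys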
This deploys the ssh public key generated on the master to the slave, so ssh slave no longer prompts for a password.
Error: Agent admitted failure to sign using the key.
Fix: manually add the private key to the ssh agent (reference below):
http://blog.youkuaiyun.com/jiangsq12345/article/details/6187144
~>ssh-add .ssh/id_rsa
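To confirm the key is now loaded in the agent:
~>ssh-add -l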
Start hadoop
Edit conf/hadoop-env.sh:
export JAVA_HOME=/work/env/jdk1.6.0_26
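The JDK path above is just this machine's local install; adjust it to yours. To verify it points at a working JDK:
>/work/env/jdk1.6.0_26/bin/java -version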
Set up passwordless ssh to localhost:
~>ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
~>cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
After this, ssh localhost no longer prompts for a password.
The main hadoop configuration files are:
core-site.xml
hdfs-site.xml
mapred-site.xml
(1) Standalone mode
The default out of the box: hadoop runs in a single Java process with no daemons, which is mainly useful for debugging.
(2) Pseudo-distributed mode
Edit the hadoop configuration files.
core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
>cd $HADOOP_HOME
>bin/hadoop namenode -format
>bin/start-all.sh
You should see the namenode, secondarynamenode, and jobtracker start on the local machine, and the datanode and tasktracker start on localhost via ssh.
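A quick way to confirm that all five daemons are up is jps, which ships with the JDK:
>jps
The output should list NameNode, SecondaryNameNode, JobTracker, DataNode, and TaskTracker (plus Jps itself).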
Check hadoop status: http://localhost:50070/dfshealth.jsp (the namenode web UI; the jobtracker UI is at http://localhost:50030/jobtracker.jsp)
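As a smoke test, run the examples jar shipped with the release (the grep job below is the one from the hadoop quickstart; the input/output names are arbitrary):
>bin/hadoop fs -put conf input
>bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
>bin/hadoop fs -cat output/*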
(3) Fully distributed mode
Similar to pseudo-distributed mode:
core-site.xml defines the namenode
mapred-site.xml defines the jobtracker
hdfs-site.xml defines the replication factor
masters defines the SecondaryNameNode (see the example after this list)
slaves defines the datanodes and tasktrackers
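A minimal sketch with hypothetical hostnames master, slave1, slave2:
core-site.xml: fs.default.name = hdfs://master:9000
mapred-site.xml: mapred.job.tracker = master:9001
conf/masters contains the single line:
master
conf/slaves contains one host per line:
slave1
slave2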
Hadoop's default working directory lives under /tmp (hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}), and that directory may be wiped when the machine reboots. You can point the working directories somewhere durable in conf/hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/work/hadoop/hadoop-1.0.3/workdir/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/work/hadoop/hadoop-1.0.3/workdir/datanode</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/work/hadoop/hadoop-1.0.3/workdir/tmp</value>
  </property>
</configuration>
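Since dfs.name.dir now points at a fresh, empty location, format the namenode again before starting the cluster (note that formatting erases any existing HDFS metadata):
>bin/hadoop namenode -format
>bin/start-all.sh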