Hadoop 1.x Pseudo-Distributed Setup

This article is a complete walkthrough, from installing the JDK to configuring the Hadoop environment: creating a user, configuring the hostname, downloading and unpacking the Hadoop tarball, editing the core configuration files and environment variables, and finally formatting the NameNode and starting all daemons to complete a pseudo-distributed Hadoop installation.
Download Hadoop 1.0.4 and JDK 1.6 (or later):
1. Install the JDK. The install directory is up to you; mine is:

  /usr/jdk1.6.0_22
Configure the environment variables:

  [root@hadoop hadoop-1.0.4]# vi /etc/profile
Append the following to the bottom of the profile file:

  export JAVA_HOME=/usr/jdk1.6.0_22
  export PATH=$PATH:$JAVA_HOME/bin
Reload the environment variables:

  [root@hadoop hadoop-1.0.4]# source /etc/profile
2. Install Hadoop
Create a hadoop user and a matching hadoop group.
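The original does not show the commands for this step; run as root, it might look like the following (the -m flag creates the home directory /home/hadoop that later steps rely on):

```shell
# Create the hadoop group and a hadoop user belonging to it (run as root).
groupadd hadoop
# -m creates the home directory /home/hadoop used by the rest of the setup.
useradd -m -g hadoop hadoop
# passwd hadoop    # (optional) set a login password interactively
```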
Map the machine's IP address to a hostname in the hosts file:

  [root@hadoop hadoop-1.0.4]# vi /etc/hosts
Add the following line at the bottom of the hosts file:

  192.168.0.101   hadoop.master
Next, download the Hadoop tarball; version 1.0.4 is available from the Apache website. I install Hadoop under the hadoop user's home directory. After unpacking the archive, there are three core configuration files to edit: core-site.xml, hdfs-site.xml, and mapred-site.xml.
Configure core-site.xml:
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <!-- Put site-specific property overrides in this file. -->

  <configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop.master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
  </configuration>
Configure hdfs-site.xml:
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <!-- Put site-specific property overrides in this file. -->

  <configuration>
  <property>
     <name>dfs.replication</name>
     <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/hdfs/data</value>
  </property>
  </configuration>
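The hadoop.tmp.dir, dfs.name.dir, and dfs.data.dir paths above must be writable by the hadoop user; pre-creating them avoids permission surprises (the name directory is also initialized later by the format step). A small sketch, run as the hadoop user:

```shell
# Pre-create the directories referenced by core-site.xml and hdfs-site.xml
# (hadoop.tmp.dir, dfs.name.dir, dfs.data.dir).
mkdir -p /home/hadoop/tmp /home/hadoop/hdfs/name /home/hadoop/hdfs/data
```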
Configure mapred-site.xml:
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <!-- Put site-specific property overrides in this file. -->

  <configuration>
  <property>
     <name>mapred.job.tracker</name>
     <value>hadoop.master:9001</value>
  </property>
  </configuration>
Finally, configure the hadoop-env.sh file:
  # Set Hadoop-specific environment variables here.

  # The only required environment variable is JAVA_HOME.  All others are
  # optional.  When running a distributed configuration it is best to
  # set JAVA_HOME in this file, so that it is correctly defined on
  # remote nodes.

  # The java implementation to use.  Required.
  export JAVA_HOME=/usr/jdk1.6.0_22

  # Extra Java CLASSPATH elements.  Optional.
  export HADOOP_HOME=/home/hadoop/hadoop-1.0.4
  # export HADOOP_CLASSPATH=

  # The maximum amount of heap to use, in MB. Default is 1000.
  # export HADOOP_HEAPSIZE=2000

  # Extra Java runtime options.  Empty by default.
  # export HADOOP_OPTS=-server

  # Command specific options appended to HADOOP_OPTS when specified
  export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
  export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
  export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
  export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
  export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
  # export HADOOP_TASKTRACKER_OPTS=
  # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
  # export HADOOP_CLIENT_OPTS

  # Extra ssh options.  Empty by default.
  # export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"

  # Where log files are stored.  $HADOOP_HOME/logs by default.
  # export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

  # File naming remote slave hosts.  $HADOOP_HOME/conf/slaves by default.
  # export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves

  # host:path where hadoop code should be rsync'd from.  Unset by default.
  # export HADOOP_MASTER=master:/home/$USER/src/hadoop

  # Seconds to sleep between slave commands.  Unset by default.  This
  # can be useful in large clusters, where, e.g., slave rsyncs can
  # otherwise arrive faster than the master can service them.
  # export HADOOP_SLAVE_SLEEP=0.1

  # The directory where pid files are stored. /tmp by default.
  # export HADOOP_PID_DIR=/var/hadoop/pids

  # A string representing this instance of hadoop. $USER by default.
  # export HADOOP_IDENT_STRING=$USER

  # The scheduling priority for daemon processes.  See 'man nice'.
  # export HADOOP_NICENESS=10

  export PATH=$PATH:$HADOOP_HOME/bin
Lastly, add the Hadoop environment variables to the same file as the JDK ones (/etc/profile), right below the JDK entries:
  export HADOOP_HOME=/home/hadoop/hadoop-1.0.4
  export PATH=$PATH:$HADOOP_HOME/bin
At this point, all of the environment configuration is complete.
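One prerequisite the steps above do not show: start-all.sh launches the daemons over ssh, so the hadoop user normally needs passwordless SSH to the local machine. A minimal sketch, run as the hadoop user:

```shell
# Generate an RSA key pair (if one does not already exist) and authorize it
# for passwordless ssh logins to this machine.
mkdir -p ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
# Verify: "ssh hadoop.master" should now log in without a password prompt.
```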
Before starting Hadoop for the first time, the NameNode must be formatted:

  [hadoop@hadoop bin]$ hadoop namenode -format
Once formatting finishes, start Hadoop with:

  [hadoop@hadoop bin]$ ./start-all.sh
Use jps to verify that the Hadoop daemons are running:

  [hadoop@hadoop bin]$ jps
  2734 NameNode
  2860 DataNode
  2996 SecondaryNameNode
  3090 JobTracker
  3261 TaskTracker
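Beyond jps, a quick smoke test (assuming the daemons are up) is to exercise HDFS from the command line; the directory name here is just an example:

```shell
# Create a directory in HDFS, list the filesystem root, and print a
# cluster health report showing the live DataNode.
hadoop fs -mkdir /test
hadoop fs -ls /
hadoop dfsadmin -report
```

The NameNode and JobTracker web UIs (ports 50070 and 50030 by default in Hadoop 1.x) are another quick check.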
The Hadoop pseudo-distributed installation is now complete.