Hadoop Single Node Setup

Purpose

Prerequisites

Supported Platforms

Required Software

Required software for Linux and Windows includes:

  1. Java™ 1.6.x, preferably from Sun, must be installed.
  2. ssh must be installed and sshd must be running in order to use the Hadoop scripts that manage remote Hadoop daemons.

Additional requirements for Windows include:

  1. Cygwin - Required for shell support in addition to the required software above.

Installing Software

If your cluster doesn't have the requisite software, you will need to install it.

For example, on Ubuntu Linux:

$ sudo apt-get install ssh 
$ sudo apt-get install rsync

On Windows, if you did not install the required software when you installed cygwin, start the cygwin installer and select the packages:

  1. openssh - the Net category

Download

To get a Hadoop distribution, download a recent stable release from one of the Apache Download Mirrors.

Prepare to Start the Hadoop Cluster
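
In the original Apache guide, this section covers two steps: unpack the distribution, and point conf/hadoop-env.sh at your Java installation. A sketch of those steps, where the archive name and Java path are placeholders you must adjust:

```shell
# Unpack the downloaded release and enter it (version number is illustrative)
tar xzf hadoop-1.x.y.tar.gz
cd hadoop-1.x.y

# In conf/hadoop-env.sh, define at least JAVA_HOME to be the root
# of your Java installation, e.g. (path is an assumption):
#   export JAVA_HOME=/usr/lib/jvm/java-6-sun

# Running the hadoop script with no arguments prints its usage,
# confirming the installation is intact
bin/hadoop
```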

Standalone Operation
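
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process, which is useful for debugging. A sketch of the standard grep example from the Apache guide, run from the Hadoop install directory (the examples jar name varies by release):

```shell
# Copy the unpacked conf directory in to use as input
mkdir input
cp conf/*.xml input

# Find every match of the given regex and write results to output/
bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

# Display the results
cat output/*
```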

Pseudo-Distributed Operation

Hadoop can also be run on a single node in a pseudo-distributed mode where each Hadoop daemon runs in a separate Java process.

Configuration

Use the following: 

conf/core-site.xml:

<configuration>
     <property>
         <name>fs.default.name</name>
         <value>hdfs://localhost:9000</value>
     </property>
</configuration>


conf/hdfs-site.xml:

<configuration>
     <property>
         <name>dfs.replication</name>
         <value>1</value>
     </property>
</configuration>


conf/mapred-site.xml:

<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
     </property>
</configuration>

Set up passphraseless ssh

Now check that you can ssh to the localhost without a passphrase:

$ ssh localhost

If you cannot ssh to localhost without a passphrase, execute the following commands:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa 
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Execution

Format a new distributed filesystem:

$ bin/hadoop namenode -format

Start the hadoop daemons:

$ bin/start-all.sh

The hadoop daemon log output is written to the ${HADOOP_LOG_DIR} directory (defaults to ${HADOOP_HOME}/logs).


Browse the web interface for the NameNode and the JobTracker; by default they are available at:

  * NameNode - http://localhost:50070/
  * JobTracker - http://localhost:50030/

Copy the input files into the distributed filesystem:

$ bin/hadoop fs -put conf input

Run some of the examples provided:

$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
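
The pattern passed to the grep example is an ordinary regular expression. Its behavior can be sanity-checked locally with grep -E on some sample property names (the input lines below are invented for illustration):

```shell
# The example job extracts every string matching dfs[a-z.]+ ;
# only the lines beginning with "dfs" produce matches here
printf 'dfs.replication\ndfs.name.dir\nmapred.job.tracker\n' \
  | grep -Eo 'dfs[a-z.]+'
# prints:
#   dfs.replication
#   dfs.name.dir
```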

Examine the output files:


Copy the output files from the distributed filesystem to the local filesystem and examine them:

$ bin/hadoop fs -get output output 
$ cat output/*

or

View the output files on the distributed filesystem:

$ bin/hadoop fs -cat output/*

When you're done, stop the daemons with:

$ bin/stop-all.sh

Fully-Distributed Operation

For information on setting up fully-distributed, non-trivial clusters, see the Cluster Setup guide.

Java and JNI are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.
