Hadoop Pseudo-Distributed Mode (HDFS)

This article walks through deploying Hadoop in pseudo-distributed mode: creating a service user, configuring passwordless SSH, setting environment variables, and starting the HDFS service, then demonstrates basic operations from the command line.


Documentation: http://hadoop.apache.org/docs/r2.8.2/

Deployment modes:
1. Standalone mode: a single Java process
2. Pseudo-Distributed Mode: for development/learning; multiple Java processes on one machine
3. Cluster Mode: for production; multiple Java processes across multiple machines

Pseudo-distributed deployment: HDFS

1. Create a user for the Hadoop service
[root@hadoop02 software]# useradd hadoop
[root@hadoop02 software]# id hadoop
uid=515(hadoop) gid=515(hadoop) groups=515(hadoop)
[root@rzdatahadoop02 software]# 


[root@hadoop02 software]# vi /etc/sudoers
hadoop  ALL=(root)      NOPASSWD:ALL
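
A quick check that the sudoers entry took effect (sudo -n fails instead of prompting if a password would be required):

[root@hadoop02 software]# su - hadoop -c 'sudo -n whoami'   # should print "root" with no password prompt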

2. Deploy Java
Oracle JDK 1.8 (avoid OpenJDK where possible)
[root@hadoop02 jdk1.8.0_45]# which java
/usr/java/jdk1.8.0_45/bin/java
[root@hadoop02 jdk1.8.0_45]#
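
If the JDK is not installed yet, a minimal sketch of the setup (the rpm file name is an assumption; the install path matches the output above):

[root@hadoop02 software]# rpm -ivh jdk-8u45-linux-x64.rpm    # installs under /usr/java/jdk1.8.0_45
[root@hadoop02 software]# echo 'export JAVA_HOME=/usr/java/jdk1.8.0_45' >> /etc/profile
[root@hadoop02 software]# echo 'export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
[root@hadoop02 software]# source /etc/profile
[root@hadoop02 software]# java -version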

3. Make sure the sshd service is running
[root@hadoop02 ~]# service sshd status
openssh-daemon (pid  1386) is running...
[root@hadoop02 ~]# 



4. Unpack Hadoop
[root@hadoop02 software]# tar -xzvf hadoop-2.8.1.tar.gz
chown -R hadoop:hadoop <directory>   --> changes the directory and everything inside it
chown -R hadoop:hadoop <symlink>     --> changes only the symlink itself, not the target's contents
chown -R hadoop:hadoop <symlink>/*   --> changes the contents, but leaves the symlink untouched
chown -R hadoop:hadoop hadoop-2.8.1  --> changes the real directory (see the demo below)
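
A small demo of the symlink behavior listed above (the paths under /tmp are hypothetical):

[root@hadoop02 software]# mkdir -p /tmp/demo/real && touch /tmp/demo/real/f
[root@hadoop02 software]# ln -s /tmp/demo/real /tmp/demo/link
[root@hadoop02 software]# chown -R hadoop:hadoop /tmp/demo/link    # only the symlink's owner changes
[root@hadoop02 software]# ls -l /tmp/demo /tmp/demo/real           # real/ and real/f keep their original owner
[root@hadoop02 software]# chown -R hadoop:hadoop /tmp/demo/link/*  # the glob dereferences: contents change instead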
[root@hadoop02 software]# ln -s  hadoop-2.8.1 hadoop

[root@hadoop02 software]# cd hadoop
[root@hadoop02 hadoop]# rm -f *.txt
[root@hadoop02 hadoop]# ll
total 28
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 bin
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 etc
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 include
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 lib
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 libexec
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 sbin
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 share
[root@hadoop02 hadoop]# 


bin:  command-line tools
etc:  configuration files
sbin: scripts to start and stop the Hadoop daemons

5. Switch to the hadoop user and configure
[root@hadoop02 hadoop]# su - hadoop
[hadoop@hadoop02 ~]$ ll
total 0
[hadoop@hadoop02 ~]$ cd /opt/software/hadoop
[hadoop@hadoop02 hadoop]$ ll
total 28
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 bin
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 etc
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 include
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 lib
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 libexec
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 sbin
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 share
[hadoop@hadoop02 hadoop]$ cd etc/hadoop


hadoop-env.sh   : Hadoop environment settings
core-site.xml   : Hadoop core configuration
hdfs-site.xml   : HDFS configuration --> its daemons will be started
[mapred-site.xml : configuration for MapReduce jobs] only needed when running jar computations
yarn-site.xml   : YARN configuration --> its daemons will be started
slaves          : hostnames of the cluster machines


[hadoop@hadoop02 hadoop]$ vi core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>


<configuration>
    <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
    </property>
</configuration>


[hadoop@hadoop02 hadoop]$ vi hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

6. Set up passwordless SSH (a trust relationship) for the hadoop user
[hadoop@hadoop02 ~]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Created directory '/home/hadoop/.ssh'.
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
5b:07:ff:e5:82:85:f3:41:32:f3:80:05:c9:57:0f:e9 hadoop@rzdatahadoop002
The key's randomart image is:
+--[ RSA 2048]----+
|         ..o..o. |
|          oo. .o |
|          o.=.. .|
|           o OE  |
|        S . = + .|
|         o . * + |
|        .   . + .|
|               . |
|                 |
+-----------------+

[hadoop@hadoop02 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop02 ~]$ chmod 0600 ~/.ssh/authorized_keys
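
A quick check that the trust relationship works (the very first connection will still ask to accept the host key):

[hadoop@hadoop02 ~]$ ssh localhost date   # should print the date without asking for a password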

7. Format the NameNode
[hadoop@hadoop002 hadoop]$ bin/hdfs namenode -format
17/12/13 22:22:04 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
17/12/13 22:22:04 INFO namenode.FSImageFormatProtobuf: Saving image file /tmp/hadoop-hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
17/12/13 22:22:04 INFO namenode.FSImageFormatProtobuf: Image file /tmp/hadoop-hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds.
17/12/13 22:22:04 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/12/13 22:22:04 INFO util.ExitUtil: Exiting with status 0
17/12/13 22:22:04 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at rzdatahadoop002/192.168.137.201
************************************************************/


Storage directory: /tmp/hadoop-hadoop/dfs/name
1. Which setting determines this default storage path?
2. What does "hadoop-hadoop" mean?
core-site.xml:
    hadoop.tmp.dir = /tmp/hadoop-${user.name}  --> ${user.name} expands to "hadoop", hence hadoop-hadoop
hdfs-site.xml:
    dfs.namenode.name.dir = file://${hadoop.tmp.dir}/dfs/name
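
Note that /tmp is usually purged on reboot, so for anything beyond a throwaway sandbox it is worth relocating hadoop.tmp.dir. A minimal sketch (the target path is an assumption), followed by a re-format:

[hadoop@hadoop02 hadoop]$ mkdir -p /home/hadoop/tmp
# then add to core-site.xml:
#     <property>
#             <name>hadoop.tmp.dir</name>
#             <value>/home/hadoop/tmp</value>
#     </property>
# and re-run: bin/hdfs namenode -format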


8. Start the HDFS service
[hadoop@hadoop02 sbin]$ ./start-dfs.sh
Starting namenodes on [localhost]
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 9a:ea:f5:06:bf:de:ca:82:66:51:81:fe:bf:8a:62:36.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
localhost: Error: JAVA_HOME is not set and could not be found.
localhost: Error: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is 9a:ea:f5:06:bf:de:ca:82:66:51:81:fe:bf:8a:62:36.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: Error: JAVA_HOME is not set and could not be found.
[hadoop@hadoop02 sbin]$ ps -ef|grep hadoop
root     11292 11085  0 21:59 pts/1    00:00:00 su - hadoop
hadoop   11293 11292  0 21:59 pts/1    00:00:00 -bash
hadoop   11822 11293  0 22:34 pts/1    00:00:00 ps -ef
hadoop   11823 11293  0 22:34 pts/1    00:00:00 grep hadoop
[hadoop@rzdatahadoop002 sbin]$ echo $JAVA_HOME
/usr/java/jdk1.8.0_45
So JAVA_HOME is set in the current shell, yet HDFS still fails to start. The reason: start-dfs.sh launches each daemon over SSH in a non-interactive shell, which does not inherit the login shell's environment, so JAVA_HOME has to be set explicitly in hadoop-env.sh:




[hadoop@hadoop02 sbin]$ vi ../etc/hadoop/hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_45


[hadoop@hadoop02 sbin]$ ./start-dfs.sh 
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-rzdatahadoop002.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-rzdatahadoop002.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-rzdatahadoop002.out


namenode (name node):      localhost
datanode (data node):      localhost
secondary namenode:        0.0.0.0


Web UI: http://localhost:50070/
Default port: 50070
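
A quick sanity check that all three daemons are up (jps ships with the JDK):

[hadoop@hadoop02 sbin]$ jps
# expect NameNode, DataNode and SecondaryNameNode in the output, plus Jps itself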

9. Using the commands (hadoop, hdfs)
[hadoop@hadoop02 bin]$ ./hdfs dfs -mkdir /user
[hadoop@hadoop02 bin]$ ./hdfs dfs -mkdir /user/hadoop


[hadoop@hadoop02 bin]$ echo "123456" > rz.log
[hadoop@hadoop02 bin]$ ./hadoop fs -put rz.log hdfs://localhost:9000/
[hadoop@hadoop02 bin]$ 
[hadoop@hadoop02 bin]$ ./hadoop fs -ls hdfs://localhost:9000/
Found 2 items
-rw-r--r--   1 hadoop supergroup          7 2017-12-13 22:56 hdfs://localhost:9000/rz.log
drwxr-xr-x   - hadoop supergroup          0 2017-12-13 22:55 hdfs://localhost:9000/user


[hadoop@hadoop02 bin]$ ./hadoop fs -ls /
Found 2 items
-rw-r--r--   1 hadoop supergroup          7 2017-12-13 22:56 hdfs://localhost:9000/rz.log
drwxr-xr-x   - hadoop supergroup          0 2017-12-13 22:55 hdfs://localhost:9000/user
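
A few more everyday operations on the same file (standard hadoop fs sub-commands):

[hadoop@hadoop02 bin]$ ./hadoop fs -cat /rz.log         # print the file's contents
[hadoop@hadoop02 bin]$ ./hadoop fs -get /rz.log /tmp/   # copy it back to the local filesystem
[hadoop@hadoop02 bin]$ ./hadoop fs -rm /rz.log          # delete it from HDFS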


10. Change hdfs://localhost:9000 to hdfs://192.168.137.201:9000
[hadoop@hadoop02 bin]$ ../sbin/stop-dfs.sh 


[hadoop@hadoop02 bin]$ vi ../etc/hadoop/core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>


<configuration>
    <property>
            <name>fs.defaultFS</name>
            <value>hdfs://192.168.137.201:9000</value>
    </property>
</configuration>




[hadoop@hadoop02 bin]$ ./hdfs namenode -format
[hadoop@hadoop02 bin]$ ../sbin/start-dfs.sh 
Starting namenodes on [hadoop002]
rzdatahadoop002: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-rzdatahadoop002.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-rzdatahadoop002.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-rzdatahadoop002.out


[hadoop@hadoop02 bin]$ netstat -nlp|grep 9000
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.137.201:9000        0.0.0.0:*                   LISTEN      14974/java          
[hadoop@hadoop02 bin]$ 
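
A quick way to confirm clients now resolve the new address (hdfs getconf is part of the standard CLI):

[hadoop@hadoop02 bin]$ ./hdfs getconf -confKey fs.defaultFS
# prints hdfs://192.168.137.201:9000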



11. Make all HDFS daemons start under the hostname hadoop02
Current bindings:
namenode:          hadoop02
datanode:          localhost
secondarynamenode: 0.0.0.0


For the datanode, edit slaves:
[hadoop@hadoop002 hadoop]$ vi slaves
hadoop02


For the secondarynamenode, edit hdfs-site.xml:
[hadoop@hadoop02 hadoop]$ vi hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
            <name>dfs.replication</name>
            <value>1</value>
    </property>


        <property>
                 <name>dfs.namenode.secondary.http-address</name>
                 <value>rzdatahadoop002:50090</value>
        </property>
        <property>
                 <name>dfs.namenode.secondary.https-address</name>
                 <value>rzdatahadoop002:50091</value>
        </property>


"hdfs-site.xml" 35L, 1173C written 


[hadoop@hadoop02 hadoop]$ cd ../../sbin
[hadoop@hadoop02 sbin]$ ./stop-dfs.sh
[hadoop@hadoop02 sbin]$ ./start-dfs.sh 
Starting namenodes on [hadoop02]
hadoop02: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-rzdatahadoop002.out
hadoop02: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-rzdatahadoop002.out
Starting secondary namenodes [rzdatahadoop002]
hadoop02: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-rzdatahadoop002.out

Addendum:
Suppose a service's data directory /a/dfs/data sits on disk A (500 GB) with only 10 GB left, and a 2 TB disk B is added. The migration (see the sketch below):
1. On disk A: mv /a/dfs /b/
2. On disk B: ln -s /b/dfs /a
3. Check (and fix) the owner and group of the directories on both disks
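
A minimal sketch of the migration (paths as in the note; stop the service using the directory first):

sbin/stop-dfs.sh                  # stop the daemons writing to /a/dfs
mv /a/dfs /b/                     # move the data onto the larger disk
ln -s /b/dfs /a/dfs               # keep the old path valid via a symlink
chown -R hadoop:hadoop /b/dfs     # step 3: fix ownership at the new location
sbin/start-dfs.sh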








Source: http://blog.itpub.net/31496956/viewspace-2148956/