A First Look at Hadoop: Notes from a First Attempt at Building a Hadoop Big Data Platform

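This post records the author's first attempt at building a Hadoop big data platform, including the problems encountered and how they were solved: hostname setup, ZooKeeper cluster configuration, HDFS installation and configuration, and so on. After a week of effort and a few setbacks, the author came away with a better understanding of Hadoop's overall architecture and learned how to set up passwordless login with a public/private key pair.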


Preface

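I had just joined a new company when a senior colleague with ten years in operations handed me a task: a document, four virtual machines, and instructions to build a Hadoop platform. I felt the pressure right away, but at least there was a document, so I gritted my teeth and took it on. What followed was a week of building, rebuilding, and troubleshooting, over and over. As of today I can finally upload files normally. Some problems remain, but I want to write things down first.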

Rough architecture overview:

Problems encountered

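These are the common problems I ran into and the mistakes I made during installation. I am writing them down as a warning to myself to watch for them on the next build.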

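1: Host names. On the first build I only mapped each IP to its domain name in /etc/hosts and left out the host name the machine identifies itself with on the network, which caused trouble later. On the second build I wrote it like this: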

[bigdata@namenode01 ~]$  cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.22.3.110 namenode1-sit.qq.com namenode01
10.22.3.111 namenode2-sit.qq.com namenode02
10.22.4.110 datanode1-sit.qq.com datanode01
10.22.4.111 datanode2-sit.qq.com datanode02

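2: Permissions. When uploading the software packages I paid no attention to which user owned what, and ownership ended up wrong almost everywhere. With permissions that messy across all four machines, the services had no chance of starting. So remember: whatever lives in a user's home directory should be owned by that user.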

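3: For the three-node ZooKeeper cluster, make sure the myid set at the end is correct on every node. If the same number is accidentally written on two nodes, things get awkward: the ZooKeeper configuration files are so short that it is tempting to skim them, and the mistake is then hard to spot when checking.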

4: The user-defined name deserves several passes through hdfs-site.xml, because it appears in many places in that file and every occurrence has to match.

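5: If formatting more than once leaves the cluster ID (cid) mismatched, edit the cid and restart. The cid is in the current/VERSION file under the storage directories set in the HDFS configuration.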

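6: Block pool ID mismatch. I really do not know how to deal with this one. The datanode log kept reporting the error, which was painful. I will come back to it once I know more.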
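Items 3 and 5 above can be checked mechanically instead of by eye. The helpers below are a minimal sketch, not part of ZooKeeper or Hadoop: `duplicate_myids`, `cluster_id`, and `same_cluster_id` are hypothetical names, and the sketch assumes the myid and current/VERSION files from each node have already been copied to one place.

```shell
#!/bin/sh
# duplicate_myids FILE...
# Each FILE is a copy of one node's myid file (a single integer).
# Print every id that occurs in more than one file; print nothing if all differ.
duplicate_myids() {
    for f in "$@"; do
        tr -d '[:space:]' < "$f"   # drop stray spaces and newlines
        echo
    done | sort | uniq -d
}

# cluster_id FILE
# Print the value of the clusterID line in a current/VERSION file.
cluster_id() {
    sed -n 's/^clusterID=//p' "$1"
}

# same_cluster_id FILE_A FILE_B
# Succeed only if both files carry the same, non-empty clusterID.
same_cluster_id() {
    a=$(cluster_id "$1")
    b=$(cluster_id "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, with the three myid files collected as myid.zk1, myid.zk2, and myid.zk3 (hypothetical names), `duplicate_myids myid.zk*` should print nothing, and `same_cluster_id` can compare a namenode's VERSION file against a datanode's before restarting.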

Takeaways

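Although the week was busy (I admit I spent a good part of it fooling around), I did learn a few things. My senior colleague told me that for passwordless login, a single public/private key pair can be used across multiple accounts. I read up on it, generated one key pair, and uploaded it to the ~/.ssh directory of each user (mind the permissions on these files: 600, otherwise they will not work). The machine you log in from must keep the private key. On the machine you log in to, upload the public key and rename it to authorized_keys; if that file already exists, append to it with cat id_rsa.pub >> authorized_keys. This makes both troubleshooting and administration much easier. The biggest gain, though, was an understanding of Hadoop's overall architecture. Enough talk; on to the real work.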
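The "append if the file already exists" step and the permission fix can be folded into one small helper. This is a minimal sketch under the assumptions stated in the comments; `install_pubkey` is a hypothetical name, and the function only touches the directory passed to it.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE SSH_DIR
# Append the public key to SSH_DIR/authorized_keys unless that exact line is
# already present, then tighten permissions on the directory and the file.
# Assumes PUBKEY_FILE holds a single key on one line.
install_pubkey() {
    mkdir -p "$2" || return 1
    touch "$2/authorized_keys" || return 1
    if ! grep -qxF -- "$(cat "$1")" "$2/authorized_keys"; then
        cat "$1" >> "$2/authorized_keys"   # ">>" appends; ">" would overwrite
    fi
    chmod 700 "$2"
    chmod 600 "$2/authorized_keys"
}

# Example: install_pubkey ~/id_rsa.pub ~/.ssh
```

Running it twice with the same key leaves a single line in authorized_keys, so it is safe to repeat on every node.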

Setup steps

1.1 First update the hosts file on every node, as shown at the beginning:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.22.3.110 namenode1-sit.qq.com namenode01
10.22.3.111 namenode2-sit.qq.com namenode02
10.22.4.110 datanode1-sit.qq.com datanode01
10.22.4.111 datanode2-sit.qq.com datanode02
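To confirm that a name maps to the same address on every node, the hosts file can be queried directly. The helper below is a minimal sketch that assumes the layout used above (IP, then FQDN, then short host name); `hosts_lookup` is a hypothetical name, not a standard tool.

```shell
#!/bin/sh
# hosts_lookup FILE NAME
# Print the IP address of every non-comment line in FILE that lists NAME
# as one of its host names. Prints nothing when NAME is not found.
hosts_lookup() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }    # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i == name) { print $1; break }
            }
        }
    ' "$1"
}

# Example: both forms of the name should print the same address.
# hosts_lookup /etc/hosts namenode01
# hosts_lookup /etc/hosts namenode1-sit.qq.com
```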

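1.2 Many documents, including the material my senior colleague gave me, change the character set from English to Chinese, that is, LANG from "en_US.UTF-8" to "zh_CN.UTF-8". I did not see the point, so I left it alone; whether to change it is up to you.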

cat /etc/sysconfig/i18n
LANG="en_US.UTF-8"
SYSFONT="latarcyrheb-sun16"

1.3 Disable the firewall and SELinux. The reasons are well known, so I will not go into them.

vi /etc/selinux/config   
Change to:
SELINUX=disabled

chkconfig iptables off
service iptables stop
service iptables status
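When the same change has to be made on four machines, editing the file in vi each time gets tedious. The sketch below rewrites the SELINUX= line non-interactively; `set_selinux_disabled` is a hypothetical helper, and it assumes the file already contains a line starting with SELINUX=. The change in the config file takes effect after a reboot.

```shell
#!/bin/sh
# set_selinux_disabled FILE
# Rewrite the "SELINUX=..." line in FILE to "SELINUX=disabled".
# The SELINUXTYPE= line is left alone because the pattern requires "SELINUX=".
set_selinux_disabled() {
    tmp=$(mktemp) || return 1
    status=1
    if sed 's/^SELINUX=.*/SELINUX=disabled/' "$1" > "$tmp" && cat "$tmp" > "$1"; then
        status=0    # written back in place, so the file keeps its mode and owner
    fi
    rm -f "$tmp"
    return $status
}

# Example (as root): set_selinux_disabled /etc/selinux/config
```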

1.4 Disable abrtd. To be honest I am not sure what this service is for, but it is better to turn it off.

chkconfig abrtd off
service abrtd stop
service abrtd status

1.6 Disable core dumps. First check whether they are enabled; if they are not, there is nothing to do.

ulimit -c  # an output of 0 means core dumps are disabled

ulimit -S -c 0 # disable

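1.5 Raise the resource limits on all four machines. A test environment will not see that much traffic or resource usage, so this could be skipped, but production is no place to be careless, so do it anyway.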

1.5.1 Edit /etc/security/limits.conf as shown below

[bigdata@datanode02 ~]$ cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#@student        -       maxlogins       4
#* soft nofile 10240
#* hard nofile 10240
#* soft nproc 11000
#* hard nproc 11000
# End of file
*       -    nproc    20480
*       -    nofile    32768

1.5.2 Edit /etc/security/limits.d/90-nproc.conf as shown below

[bigdata@datanode02 ~]$ cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
*       -    nproc    20480
*       -    nofile    32768
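The new limits only apply to sessions started after the change, so it is worth checking them from a fresh login. The sketch below compares what `ulimit` reports against the values set above; `limit_at_least` is a hypothetical helper. Only the open-file limit is checked in the example because `ulimit -n` is widely supported, while the process-count flag (`ulimit -u` in bash) varies between shells.

```shell
#!/bin/sh
# limit_at_least ACTUAL WANTED
# Succeed if ACTUAL (as printed by ulimit) is "unlimited" or >= WANTED.
limit_at_least() {
    [ "$1" = "unlimited" ] && return 0
    [ "$1" -ge "$2" ]
}

# 32768 is the nofile value from the limits files above.
if limit_at_least "$(ulimit -n)" 32768; then
    echo "nofile ok"
else
    echo "nofile too low: $(ulimit -n)"
fi
```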

1.6 Edit the kernel parameters in /etc/sysctl.conf as follows

[bigdata@datanode02 ~]$ cat /etc/sysctl.conf
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# Controls the default maxmimum size of a mesage queue
kernel.msgmnb = 65536

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

net.ipv4.ip_local_reserved_ports = 2181,2888,3772-3773,3888,6627,7000,8000,8021,8030-8033,8088-8089,8360,9000,9010-9011,9090,9160,9999,10009,10101-10104,11469,21469,24464,50010,50020,50030,50060,50070,50075,50090,60000,60010,60020,60030
net.ipv4.ip_local_port_range = 10000 65000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.core.somaxconn = 32768
vm.swappiness = 0
vm.overcommit_memory = 1
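The long net.ipv4.ip_local_reserved_ports line keeps the kernel from handing those ports out as ephemeral source ports, which matters here because the local port range starts at 10000 and several service ports sit above that. The list is easy to get wrong, so the sketch below tests whether a given port is covered by such a list; `port_is_reserved` is a hypothetical helper that only parses the comma-and-range syntax.

```shell
#!/bin/sh
# port_is_reserved SPEC PORT
# SPEC uses the ip_local_reserved_ports syntax: comma-separated ports and
# inclusive ranges such as "8030-8033". Succeed if PORT is covered by SPEC.
port_is_reserved() {
    echo "$1" | awk -v port="$2" -F, '{
        for (i = 1; i <= NF; i++) {
            n = split($i, r, "-")
            lo = r[1] + 0
            hi = (n == 2 ? r[2] : r[1]) + 0
            if (port + 0 >= lo && port + 0 <= hi) { found = 1 }
        }
    } END { exit(found ? 0 : 1) }'
}

# Example on a live system:
# port_is_reserved "$(cat /proc/sys/net/ipv4/ip_local_reserved_ports)" 50070 && echo reserved
```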

1.7 Memory tuning

echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
echo 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag' >> /etc/rc.local
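To verify the setting after a reboot, read the same file back: the kernel lists the available options and marks the active one with square brackets. The sketch below extracts that bracketed value; `thp_active` is a hypothetical helper, and the exact set of options shown depends on the kernel version.

```shell
#!/bin/sh
# thp_active VALUE
# VALUE is the content of a transparent hugepage control file, for example
# "always madvise [never]". Print the option enclosed in square brackets.
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Example on a node configured as above:
# thp_active "$(cat /sys/kernel/mm/redhat_transparent_hugepage/defrag)"
```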

1.8 Install the required packages

yum install gcc python perl php make   smartmontools  iotop -y

That completes the system tuning on each node. (Note: every step above must be done on every node, otherwise there will be problems.)

2 Preparation

Software to prepare: JDK 1.7 (downloadable from the Oracle website), zookeeper-3.4.9.tar.gz, and hadoop-2.7.3.tar.gz. All of these can be downloaded from their official websites.

2.1 Create users

User	Group	Hosts	Purpose
zookeeper (501)	zookeeper (501)	zk1, zk2, zk3	ZK
bigdata (500)	bigdata (500)	all	HDFS (NN, DN, JN)
yarn (600)	yarn (600)	all	YARN (RM, NM)

The corresponding command creates the group and then the user:

# groupadd bigdata -g 500 && useradd -g bigdata -u 500 bigdata
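The same command has to be repeated for each row of the table. The sketch below prints the commands instead of running them, so they can be reviewed and then pasted into a root shell on the right hosts; `print_user_cmds` is a hypothetical helper, and it assumes, as in the table, that uid and gid are the same number.

```shell
#!/bin/sh
# print_user_cmds
# Read "name id" pairs on stdin and print the matching groupadd/useradd
# command for each pair, using the same number for the uid and the gid.
print_user_cmds() {
    while read -r name id; do
        [ -n "$name" ] || continue    # skip blank lines
        echo "groupadd -g $id $name && useradd -g $name -u $id $name"
    done
}

# The three accounts from the table above.
print_user_cmds <<'EOF'
zookeeper 501
bigdata 500
yarn 600
EOF
```

According to the table, the zookeeper account is only needed on the three ZooKeeper hosts, while bigdata and yarn are created on all four machines.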

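2.2 Set up passwordless login for the bigdata, zookeeper, and yarn users. The core requirement is that the two namenodes can log in to each other and that each namenode can log in to the datanodes. A single key pair is used here: configuring each machine individually would be tedious, so the key pair is copied to every machine instead.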

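$ ssh-keygen # generate the key pair; press Enter at every prompt
$ cd ~/.ssh/   # go to where the keys are stored and look at them
$ ls
id_rsa  id_rsa.pub  known_hosts # id_rsa is the private key, id_rsa.pub is the public key
$ cp -rp id_rsa.pub authorized_keys # copy the public key to the file that holds authorized public keys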
$ ll 
total 16
-rw-r--r-- 1 bigdata bigdata  402 Apr 24 15:08 authorized_keys
-rw------- 1 bigdata bigdata 1675 Apr 24 14:30 id_rsa
-rw-r--r-- 1 bigdata bigdata  402 Apr 24 14:30 id_rsa.pub
-rw-r--r-- 1 bigdata bigdata 1997 Apr 24 16:07 known_hosts
$ ll -d
drwx------ 2 bigdata bigdata 4096 Apr 24 15:09 .  
$  cd ..
$ tar zcvf ss.tar.gz .ssh/  # pack the .ssh directory
$ scp ss.tar.gz  namenode02: # copy the archive to the home directory on namenode02

On namenode02, extract the archive and check that the permissions are correct
[bigdata@namenode02 opt]$ tar zxvf ss.tar.gz 
.ssh/
.ssh/id_rsa
.ssh/known_hosts
.ssh/id_rsa.pub
.ssh/authorized_keys
[bigdata@namenode02 opt]$ cd .ssh/
[bigdata@namenode02 .ssh]$ ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
[bigdata@namenode02 .ssh]$ ll