Installing Hadoop, Hive, Spark, and HBase with Docker

This article walks through building a big-data cluster with Docker, containing Hadoop, HBase, Spark, and Hive. It covers the key steps: network setup, software installation, environment variables, and the configuration of each component.
0: Network and host planning
docker network create --subnet=172.18.0.0/16 mynetwork
Host plan (these entries will later go into /etc/hosts):

172.18.0.30 master
172.18.0.31 slave1
172.18.0.32 slave2
   
1: Install the base environment


docker pull ubuntu:16.04
docker run -it  ubuntu:16.04 /bin/bash
Inside the container, use apt-get to install the SSH service, MySQL, and OpenJDK 8.
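A minimal sketch of that installation (package names assumed for Ubuntu 16.04):

apt-get update
apt-get install -y openssh-server mysql-server openjdk-8-jdk wget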
Determine JAVA_HOME by following the java symlinks:
root@master:/# ls -lrt /usr/bin/java
lrwxrwxrwx 1 root root 22 Jun 23 08:28 /usr/bin/java -> /etc/alternatives/java
root@master:/# ls -lrt /etc/alternatives/java
lrwxrwxrwx 1 root root 46 Jun 23 08:28 /etc/alternatives/java -> /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
root@master:/# 


So JAVA_HOME is /usr/lib/jvm/java-8-openjdk-amd64.


2: Download the big-data packages
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
wget http://archive.apache.org/dist/hive/hive-2.1.0/apache-hive-2.1.0-bin.tar.gz
wget http://archive.apache.org/dist/hbase/1.2.4/hbase-1.2.4-bin.tar.gz
wget  http://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz
wget http://downloads.lightbend.com/scala/2.12.1/scala-2.12.1.tgz


Extract the five archives into the /opt/tools directory.
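A sketch of the extraction, assuming the archives were downloaded to the current directory:

mkdir -p /opt/tools
tar -xzf hadoop-2.7.2.tar.gz -C /opt/tools
tar -xzf apache-hive-2.1.0-bin.tar.gz -C /opt/tools
tar -xzf hbase-1.2.4-bin.tar.gz -C /opt/tools
tar -xzf spark-2.1.0-bin-hadoop2.7.tgz -C /opt/tools
tar -xzf scala-2.12.1.tgz -C /opt/tools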
Then create five symlinks in /opt/tools:
ln -s hbase-1.2.4 hbase
ln -s hadoop-2.7.2 hadoop
ln -s apache-hive-2.1.0-bin hive
ln -s spark-2.1.0-bin-hadoop2.7 spark
ln -s scala-2.12.1 scala
  
Edit the environment variable file (e.g. /root/.bashrc), appending at the end:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_PREFIX=/opt/tools/hadoop
export HADOOP_COMMON_HOME=/opt/tools/hadoop
export HADOOP_HDFS_HOME=/opt/tools/hadoop
export HADOOP_MAPRED_HOME=/opt/tools/hadoop
export HADOOP_YARN_HOME=/opt/tools/hadoop
export HADOOP_CONF_DIR=/opt/tools/hadoop/etc/hadoop
export YARN_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export SCALA_HOME=/opt/tools/scala
export PATH=${SCALA_HOME}/bin:$PATH
export SPARK_HOME=/opt/tools/spark
export PATH="$SPARK_HOME/bin:$PATH"


export HIVE_HOME=/opt/tools/hive
export PATH=$PATH:$HIVE_HOME/bin


export HBASE_HOME=/opt/tools/hbase
export PATH=$PATH:$HBASE_HOME/bin
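After appending these lines, reload the environment so they take effect in the current shell (assuming the file edited was /root/.bashrc):

source /root/.bashrc

Then start the services and add the host entries: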


/etc/init.d/ssh start
/etc/init.d/mysql start
echo "172.18.0.30 master" >> /etc/hosts
echo "172.18.0.31 slave1" >> /etc/hosts
echo "172.18.0.32 slave2" >> /etc/hosts  
  
3: Commit the base big-data image
docker commit 8eb631a1a734 wx-bigdata-base


4: Run the three containers
docker run -i -t --name master -h master --net mynetwork  --ip 172.18.0.30    wx-bigdata-base   /bin/bash
docker run -i -t --name slave1 -h slave1 --net mynetwork  --ip 172.18.0.31    wx-bigdata-base   /bin/bash
docker run -i -t --name slave2 -h slave2 --net mynetwork  --ip 172.18.0.32    wx-bigdata-base   /bin/bash
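Note that docker commit captures the filesystem but not running processes, so the daemons must be started again inside each new container (MySQL is only needed on master, for the Hive metastore):

/etc/init.d/ssh start
/etc/init.d/mysql start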


Set up passwordless SSH login among the three machines.

Method: run ssh-keygen -t rsa and press Enter through every prompt; this generates /root/.ssh/id_rsa.pub.

Do the same on all three machines.

Run cat /root/.ssh/id_rsa.pub on each of the three machines, concatenate the three outputs, and write the result to /root/.ssh/authorized_keys on every machine.
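A sketch of the whole exchange, run on each host (the non-interactive -N/-f flags are an assumption; answering the prompts works just as well, and all_keys.pub is a hypothetical file holding the three concatenated public keys):

ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
cat /root/.ssh/id_rsa.pub
# collect the three public keys into all_keys.pub, then on every host:
cat all_keys.pub > /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys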

The final authorized_keys file on every machine looks like:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDAyxS5rhm5etpm1eOSdBfaVKmRQPMI2TgY3PsUMWe1qo1NQAdNkpObSVN4Gq4HHso7SXLccd5Crb64fGrqYX9+jBVk3uUSQoKn8eoFtmnBU5Zpq7mRvGkctsubMa/EOh7DsjUWplo//p9+txvB45cvjwr8GSeBVPoTSyzRggleuERVVhRzDSXdg/z892JNoHukhGUrhOhtBnVemIV0wUlEoWFiuLJmJBo6Gj1yV7xJ5LDtWJ41XgkosKlKbEp8bc+w0e6NYN5k/DzaDtwfVc6utGE/7/mFs4gpWGzY0wRqP89QRnmlOYGm32v1I8+oXNqAmxfPKiWQdZ89jgZUS5RB root@master
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCbHbwIO5zzzNBJX25rbIdUI0+fqA3YJIhcgqbY2cQxSfa1dK20Uy/JD3ZTlffajEJ20qrs3yDpzfRHP8E+0dyPET3CV7I7onzCy8eBOQSaYBqtWXiEvwzE8iOD4aJJ4ZA3G8dhE8jlSFphO62PoqblEpIfWgFS1WkLEmNMrqgyEUCwiwzxySs6StBQF1vQ4TT2rcG5+qXWOuKjeOjscekstA2DrYNBY8zOEP/kNF4tUPf7mf2uiMJCHg+keXP9b0aCDMvVqakMx4PJW36NYISQiKf6yvSt1RFTGY+SYMG2d4Ysx58iNTrk7ber2qwDBghgtcJhr2VvZbLC9xv2w4WN root@slave1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDg8FVvLhkeT1/xMA/fTbzk9k0cf+5AX514z9Pw8A78ofWDir65eMJBEqLTSX87ynTvtg2BEN4Ht+SlS7ZUrzW3wbUPZw9T045GbiFzSRdwzCAuyXUWAFa+pY3Pi4MJhL1zjwkfX8WzRlUM+a5PSJ+B3i/JnoKMUin0HmjQ1XxIwMeG66b7pxXRAs/9SVY7k+f0zACJzTBN3eD9tKEpujrJmjlOYLg4M17NssGNK9vE5nAkCCv86GCRixyS8FNAxh0a8GsezUjimT1XRWokw9FSZdDuAamVCREZ3j6LuveCx58XzoM8UQ6u4KtObeWOPbJCotxyKR5SdFEgsSjrOJYP root@slave2 


5: Install and run
Edit the hadoop, hbase, spark, and hive configuration files on master (section 6 below lists them).
Copy the configuration files to the other two hosts with scp (a sketch follows this list).
Format the HDFS namespace on master (hdfs namenode -format).
Start hadoop, hbase, and spark on master, then check that the daemons came up on the other two machines.
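A sketch of the scp distribution step (the paths follow the layout used above):

for h in slave1 slave2; do
  scp -r /opt/tools/hadoop/etc/hadoop/ $h:/opt/tools/hadoop/etc/
  scp -r /opt/tools/hbase/conf/ $h:/opt/tools/hbase/
  scp -r /opt/tools/spark/conf/ $h:/opt/tools/spark/
  scp -r /opt/tools/hive/conf/ $h:/opt/tools/hive/
done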

Install Hive on master: schematool -dbType mysql -initSchema (the MySQL JDBC driver must already be in /opt/tools/hive/lib; see the note at the end of section 6.4).
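To check that everything is up, run jps on each node; with this configuration the process list typically looks like the following (names derived from the components configured above, not captured output):

jps
# master: NameNode, SecondaryNameNode, ResourceManager, HMaster, HQuorumPeer, Master, Worker
# slave1/slave2: DataNode, NodeManager, HRegionServer, HQuorumPeer, Worker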

To start hadoop:

/opt/tools/hadoop/sbin/start-all.sh

To stop:

/opt/tools/hadoop/sbin/stop-all.sh

To start hbase:

/opt/tools/hbase/bin/start-hbase.sh

To stop:

/opt/tools/hbase/bin/stop-hbase.sh

To start spark:

/opt/tools/spark/sbin/start-all.sh

To stop:

/opt/tools/spark/sbin/stop-all.sh
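Putting these together, a start-everything sequence on master (HDFS/YARN first, then HBase, then Spark; stop in the reverse order):

/opt/tools/hadoop/sbin/start-all.sh
/opt/tools/hbase/bin/start-hbase.sh
/opt/tools/spark/sbin/start-all.sh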


6: Configuration files

6.1 hadoop

Directory: /opt/tools/hadoop/etc/hadoop

core-site.xml:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/data/hadoop/tmp</value>
  </property>

</configuration>

hdfs-site.xml:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/hadoop/data</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/hadoop/name</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>

</configuration>
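The local paths above are not all created automatically, so it is safest to create them on every node before formatting HDFS:

mkdir -p /data/hadoop/tmp /data/hadoop/name /data/hadoop/data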

mapred-site.xml:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

</configuration>

yarn-site.xml:

<?xml version="1.0"?>
<configuration>


<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>


</configuration>


The slaves file:

slave1
slave2

hadoop-env.sh:

Change JAVA_HOME to: export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

6.2 hbase

Directory: /opt/tools/hbase/conf

hbase-env.sh:

Change JAVA_HOME to: export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

hbase-site.xml:

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase_db</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/tools/hbase/zookeeper</value>
  </property>
</configuration>

The regionservers file:

slave1
slave2

6.3 spark

Directory: /opt/tools/spark/conf

spark-env.sh:

Add at the top of the file:

export SCALA_HOME=/opt/tools/scala
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export SPARK_MASTER_IP=master
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/opt/tools/hadoop/etc/hadoop


The slaves file:

master
slave1
slave2

6.4 hive

Directory: /opt/tools/hive/conf

hive-env.sh:

Add at the top of the file:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/opt/tools/hadoop
export HIVE_HOME=/opt/tools/hive

export HIVE_CONF_DIR=/opt/tools/hive/conf 

hive-site.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/opt/tools/hive/tmp</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/opt/tools/hive/warehouse</value>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/opt/tools/hive/log</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>

</configuration>
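One prerequisite the config above implies but the original steps do not show: the com.mysql.jdbc.Driver class must be on Hive's classpath before schematool can initialize the metastore. A sketch, with the connector jar name and version being an assumption:

cp mysql-connector-java-5.1.40-bin.jar /opt/tools/hive/lib/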

