Building a Simple Hadoop Cluster with Docker in IDEA on the Deepin OS (Part 1)



    **This article uses detailed descriptions and code to show how to configure, on Deepin, a simple fully distributed Hadoop cluster consisting of one master node and two slave (worker) nodes, along with some basic pitfalls to avoid. For length reasons, the detailed contents of some configuration files will later be replaced with hyperlinks to the files; none of the files require any download points. They are provided only as samples, but can also be used directly.**
  
  Link to part two: [https://blog.youkuaiyun.com/welson650/article/details/105289339](https://blog.youkuaiyun.com/welson650/article/details/105289339)

I. Downloading and Configuring Docker

1. If an older version of Docker has been installed before, uninstall it first:

$ sudo apt-get remove docker docker-engine docker.io containerd runc

2. Set up the repository
(1) Update the apt package index:

$ sudo apt-get update

(2) Install the packages needed for HTTPS support:
This is mainly for the curl command in the next step; the curl installed by default may not support the encrypted HTTPS protocol:

$ sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

(3) Add Docker's official GPG key:

$ curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -

Note: if curl still reports here that it does not support the HTTPS protocol (something I ran into during installation), run the following command:

$ sudo apt-get install curl libcurl3 libcurl4-openssl-dev

Once these packages finish installing, curl supports HTTPS.
You can run $ curl -V to see which protocols the current curl supports.
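For example, a quick check that https now appears in the supported list (grep simply filters out the Protocols line):

$ curl -V | grep -i protocols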
Then re-run $ curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add - and the key will be added.
You can now verify the key with $ sudo apt-key fingerprint 0EBFCD88:

welson@welson-PC:~/Desktop$ sudo apt-key fingerprint 0EBFCD88
[sudo] password for welson:
pub   rsa4096 2017-02-22 [SCEA]
      9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88
uid           [ unknown] Docker Release (CE deb) <docker@docker.com>
sub   rsa4096 2017-02-22 [S]

(4) Install Docker Engine - Community
i. Update the apt package index again:

$ sudo apt-get update

ii. Install the latest version of Docker Engine - Community and containerd:

$ sudo apt-get install docker-ce docker-ce-cli containerd.io

iii. At this point, typing docker should print the Docker CLI usage information; if it does, the installation succeeded.
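A version check works just as well as a quick sanity test, for example:

$ docker --version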
(5) Add your own non-root user to the docker group:

$ sudo gpasswd -a username docker   # replace "username" with your own non-root user name

(6) Restart the docker service:

$ sudo service docker restart

(7) Once the service has restarted, run sudo docker ps -a to verify; if the output looks like the following, Docker is configured successfully:

welson@welson-PC:~/Desktop$ sudo docker ps -a
[sudo] password for welson:
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

II. Downloading and Installing docker-compose

$ sudo curl -L --fail https://github.com/docker/compose/releases/download/1.25.4/run.sh -o /usr/local/bin/docker-compose
$ sudo chmod +x /usr/local/bin/docker-compose

Run docker-compose -v to check the docker-compose version number; if it matches the version you downloaded, the installation succeeded.
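For example (the version string printed should correspond to the 1.25.4 release downloaded above):

$ docker-compose -v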

III. Installing Portainer, a Graphical Management Tool for Docker

$ docker volume create portainer_data
$ docker run -d -p 8000:8000 -p 9000:9000 --name=portainer --restart=always -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer
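The run command above maps Portainer's web UI to port 9000, so once the container is up it should be reachable at http://localhost:9000. A quick check that the container is running (the name matches the --name flag used above):

$ docker ps --filter name=portainer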

IV. Using docker Without sudo

$ sudo gpasswd -a ${USER} docker
$ sudo service docker restart
$ newgrp - docker
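To confirm that sudo is no longer required, the same check from step (7) in section I should now work without it:

$ docker ps -a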

V. Working in IDEA

1. Create any empty project, then create a new directory under the project directory; here it is named day2.
2. Go into the day2 directory and create a hadoop directory:
(1) Open the Terminal panel at the bottom left of IDEA.
(2) If the terminal is already inside the day2 directory, run the following commands to create the hadoop folder and an empty Dockerfile; if not, cd into day2 first:

$ mkdir hadoop
$ cd hadoop
$ touch Dockerfile

3. Write the Dockerfile. A sample is provided here:

FROM ubuntu:18.04
# Author information; change to your own details or remove this line entirely
MAINTAINER author<author@email.com>

# Generate the zh_CN.UTF-8 locale
RUN apt-get update && apt-get install -y locales && rm -rf /var/lib/apt/lists/* \
    && localedef -i zh_CN -c -f UTF-8 -A /usr/share/locale/locale.alias zh_CN.UTF-8

ENV LANG zh_CN.UTF-8

# Install the SSH server and OpenJDK 8
RUN apt-get update -o Acquire::By-Hash=yes && apt-get install -y openssh-server openjdk-8-jdk

# ADD automatically extracts the tarball into /usr/local/hadoop-2.7.5
ADD hadoop-2.7.5.tar.gz /usr/local/

ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
ENV HADOOP_HOME=/usr/local/hadoop-2.7.5
ENV PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
ENV HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

# Passwordless SSH between the cluster nodes
RUN mkdir -p ~/.ssh && \
    ssh-keygen -t rsa -f ~/.ssh/id_rsa -P '' && \
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# Directories for HDFS data and logs
RUN mkdir -p /data/hdfs/namenode && \
    mkdir -p /data/hdfs/datanode && \
    mkdir -p /data/logs

# Copy the Hadoop configuration files prepared in step 5
COPY config/* $HADOOP_CONF_DIR/

4. Download the Hadoop tarball (hadoop-2.7.5.tar.gz, to match the Dockerfile above) and place it in the newly created hadoop folder.
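For reference, once step 5 below is done, the hadoop directory should look roughly like this (the tarball name follows the Dockerfile above; the remaining configuration files are added in part two):

day2/
└── hadoop/
    ├── Dockerfile
    ├── hadoop-2.7.5.tar.gz
    └── config/
        ├── core-site.xml
        ├── hadoop-env.sh
        ├── hdfs-site.xml
        ├── log4j.properties
        └── ...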

5. Using the commands above, or simply by right-clicking, create a config folder inside the hadoop folder and add/write the configuration files (there are 9 configuration files in total; the remaining 5 will appear in the next update):
(1) core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-master:9000/</value>
    </property>
</configuration>

(2) hadoop-env.sh
hadoop-env.sh (0-point download)
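Since the file is only linked above, here is a minimal sketch of the one line that usually needs adjusting, reusing the JAVA_HOME path already set in the Dockerfile (the downloadable file contains more settings than this):

# hadoop-env.sh (minimal sketch, not the full file)
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64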
(3) hdfs-site.xml. The value of dfs.replication here should match the number of your slave nodes.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///data/hdfs/namenode</value>
        <description>NameNode directory for namespace and transaction logs storage.</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///data/hdfs/datanode</value>
        <description>DataNode directory</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>

(4) log4j.properties
Link to the log4j.properties file (0-point download)
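Once the tarball and configuration files are all in place (including the five covered in part two), the image described by the Dockerfile above can be built from inside the hadoop directory; the image name here is only an example:

$ docker build -t hadoop-base:2.7.5 .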
To keep this post from getting too long, I will stop here for now; the hyperlink to the next part will be added after the next update.
https://blog.youkuaiyun.com/welson650/article/details/105289339
