0. CentOS 7 virtual machine setup
(1) Configure a static IP address:
Set the VM's network mode to NAT.
To verify the network: first ping the NAT gateway IP (192.168.100.2), then ping www.baidu.com; if both succeed, the network is fine.
Once the VM's static IP (192.168.100.100), the gateway between host and VM (192.168.100.2),
and an external site (baidu.com) are all reachable, the static IP is configured correctly.
As root, run:
cd /etc/sysconfig/network-scripts/
vim ifcfg-eno16777736    (edit the ifcfg-* file that matches your NIC name; on some systems it is ifcfg-eth0)
Set it to the following (tip: press "a" to enter insert mode; press Esc, then type ":wq" to save and quit):
TYPE=Ethernet
BOOTPROTO=static    # use a static IP
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
NAME=eno16777736
UUID=4f40dedc-031b-4b72-ad4d-ef4721947439
DEVICE=eno16777736
ONBOOT=yes    # if this is no, change it to yes so the NIC comes up automatically at boot
PEERDNS=yes
PEERROUTES=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_PRIVACY=no
GATEWAY=192.168.100.2    # the NAT gateway address obtained earlier
IPADDR=192.168.100.100   # static IP; pick one per VM from the 192.168.100.xxx range chosen earlier (e.g. .100 for master, .101-.104 for the workers)
NETMASK=255.255.255.0    # subnet mask
DNS1=8.8.8.8             # DNS server 1; any DNS server reachable from your network works
DNS2=223.5.5.5           # DNS server 2
Then restart the network service:
service network restart
Reboot the machine:
reboot
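A quick sanity check after the reboot (the addresses below are the example values used above):
ip addr show                 # the interface should show the static 192.168.100.x address
ping -c 3 192.168.100.2      # gateway reachable
ping -c 3 www.baidu.com      # external name resolution and routing work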
Switching between graphical and text (console) mode:
1. Check the current default target
Command: systemctl get-default
Text mode: multi-user.target
Graphical mode: graphical.target
2. Change it as needed
Set graphical mode: systemctl set-default graphical.target
Set text mode: systemctl set-default multi-user.target
Reboot to verify.
1. Cluster preparation
1) Turn off the firewall (needed for remote connections)
Stop it for now (does not survive a reboot):
systemctl stop firewalld
Disable it permanently:
systemctl disable firewalld
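To confirm the firewall is really off (both commands ship with CentOS 7):
systemctl status firewalld   # should report inactive (dead)
firewall-cmd --state         # should print "not running"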
2) Permanently set the hostname
vi /etc/hostname
Note: a reboot is required for it to take effect -> reboot
3) Configure the hosts mapping file
vi /etc/hosts
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.100.100 master
192.168.100.101 hadoop01
192.168.100.102 hadoop02
192.168.100.103 hadoop03
192.168.100.104 hadoop04
4) Set up passwordless SSH login
Generate a key pair:
ssh-keygen
Copy the public key to every node:
ssh-copy-id master
ssh-copy-id hadoop01
ssh-copy-id hadoop02
ssh-copy-id hadoop03
ssh-copy-id hadoop04
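A quick way to confirm passwordless login works from the master (hostnames as configured above):
for h in master hadoop01 hadoop02 hadoop03 hadoop04; do ssh $h hostname; done
Each hostname should print without a password prompt.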
2. Install the JDK
Remove the OpenJDK packages that ship with CentOS:
rpm -qa | grep java
rpm -e --nodeps java-1.7.0-openjdk-headless java-1.7.0-openjdk
rpm -qa | grep java
rpm -e --nodeps javamail javapackages-tools java-1.8.0-openjdk
rpm -qa | grep java
rpm -e --nodeps java-1.8.0-openjdk-headless
rpm -qa | grep java
1) Upload the tar package
alt+p (SFTP shortcut in the SSH client used to upload the file)
2) Unpack the tar package
tar -zxvf <jdk tarball>.tar.gz -C /usr/local/lib/    # unpack so that /usr/local/lib/jdk1.8.0_181 exists, matching JAVA_HOME below
3) Configure the environment variables
vi /etc/profile
Add the Java environment variables:
# set java environment
export JAVA_HOME=/usr/local/lib/jdk1.8.0_181
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=$PATH:${JAVA_HOME}/bin
4) Copy it to the other machines
scp -r /usr/local/lib/jdk1.8.0_181 hadoop01:/usr/local/lib/
scp -r /usr/local/lib/jdk1.8.0_181 hadoop02:/usr/local/lib/
scp -r /usr/local/lib/jdk1.8.0_181 hadoop03:/usr/local/lib/
scp -r /usr/local/lib/jdk1.8.0_181 hadoop04:/usr/local/lib/
scp -r /etc/profile hadoop01:/etc/
scp -r /etc/profile hadoop02:/etc/
scp -r /etc/profile hadoop03:/etc/
scp -r /etc/profile hadoop04:/etc/
Note: reload the environment variables: source /etc/profile
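To verify the JDK on each machine (paths as set above):
source /etc/profile
java -version      # should report version 1.8.0_181
echo $JAVA_HOME    # should print /usr/local/lib/jdk1.8.0_181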
3. Install Hadoop
(1)tar -zxvf hadoop-2.7.7.tar.gz -C /root/
(2)cd /root/hadoop-2.7.7/etc/hadoop
(3)vim hadoop-env.sh
Change line 25 to: export JAVA_HOME=/usr/local/lib/jdk1.8.0_181/
(4)vim core-site.xml
<configuration>
<!-- 9000 is the RPC communication port -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<!-- directory on the local file system where HDFS blocks and metadata are kept -->
<property>
<name>hadoop.tmp.dir</name>
<value>/root/hadoop-2.7.7/dfs/data/tmp</value>
</property>
</configuration>
(5)vim hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/root/hadoop-2.7.7/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/root/hadoop-2.7.7/dfs/data</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop01:50090</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>1048576</value>
</property>
<!-- block replication factor; the default is 3 -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- whether HDFS permission checking is enabled; the default is true -->
<!-- disabled here -->
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
</configuration>
(6)vim mapred-site.xml
<configuration>
<!-- framework that MR jobs run on -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
<property>
<name>mapreduce.tasktracker.map.tasks.maximum</name>
<value>10</value>
</property>
<property>
<name>mapreduce.tasktracker.reduce.tasks.maximum</name>
<value>4</value>
</property>
</configuration>
(7)vim yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
<!-- auxiliary shuffle service the NodeManager runs for MapReduce jobs -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<!-- hostname of the YARN master (ResourceManager) -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<!-- this one is optional -->
<property>
<name>yarn.nodemanager.hostname</name>
<value>master,hadoop01,hadoop02,hadoop03,hadoop04</value>
</property>
<property>
<name>yarn.nodemanager.webapp.address</name>
<value>master:8042</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>8</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8192</value>
</property>
</configuration>
(8)vim slaves
hadoop01
hadoop02
hadoop03
hadoop04
(9)Edit /etc/profile
# set java environment
export JAVA_HOME=/usr/local/lib/jdk1.8.0_181
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=$PATH:${JAVA_HOME}/bin
# set hadoop environment
export HADOOP_HOME=/root/hadoop-2.7.7
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# set zookeeper environment
export ZOOKEEPER_HOME=/root/zookeeper-3.4.10
export PATH=$PATH:$ZOOKEEPER_HOME/bin
# set kafka environment
export KAFKA_HOME=/root/kafka_2.11-2.1.0/
export PATH=$PATH:$KAFKA_HOME/bin
# set hbase environment
export HBASE_HOME=/root/hbase-2.1.3
export PATH=$PATH:$HBASE_HOME/bin
# set storm environment
export STORM_HOME=/root/storm-1.1.0
export PATH=$PATH:$STORM_HOME/bin
(10)source /etc/profile
(11)cd /root/hadoop-2.7.7/
mkdir dfs
cd dfs/
mkdir name
mkdir data
(Note: do not create a tmp directory under data yourself; it is created automatically.)
(12)Format the NameNode
cd /root/hadoop-2.7.7/bin
hadoop namenode -format
(13)Distribute Hadoop to the other machines
scp -r /root/hadoop-2.7.7 hadoop01:/root/
scp -r /root/hadoop-2.7.7 hadoop02:/root/
scp -r /root/hadoop-2.7.7 hadoop03:/root/
scp -r /root/hadoop-2.7.7 hadoop04:/root/
(14)Distribute the environment variables
scp -r /etc/profile hadoop01:/etc
scp -r /etc/profile hadoop02:/etc
scp -r /etc/profile hadoop03:/etc
scp -r /etc/profile hadoop04:/etc
Note: reload the environment variables on each node: source /etc/profile
(15)Start daemons on a single node:
Start the NameNode: hadoop-daemon.sh start namenode
Start a DataNode: hadoop-daemon.sh start datanode
The NameNode web UI is served on port 50070.
Cluster start commands:
start-dfs.sh
start-yarn.sh
Stop commands:
stop-dfs.sh
stop-yarn.sh
Start both HDFS and YARN at once: start-all.sh
YARN web UI: port 8088
HDFS web UI: port 50070
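A few checks after start-dfs.sh / start-yarn.sh (standard Hadoop commands; the expected process lists assume the role layout used in this guide):
jps                       # master: NameNode, ResourceManager; hadoop01: also SecondaryNameNode; workers: DataNode, NodeManager
hdfs dfsadmin -report     # should list 4 live datanodes
yarn node -list           # should list 4 running nodemanagers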
4. Install ZooKeeper
(1)tar -zxvf zookeeper-3.4.10.tar.gz -C /root/    (the archive unpacks to /root/zookeeper-3.4.10)
(2)cd /root/zookeeper-3.4.10/conf
(3)cp zoo_sample.cfg zoo.cfg
(4)vim zoo.cfg
(5)Change:
dataDir=/root/zookeeper-3.4.10/zkData
###############cluster###############
server.0=master:2888:3888
server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888
server.4=hadoop04:2888:3888
(6)cd /root/zookeeper-3.4.10
(7)mkdir zkData
cd zkData/
touch myid
vim myid
0
(Note: change the id on the other machines to 1, 2, 3, 4 respectively; a sketch that does this over SSH follows step (8) below.)
(8)Copy ZooKeeper to the other machines
scp -r zookeeper-3.4.10/ hadoop01:/root/
scp -r zookeeper-3.4.10/ hadoop02:/root/
scp -r zookeeper-3.4.10/ hadoop03:/root/
scp -r zookeeper-3.4.10/ hadoop04:/root/
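The myid file must differ on every machine after the copy. A minimal sketch that sets them remotely (assumes the passwordless SSH and paths configured above):
i=1
for h in hadoop01 hadoop02 hadoop03 hadoop04; do
    ssh $h "echo $i > /root/zookeeper-3.4.10/zkData/myid"    # hadoop01 gets 1, hadoop02 gets 2, ...
    i=$((i+1))
done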
(9)With the environment variables configured (see /etc/profile above):
Start: zkServer.sh start
Stop: zkServer.sh stop
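To check the ensemble state on each node:
zkServer.sh status    # one node should report Mode: leader, the others Mode: follower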
5. Install Hive
(1)tar -zxvf apache-hive-2.3.4-bin.tar.gz -C /root/ && mv /root/apache-hive-2.3.4-bin /root/hive-2.3.4    (rename so the paths below match)
(2)cd /root/hive-2.3.4/conf
(3)cp hive-env.sh.template hive-env.sh
(4)vim hive-env.sh
Change:
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/root/hadoop-2.7.7
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/root/hive-2.3.4/conf
(5)Edit hive-site.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
<description>print the table head info while select database</description>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
</configuration>
(6)Upload the mysql-connector-java-5.1.39 jar into Hive's lib directory.
(7)Create the directories on HDFS:
hdfs dfs -mkdir /tmp
hdfs dfs -chmod 777 /tmp
hdfs dfs -mkdir -p /user/hive/warehouse/
(8)If Hive 1.x already used MySQL as the metastore and you want to keep using MySQL after upgrading to Hive 2.x, do the following (this deletes the original Hive 1.x metadata; it does not migrate it):
1. Delete the Hive data and warehouse directories on HDFS:
hadoop fs -rm -r -f /tmp/
hadoop fs -rm -r -f /user/hive
2. If MySQL is the metastore, drop Hive's metadata database in MySQL:
mysql -uroot -p
drop database metastore;
Hive 2.x requires the metastore schema in MySQL to be initialized manually:
bin/schematool -dbType mysql -initSchema
(9)Start Hive:
cd /root/hive-2.3.4/
bin/hive
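A minimal smoke test inside the Hive CLI (the database and table names here are just examples):
show databases;
create database if not exists test_db;
use test_db;
create table t1 (id int, name string);
show tables;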
6. Install Flume
(1)tar -zxvf apache-flume-1.9.0-bin.tar.gz -C /root/ && mv /root/apache-flume-1.9.0-bin /root/flume-1.9.0    (rename so the paths below match)
(2)cd /root/flume-1.9.0/conf
(3)cp flume-env.sh.template flume-env.sh
(4)vim flume-env.sh and change:
export JAVA_HOME=/usr/local/lib/jdk1.8.0_181
(5)Done.
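Optionally, a minimal agent to verify the install: a netcat source feeding a logger sink (example.conf is a hypothetical file name; the properties follow the standard Flume single-node example):
# /root/flume-1.9.0/conf/example.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Run it and send a test line:
cd /root/flume-1.9.0
bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console
In another terminal, run nc localhost 44444, type something, and it should appear in the agent's console log.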
7. Install HBase
(1)tar -zxvf hbase-2.1.3-bin.tar.gz -C /root/    (the archive unpacks to /root/hbase-2.1.3)
(2)cd /root/hbase-2.1.3/conf
(3)vim hbase-env.sh
(4)Change:
export JAVA_HOME=/usr/local/lib/jdk1.8.0_181
export HBASE_MANAGES_ZK=false
(5)vim hbase-site.xml
Change:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- new since 0.98; earlier versions had no .port property and the default port was 60000 -->
<property>
<name>hbase.master.port</name>
<value>16000</value>
</property>
<!-- location of the ZooKeeper quorum -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>master:2181,hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181</value>
</property>
<!-- ZooKeeper data directory (where ZooKeeper keeps its data, including HBase's metadata) -->
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/root/zookeeper-3.4.10/zkData</value>
</property>
<property>
<name>hbase.master.maxclockskew</name>
<value>180000</value>
</property>
</configuration>
(6)vim regionservers
hadoop01
hadoop02
hadoop03
hadoop04
(7)Copy the ZooKeeper and Hadoop dependency jars matching your cluster versions into hbase/lib; check which hadoop and zookeeper jars are already present in HBase's lib directory.
(8)ln -s /root/hadoop-2.7.7/etc/hadoop/core-site.xml /root/hbase-2.1.3/conf/
(9)ln -s /root/hadoop-2.7.7/etc/hadoop/hdfs-site.xml /root/hbase-2.1.3/conf/
(10)Copy to the other machines:
scp -r hbase-2.1.3/ hadoop01:/root/
scp -r hbase-2.1.3/ hadoop02:/root/
scp -r hbase-2.1.3/ hadoop03:/root/
scp -r hbase-2.1.3/ hadoop04:/root/
(11)Start / stop individual daemons:
bin/hbase-daemon.sh start master    (or: start regionserver)
bin/hbase-daemon.sh stop master     (or: stop regionserver)
Start the HBase shell:
bin/hbase shell
(12)Web UI: port 16010
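With the cluster running (bin/start-hbase.sh on the master starts the master plus every host in regionservers), a quick smoke test in the HBase shell (test_tbl and cf are just example names):
create 'test_tbl', 'cf'
put 'test_tbl', 'row1', 'cf:a', 'value1'
scan 'test_tbl'
disable 'test_tbl'
drop 'test_tbl'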
8. Install Kafka
(1)tar -zxvf kafka_2.11-2.1.0.tgz -C /root/    (the archive unpacks to /root/kafka_2.11-2.1.0)
(2)cd /root/kafka_2.11-2.1.0/config
(3)vim server.properties
Change:
broker.id=0    # must be different on every machine
zookeeper.connect=master:2181,hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181
Add:
delete.topic.enable=true    # allow topics to be deleted
log.dirs=/root/kafka_2.11-2.1.0/logs    # where the broker stores its log segments (data)
(4)Configure the Kafka environment variables (see /etc/profile above)
(5)Copy to the other machines:
scp -r kafka_2.11-2.1.0/ hadoop01:/root/
scp -r kafka_2.11-2.1.0/ hadoop02:/root/
scp -r kafka_2.11-2.1.0/ hadoop03:/root/
scp -r kafka_2.11-2.1.0/ hadoop04:/root/
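After the copy, remember to give every broker a unique broker.id. A smoke test once all brokers are up ("test" is just an example topic name):
# on every node:
cd /root/kafka_2.11-2.1.0
bin/kafka-server-start.sh -daemon config/server.properties
# from any node:
bin/kafka-topics.sh --create --zookeeper master:2181 --replication-factor 3 --partitions 3 --topic test
bin/kafka-topics.sh --list --zookeeper master:2181
bin/kafka-console-producer.sh --broker-list master:9092 --topic test
bin/kafka-console-consumer.sh --bootstrap-server master:9092 --topic test --from-beginning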
9. Install Flink
(1)tar -zxvf flink-1.7.1-bin-hadoop27-scala_2.11.tgz -C /root/    (the archive unpacks to /root/flink-1.7.1)
(2)cd /root/flink-1.7.1/
(3)Start: bin/start-cluster.sh
Stop: bin/stop-cluster.sh
(4)Web UI: port 8081
(5)Cluster configuration:
vim conf/flink-conf.yaml
Change: jobmanager.rpc.address: master
(6)vim conf/slaves
hadoop01
hadoop02
hadoop03
hadoop04
(7)Copy to the other machines:
scp -r flink-1.7.1/ hadoop01:/root/
scp -r flink-1.7.1/ hadoop02:/root/
scp -r flink-1.7.1/ hadoop03:/root/
scp -r flink-1.7.1/ hadoop04:/root/
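After restarting the cluster from the master (bin/start-cluster.sh), a quick smoke test using the example jar shipped in the 1.7.1 binary distribution (path assumed from the standard layout):
cd /root/flink-1.7.1
bin/flink run examples/streaming/WordCount.jar
bin/flink list    # lists running/scheduled jobs; the finished WordCount also appears in the web UI on port 8081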
10. Install Spark
(1)tar -zxvf spark-2.2.3-bin-hadoop2.7.tgz -C /root/ && mv /root/spark-2.2.3-bin-hadoop2.7 /root/spark-2.2.3    (rename so the paths below match)
(2)cd /root/spark-2.2.3/conf
(3)cp spark-env.sh.template spark-env.sh
(4)vim spark-env.sh
(5)Append at the end:
##########################################Set-Cluster##############################################
export JAVA_HOME=/usr/local/lib/jdk1.8.0_181
export SPARK_MASTER_HOST=master
export SPARK_MASTER_PORT=7077
(6)cp slaves.template slaves
(7)vim slaves
hadoop01
hadoop02
hadoop03
hadoop04
(8)Copy to the other machines:
scp -r spark-2.2.3/ hadoop01:/root/
scp -r spark-2.2.3/ hadoop02:/root/
scp -r spark-2.2.3/ hadoop03:/root/
scp -r spark-2.2.3/ hadoop04:/root/
(9)Start the cluster:
cd /root/spark-2.2.3/
Start: sbin/start-all.sh
Stop: sbin/stop-all.sh
(10)Web UI: port 8080
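With the standalone cluster up, a quick SparkPi run against the master (the example jar name is the one shipped in the 2.2.3 binary distribution):
cd /root/spark-2.2.3
bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://master:7077 examples/jars/spark-examples_2.11-2.2.3.jar 100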
11. Quick start/stop scripts for ZooKeeper and Kafka
(1)Start ZooKeeper on all nodes:
#!/bin/sh
for host in master hadoop01 hadoop02 hadoop03 hadoop04
do
ssh $host "source /etc/profile;/root/zookeeper-3.4.10/bin/zkServer.sh start"
echo "$host zk is running"
done
(2)Stop ZooKeeper on all nodes:
#!/bin/sh
for host in master hadoop01 hadoop02 hadoop03 hadoop04
do
echo "$host zk is stopping"
ssh $host "source /etc/profile;/root/zookeeper-3.4.10/bin/zkServer.sh stop"
done
(3)start-kafka.sh:
#!/bin/bash
BROKERS="master hadoop01 hadoop02 hadoop03 hadoop04"
APPHOME="/root/kafka_2.11-2.1.0"
APP_NAME="kafka_2.11-2.1.0"
for i in $BROKERS
do
echo "Starting ${APP_NAME} on ${i} "
ssh ${i} "source /etc/profile; nohup sh ${APPHOME}/bin/kafka-server-start.sh ${APPHOME}/config/server.properties > /dev/null 2>&1 &"
if [[ $? -eq 0 ]]; then
echo "Starting ${APP_NAME} on ${i} is ok"
fi
done
echo All $APP_NAME are started
exit 0
(4)stop-kafka.sh (kafka-server-stop.sh in Kafka's bin directory needs to be modified first so it can find the broker process):
#!/bin/bash
BROKERS="master hadoop01 hadoop02 hadoop03 hadoop04"
APPHOME="/root/kafka_2.11-2.1.0"
APP_NAME="kafka_2.11-2.1.0"
for i in $BROKERS
do
echo "Stopping ${APP_NAME} on ${i} "
ssh ${i} "source /etc/profile;bash ${APPHOME}/bin/kafka-server-stop.sh"
if [[ $? -eq 0 ]]; then
echo "Stopping ${APP_NAME} on ${i} is done"
fi
done
echo All $APP_NAME are stopped
exit 0
12. Change the ZooKeeper log output path and roll logs by date
(1)Edit log4j.properties
vim /root/zookeeper-3.4.10/conf/log4j.properties
# Define some default values that can be overridden by system properties
zookeeper.root.logger=INFO,ROLLINGFILE
zookeeper.console.threshold=INFO
zookeeper.log.dir=.
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=DEBUG
zookeeper.tracelog.dir=.
zookeeper.tracelog.file=zookeeper_trace.log
…
…
#
# Add ROLLINGFILE to rootLogger to get log file output
# Log DEBUG level and above messages to a log file
# roll the log file once per day
log4j.appender.ROLLINGFILE=org.apache.log4j.DailyRollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=${zookeeper.log.threshold}
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file}
# Max log file size of 10MB
#log4j.appender.ROLLINGFILE.MaxFileSize=10MB
# uncomment the next line to limit number of backup files
#log4j.appender.ROLLINGFILE.MaxBackupIndex=10
(2)Edit bin/zkEnv.sh
if [ "x${ZOO_LOG_DIR}" = "x" ]
then
#log output path; no need to mkdir, ZooKeeper creates it at startup
ZOO_LOG_DIR="/root/zookeeper-3.4.10/logs"
fi
if [ "x${ZOO_LOG4J_PROP}" = "x" ]
then
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"
fi
13. Install Pig
Pig has two execution modes:
1. Local mode: works on the local Linux file system, similar to Hadoop's local mode
Command: pig -x local
Log line: Connecting to hadoop file system at: file:///
2. MapReduce mode (cluster mode): translates Pig Latin statements into MapReduce jobs and submits them to Hadoop
Log line: Connecting to hadoop file system at: hdfs://bigdata111:9000
Configure one environment variable: vim /etc/profile
PIG_CLASSPATH ---> points at Hadoop's configuration directory
PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop
export PIG_CLASSPATH
14. Install Sqoop
1)Download the package
2)Unpack it
tar -zxvf <sqoop tarball>.tar.gz
3)Edit the configuration
vi sqoop-env.sh
export HADOOP_COMMON_HOME=/root/hd/hadoop-2.8.4
export HADOOP_MAPRED_HOME=/root/hd/hadoop-2.8.4
export HIVE_HOME=/root/hd/hive
export ZOOCFGDIR=/root/hd/zookeeper-3.4.10/conf
4)Copy the MySQL driver jar into Sqoop's lib directory
5)Check that the installation works:
bin/sqoop help
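A simple connectivity check against MySQL before running imports (connection details are the ones used in the examples below):
bin/sqoop list-databases --connect jdbc:mysql://192.168.50.183:3306 --username root --password root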
6)Sqoop import commands
1)Import data from MySQL into HDFS:
bin/sqoop import --connect jdbc:mysql://192.168.50.183:3306/sq --username root --password root --table user --target-dir /sqoop/datas --delete-target-dir --num-mappers 1 --fields-terminated-by "\t"
2)Import from MySQL into HDFS with a filtering query:
bin/sqoop import --connect jdbc:mysql://192.168.50.183:3306/sq --username root --password root --target-dir /sqoop/selectdemo --delete-target-dir --num-mappers 1 --fields-terminated-by "\t" --query 'select * from user where id<=1 and $CONDITIONS'
3)Filter with a where clause:
bin/sqoop import --connect jdbc:mysql://192.168.50.183:3306/sq --username root --password root --target-dir /sqoop/selectdemo2 --delete-target-dir --num-mappers 1 --fields-terminated-by "\t" --table user --where "id<=1"
4)Import from MySQL into Hive
The Hive table needs to be created first.
bin/sqoop import --connect jdbc:mysql://hd09-01:3306/sq --username root --password root --table user1 --num-mappers 1 --hive-import --fields-terminated-by "\t" --hive-overwrite --hive-table user_sqoop
Problem: HiveConf class not found when importing into Hive
Fix:
vi ~/.bash_profile
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/root/hd/hive/lib/*
MySQL permission problem:
grant all privileges on *.* to root@'%' identified by "password";
flush privileges;
15. Install Storm
1)Preparation
ZooKeeper nodes: zk01 zk02 zk03
Storm nodes: storm01 storm02 storm03
2)Download the package
http://storm.apache.org/downloads.html
3)Upload it
4)Unpack it
5)Edit the configuration file
$ vi conf/storm.yaml
Set the ZooKeeper hostnames:
storm.zookeeper.servers:
- "hd09-01"
- "hd09-02"
- "hd09-03"
Set the nimbus (master) hostname:
nimbus.seeds: ["hd09-01"]
Set Storm's local data directory:
storm.local.dir: "/root/hd/storm/data"
Set the worker slot ports:
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
6)Start nimbus
$ storm nimbus &
7)Start the supervisors
$ storm supervisor &
8)Start the UI
$ storm ui &
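Once nimbus, the supervisors, and the UI are up, a smoke test with the starter topology bundled in the 1.1.0 binary distribution (jar path assumed from the standard layout; "wordcount" is just an example topology name):
storm jar examples/storm-starter/storm-starter-topologies-1.1.0.jar org.apache.storm.starter.WordCountTopology wordcount
storm list             # the wordcount topology should show as ACTIVE
storm kill wordcount   # remove it when done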
16. Install Oozie
Reference: https://blog.youkuaiyun.com/weixin_42003671/article/details/86625359