Prerequisites
- A virtual machine with Hadoop and Hive already installed
- The Sqoop tarball
Installation Steps
- Upload and extract the tarball
- Copy the MySQL JDBC driver jar and Hive's hive-exec jar into Sqoop's lib directory
- Configure Sqoop's environment variables (e.g. append the lines below to /etc/profile):
export SQOOP_HOME=/opt/software/sqoop/sqoop-1.4.7.bin__hadoop-2.6.0
export PATH=$PATH:$SQOOP_HOME/bin
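To verify the setup, reload the profile and ask Sqoop for its version (a quick sanity check, not part of the original steps):
source /etc/profile
sqoop version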
- To connect to MySQL remotely with Sqoop, the MySQL server must allow remote connections. In the MySQL shell, grant remote access and flush privileges:
grant all privileges on *.* to 'root'@'%' identified by 'cqrjxk39' with grant option;
flush privileges;
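Before importing anything, you can confirm that Sqoop can actually reach MySQL by listing the remote databases (same host and credentials as above):
sqoop list-databases --connect jdbc:mysql://192.168.1.104:3306/ --username root --password cqrjxk39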
Using Sqoop
- Import MySQL data into HDFS:
sqoop import --connect jdbc:mysql://192.168.1.104:3306/test5 --username root --password cqrjxk39 --table department --target-dir /sqoop/department --fields-terminated-by ","
Result: the table's rows land in /sqoop/department on HDFS as comma-delimited part files.
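To inspect the imported files, the standard HDFS shell commands work (shown here as a quick check):
hdfs dfs -ls /sqoop/department
hdfs dfs -cat /sqoop/department/part-m-*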
- A WHERE condition can be added when importing into HDFS:
sqoop import --connect jdbc:mysql://192.168.1.104:3306/test5 --username root --password cqrjxk39 --table department --target-dir /sqoop/department2 --fields-terminated-by "," --where "department_name='人事部'"
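A numeric filter works the same way; the target directory /sqoop/department3 and the department_id condition below are illustrative, not from the original:
sqoop import --connect jdbc:mysql://192.168.1.104:3306/test5 --username root --password cqrjxk39 --table department --target-dir /sqoop/department3 --fields-terminated-by "," --where "department_id>2"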
- Select specific columns during an HDFS import with a free-form query (--query):
sqoop import --connect jdbc:mysql://192.168.1.104:3306/test5 --username root --password cqrjxk39 --target-dir /sqoop/department4 --fields-terminated-by "," --query 'select department_id,department_name from department where department_name="人事部" and $CONDITIONS' --split-by department_id
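Two details about free-form queries: $CONDITIONS is a placeholder that Sqoop replaces with each mapper's split predicate, so the query must be single-quoted (in double quotes it would have to be escaped as \$CONDITIONS), and --split-by is only required when more than one mapper runs. With a single mapper the same import can be written as follows (the /sqoop/department5 directory is illustrative):
sqoop import --connect jdbc:mysql://192.168.1.104:3306/test5 --username root --password cqrjxk39 --target-dir /sqoop/department5 --fields-terminated-by "," --query 'select department_id,department_name from department where department_name="人事部" and $CONDITIONS' -m 1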
- Import MySQL data into a specified Hive table
Start Hive
Create the table (note the comma delimiter, which matches the Sqoop --fields-terminated-by below):
create table department(id int,name string) row format delimited fields terminated by ',';
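You can double-check the definition from the Hive CLI before importing; the FIELDS TERMINATED BY ',' clause must match the --fields-terminated-by value in the Sqoop command:
show create table department;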
In a cloned (second) terminal session, run:
sqoop import --connect "jdbc:mysql://192.168.1.104:3306/test1?characterEncoding=utf8" --username root --password cqrjxk39 --table department --hive-import --hive-database bigdata4 --hive-table department --fields-terminated-by ","
Result: the MySQL rows appear in the Hive table bigdata4.department.
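To verify the import, query the table from the Hive shell, or from bash with hive -e:
hive -e 'select * from bigdata4.department;'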