Preface
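This post follows the previous one, "Data Collection: Uploading Files from the Web to Hadoop HDFS". There are three requirements in total, and this post records how to import MySQL table data into HDFS from the web side, mainly using Sqoop2. I have already written "Sqoop2: Importing Data from MySQL into Hadoop HDFS", but that post did everything from the command line.
This time the work is done through the Java API, and there are quite a few pitfalls along the way.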
Environment
- OS Debian 8.7
- Hadoop 2.6.5
- SpringBoot 1.5.1.RELEASE
- MySQL 5.7.17 Community Server
- Sqoop 1.99.7
Project Dependencies
Without further ado, here is the pom.xml dependency file.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.infosys.sqoop</groupId>
<artifactId>sqoop</artifactId>
<version>1.0-SNAPSHOT</version>
<name>sqoop</name>
<packaging>jar</packaging>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>1.5.1.RELEASE</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<hadoop.version>2.6.5</hadoop.version>
<sqoop.version>1.99.7</sqoop.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>javax.servlet</groupId>
<artifactId>javax.servlet-api</artifactId>
<version>3.1.0</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>${mysql.version}</version>
</dependency>
<dependency>
<groupId>org.apache.sqoop</groupId>
<artifactId>sqoop-client</artifactId>
<version>${sqoop.version}</version>
<exclusions>
<exclusion>