sqoop使用心得

目前笔者趟了sqoop的坑,把mysql的数据导入hive

1. sqoop版本选取

    笔者最先尝试的是sqoop1.9,1.9的sqoop分为两个部分,server端和client端,官网说还有web端,server端是一个常驻进程,负责connect,job的管理和执行,client端可以远程连接server,向 server提交请求,和获取server状态,目前的版本不支持数据导入hive,所以选取1.4.6版本,其实两个版本都是把数据导入HDFS临时文件夹再move到hive目录下,1.9需要hive手动从hdfs load

2. sqoop安装配置

    直接从官网下载可运行的tar包解压到本地,配置sqoop-env.sh文件,配置如下

▽
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/work/hadoop/hadoop-2.7.3


export ZOOKEEPER_HOME=/home/work/hadoop/zookeeper-3.4.9
export HIVE_CONF_DIR=/home/work/hadoop/hive-2.1.1/conf
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/work/hadoop/hadoop-2.7.3/share/hadoop/mapreduce

#set the path to where bin/hbase is available
export HBASE_HOME=/home/work/hadoop/hbase-1.2.4

#Set the path to where bin/hive is available
#export HIVE_HOME=

#Set the path for where zookeper config dir is
export ZOOCFGDIR=/home/work/hadoop/zookeeper-3.4.9/conf

3. 创建表

Create table test (id int, username string,passwd string)
Row format delimited fields terminated by ‘\t’
Lines terminated by ‘\n’
Stored as textfile

4.数据导入

./sqoop import --connect jdbc:mysql://10.33.240.93:3308/test --username hadoop --password 7897  --query "select user_name ,user_id,identype from users where \$CONDITIONS" --hive-import --hive-database test_odbc --hive-table users --split-by user_id --fields-terminated-by '\01' --lines-terminated-by '\n' --target-dir /user/hive/tmp/users --hive-delims-replacement ' ' —create-hive-table

5. 问题

  • 数据导入完后发现数据会增多,因为hive的数据文件的列分隔符和行分隔符被包含在了字段中,可以用--hive-delims replacement ' ' 这个变量用比的字符代替字段中的分割符或用--hive-drop-import-delims删掉字段中的分割符
  • mysql用户权限问题,中途报1045错误
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值