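Fixing garbled Chinese text in Hive
Cause: the character set of Hive's metastore database (MySQL or similar) cannot store Chinese text, so comments come back garbled.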
- Check the character set used by the Hive metastore
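A minimal sketch of such a check, assuming the metastore lives in a MySQL schema named `hive` (the name used by the statements in this note; adjust it if yours differs). It lists the server's character set variables and the character set of each column that stores comments:

```sql
-- Assumption: the metastore schema is called `hive`.
use hive;

-- Server, database and connection character sets
show variables like 'character_set%';

-- Character set of the metastore columns that hold comments
select TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME
from information_schema.COLUMNS
where TABLE_SCHEMA = 'hive'
  and (TABLE_NAME, COLUMN_NAME) in (
    ('COLUMNS_V2', 'COMMENT'),
    ('TABLE_PARAMS', 'PARAM_VALUE'),
    ('PARTITION_PARAMS', 'PARAM_VALUE'),
    ('PARTITION_KEYS', 'PKEY_COMMENT'),
    ('INDEX_PARAMS', 'PARAM_VALUE')
  );
```

Any column reported with a character set other than utf8 is a candidate for the fix.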
- Run the following statements in Hive's metastore database (usually MySQL):
use hive;
-- Column comments
alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
-- Table comments
alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
-- Partition parameters and partition key comments
alter table PARTITION_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
alter table PARTITION_KEYS modify column PKEY_COMMENT varchar(4000) character set utf8;
-- Index parameters (skip this if your metastore schema has no INDEX_PARAMS table)
alter table INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
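To confirm the change took effect, one option is to create a throwaway table with Chinese comments from the Hive side and read them back. This is only a sketch: `comment_charset_check` is a hypothetical table name, and the statements are meant for the Hive CLI or Beeline, not MySQL.

```sql
-- Run in Hive, not in the metastore database.
-- `comment_charset_check` is a hypothetical name used only for this test.
create table comment_charset_check (
  id int comment '编号'
) comment '测试表';

-- The column and table comments should show as Chinese text, not '?'
describe formatted comment_charset_check;

-- Clean up the test table
drop table comment_charset_check;
```

Comments saved before the columns were altered were most likely stored as `?` already, so they would need to be set again after the fix.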
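- Also check the JDBC settings for the metastore connection in hive-site.xml. The encoding usually defaults to UTF-8, but it can be set explicitly in the connection URL: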
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://IP:3306/db_name?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
Or append "&characterEncoding=UTF-8" to the URL in the JDBC connection code, for example:
jdbc:mysql://IP:3306/db_name?useUnicode=true&characterEncoding=UTF-8