Hive Data Types

JNWsong

Published on 2020-12-27 22:31:03


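Copyright notice: this is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.youkuaiyun.com/java_creatMylief/article/details/111827620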


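This post covers Hive's data types: the basic types (numeric, date/time, string, boolean, binary) and the complex types (array, map, struct). It gives the definition of each type and shows how it is used through examples of creating tables, loading data, and querying.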


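Numeric types

TINYINT (1-byte integer)

SMALLINT (2-byte integer)

INT/INTEGER (4-byte integer)

BIGINT (8-byte integer)

FLOAT (4-byte single-precision floating point)

DOUBLE (8-byte double-precision floating point)

Example:

create table t_test(a string, b int, c bigint, d float, e double, f tinyint, g smallint);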
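As a minimal sketch (assuming Hive 0.13 or later, where a select without a from clause is allowed), the statements below inspect the table and show how integer literals and casts map onto the numeric types listed above. The literal values are illustrative only.

```sql
-- Show the column names and declared types of the table created above.
describe t_test;

-- Integer literals are INT by default; the suffixes Y, S and L mark
-- TINYINT, SMALLINT and BIGINT literals respectively.
select 100Y, 100S, 100, 100L;

-- Explicit conversion between numeric types with cast().
select cast(3.99 as int), cast('42' as bigint), cast(1 as double);
```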

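Date and time types

TIMESTAMP (a point in time: year, month, day, hour, minute and second, plus optional fractional seconds with up to nanosecond precision)

DATE (a calendar date: year, month and day only)

Example: suppose we have the following data file: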

1,zhangsan,1985-06-30

2,lisi,1986-07-10

3,wangwu,1985-08-09

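We can then create a table that maps onto this data:

create table t_customer(id int,name string,birthday date)
row format delimited fields terminated by ',';

Then load the data:

load data local inpath '/root/customer.dat' into table t_customer;

After that, the table can be queried and the birthday column is read as a DATE.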
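The original stops before showing a query, so here is a hedged sketch of what querying the DATE column might look like. It uses the built-in year() and month() functions; the cutoff date in the last query is an arbitrary value chosen for illustration.

```sql
-- Read back every row; birthday is returned as a DATE value.
select * from t_customer;

-- Extract parts of the date with built-in functions.
select id, name, year(birthday) as birth_year, month(birthday) as birth_month
from t_customer;

-- Filter on the date column; the string is cast explicitly to DATE.
select id, name
from t_customer
where birthday >= cast('1986-01-01' as date);
```

A field that cannot be parsed as a valid date is typically read as NULL, so it is worth checking the result of the first query after loading.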

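String types

VARCHAR(n), e.g. VARCHAR(20) (variable-length string with a declared maximum length between 1 and 65535; longer values are truncated)

CHAR(n) (fixed-length string; the maximum length is 255)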
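A small sketch of the truncation behaviour described above. The table t_str_demo and its columns are hypothetical names introduced only for this illustration; they do not appear in the original examples.

```sql
-- Casting to VARCHAR(5) keeps at most 5 characters; the rest is dropped silently.
select cast('hello world' as varchar(5));

-- Hypothetical table mixing the string types.
create table t_str_demo(code char(4), nickname varchar(20), remark string);
```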

Other types

BOOLEAN (boolean): true or false

BINARY (binary data)

Example, given the following data:

1,zs,28,true

2,ls,30,false

3,ww,32,false

4,lulu,18,true

create table t_p(id int,name string,age int,is_married boolean)
row format delimited fields terminated by ',';

select * from t_p where is_married;
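Because a BOOLEAN column is itself a condition, it can be used or negated directly in a where clause. A brief sketch, assuming the data above has been loaded into t_p:

```sql
-- Unmarried people: negate the boolean column directly.
select id, name from t_p where not is_married;

-- Count the rows for each boolean value.
select is_married, count(*) as cnt
from t_p
group by is_married;
```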

Complex (collection) types

array type

arrays: ARRAY<data_type>

Example: using the array type

Suppose the following data needs to be mapped to a Hive table:

战狼2,吴京:吴刚:余男,2017-08-16

三生三世十里桃花,刘亦菲:痒痒,2017-08-20

羞羞的铁拳,沈腾:玛丽:艾伦,2017-12-20

Idea: the lead actors are most conveniently mapped to an array.

Create the table:

create table t_movie(moive_name string,actors array<string>,first_show date)
row format delimited fields terminated by ','
collection items terminated by ':';

Load the data:

load data local inpath '/root/movie.dat' into table t_movie;

Queries:

select * from t_movie;

-- First listed actor of each movie (array indexes start at 0)
select moive_name,actors[0] from t_movie;

-- Movies whose cast includes a given actor
select moive_name,actors from t_movie where array_contains(actors,'吴刚');

-- Number of listed actors per movie
select moive_name,size(actors) from t_movie;
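Beyond indexing and array_contains, a common need is to turn each array element into its own row. The sketch below uses Hive's lateral view with explode() for that; the aliases a and actor are names chosen here for illustration.

```sql
-- One output row per (movie, actor) pair.
select moive_name, actor
from t_movie
lateral view explode(actors) a as actor;

-- Return the actors of each movie in sorted order.
select moive_name, sort_array(actors) as sorted_actors
from t_movie;
```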

map type

maps: MAP<primitive_type, data_type>

Suppose we have the following data:

1,zhangsan,father:xiaoming#mother:xiaohuang#brother:xiaoxu,28

2,lisi,father:mayun#mother:huangyi#brother:guanyu,22

3,wangwu,father:wangjianlin#mother:ruhua#sister:jingtian,29

4,mayun,father:mayongzhen#mother:angelababy,26

A map type can describe the family members in the data above.

Create-table statement:

create table t_person(id int,name string,family_members map<string,string>,age int)
row format delimited fields terminated by ','
collection items terminated by '#'
map keys terminated by ':';

Queries:

select * from t_person;

-- Get the value for a given key of the map column
select id,name,family_members['father'] as father from t_person;

-- Get all keys of the map column
select id,name,map_keys(family_members) as relation from t_person;

-- Get all values of the map column
select id,name,map_values(family_members) from t_person;

-- Get the first of those values
select id,name,map_values(family_members)[0] from t_person;

-- Combined example: find the users who have a brother

Approach 1:

select id,name,brother
from
(select id,name,family_members['brother'] as brother from t_person) tmp
where brother is not null;

Approach 2:

select * from t_person where array_contains(map_keys(family_members),'brother');
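As a further sketch on the same t_person table, a map can also be flattened into one row per key/value pair with lateral view explode(), and a lookup on a missing key returns NULL, so it can be tested directly in the where clause. The aliases f, relation and member are names chosen here for illustration.

```sql
-- One output row per (person, relation, family member).
select id, name, relation, member
from t_person
lateral view explode(family_members) f as relation, member;

-- Number of entries in each map.
select id, name, size(family_members) as member_count
from t_person;

-- Filter on a key without a subquery: a missing key yields NULL.
select id, name, family_members['brother'] as brother
from t_person
where family_members['brother'] is not null;
```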

struct type

struct: STRUCT<col_name : data_type, ...>

Suppose we have the following data:

1,zhangsan,18:male:beijing

2,lisi,28:female:shanghai

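The user information consists of age (integer), sex (string) and address (string).

A single column can describe the whole user record by using a struct.

Create the table:

create table t_person_struct(id int,name string,info struct<age:int,sex:string,addr:string>)
row format delimited fields terminated by ','
collection items terminated by ':';

Queries:

select * from t_person_struct;

-- Access one field of the struct with dot notation
select id,name,info.age from t_person_struct;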
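Struct fields can be used anywhere an ordinary column can, including filters. A short sketch, where the age threshold is an arbitrary value and the named_struct() call simply builds a struct value inline (assuming Hive 0.13 or later for a select without a from clause):

```sql
-- Select several struct fields and filter on one of them.
select id, name, info.sex, info.addr
from t_person_struct
where info.age > 20;

-- Build a struct value inline with named_struct(name1, value1, ...).
select named_struct('age', 18, 'sex', 'male', 'addr', 'beijing');
```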
