hive解释json格式成表及查询

在hive中对于json的数据格式,可以使用get_json_object或json_tuple先解析然后查询。

(转自:https://blog.youkuaiyun.com/li_canhui/article/details/85257859

(1)在Hive表中,如果某个字段是json类型,那么我们可以使用json函数来提取里面的值,具体的语法为

get_json_object(column_name, '$.key_name')

(2)如果是多层嵌套的json结构,那么语法如下

get_json_object(column_name, '$.first_level_key_name.second_level_key_name')

 

也可以直接在hive中创建json格式的表结构,这样就可以直接查询,实战如下(hive-2.3.0版本):

(转自:https://www.cnblogs.com/30go/p/8318542.html

1. 准备数据源

将以下内容保存为test.txt

{"student":{"name":"king","age":11,"sex":"M"},"class":{"book":"语文","level":2,"score":80},"teacher":{"name":"t1","class":"语文"}}
{"student":{"name":"wang","age":12,"sex":"M"},"class":{"book":"语文","level":2,"score":80},"teacher":{"name":"t1","class":"语文"}}
{"student":{"name":"test","age":13,"sex":"M"},"class":{"book":"语文","level":2,"score":80},"teacher":{"name":"t1","class":"语文"}}
{"student":{"name":"test2","age":14,"sex":"M"},"class":{"book":"语文","level":2,"score":80},"teacher":{"name":"t1","class":"语文"}}
{"student":{"name":"test3","age":15,"sex":"M"},"class":{"book":"语文","level":2,"score":80},"teacher":{"name":"t1","class":"语文"}}
{"student":{"name":"test4","age":16,"sex":"M"},"class":{"book":"语文","level":2,"score":80},"teacher":{"name":"t1","class":"语文"}}

2. 创建hive表

首先需要引入json的hive解析包
我使用的是cdh5.13.3,在这里下载了hive-hcatalog-core的包
hive-hcatalog-core下载地址
hive里是使用命令添加jar包

add jar hdfs:///user/hive/jars/json-serde-1.3.8-jar-with-dependencies.jar;

添加了之后便可根据json的内容建表了
注意serde格式大小写不能写错: org.apache.hive.hcatalog.data.JsonSerDe

create external table if not exists dw_stg.student(
student map<string,string> comment "学生信息",
class map<string,string> comment "课程信息",
teacher map<string,string> comment "授课老师信息"
)
comment "学生课程信息"
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as textfile;

3. 上传数据

将test.txt上传到刚才创建的student目录

hdfs dfs -put test.txt /user/hive/warehouse/dw_stg.db/student/

4. 使用hql查询

查询所有信息记录:

查询字段student信息

查询字段class信息

查询学生姓名为test4的所有记录

取json串中某个值可以使用 student['name'] ,如下:

select 
    student['name'] as stuName,
    class['book'] as cls_book, 
    class['score'] as cls_score,
    teacher['name'] as tech_name 
from student 
where student['name'] = 'test4';

总体看起来,比使用get_json_object或json_tuple解析方便多了

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值