hivesql常见的与不常见的日期格式转换 (持续更新)

最新推荐文章于 2025-05-07 11:14:55 发布

原创

最新推荐文章于 2025-05-07 11:14:55 发布 · 4.9k 阅读

25 ·

CC 4.0 BY-SA版权

文章标签：

#mysql #nosql #sql #hive

本文记录了在Hive中处理日期格式转换的各种方法，包括将非标准格式如'03/16/yyyy HH:mm:ss'转换为标准格式，以及处理不同长度的时间戳。还探讨了从毫秒时间戳到日期的转换，如除以1000或字符串切割，并提到了可能遇到的问题和解决方案。目前内容更新至20200306，后续会补充更多日期转换示例。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

12/31/2015 00:00:00
12/31/2015 00:00:00
1/4/2016 00:00:00
1/4/2016 00:00:00

故事起源于某日接手了 ↑ 这种日期格式的数据 mm/dd/yyyy HH:mm:ss ，

需要将数据放进hive里，肯定得转成标准格式呀。baidu了很久没找到所需函数，只找到了类似的：

16/Mar/2017:12:25:01 +0800 转成正常格式（yyyy-MM-dd hh:mm:ss）

select from_unixtime(to_unix_timestamp('16/Mar/2017:12:25:01 +0800', 'dd/MMM/yyy:HH:mm:ss Z'))

所以本文将持续记录各种常用/不常用的各类日期转换函数。(后期视情况可能会加入自定义函数udf)

[ps:当前更新日期20200306]

首先，mm/dd/yyyy HH:mm:ss 这种格式，You can try this:

select from_unixtime(unix_timestamp('9/26/2011 00:00:00' ,'MM/dd/yyyy HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss');

timstamp毫秒时间戳(13位) 格式化：

毫秒时间戳是13位的，所以可以选择两种方法：1.除以1000，再格式化

select from_unixtime(cast(updata_time/1000 as bigint), 'yyyy-MM-dd') date
from access_log;

2.字符串切割掉后3位，再格式化

select from_unixtime(cast(substring(updata_time, 1, 10) as bigint),'yyyy-MM-dd HH') date
from access_log;

因为updata_time为bigint格式，需要转化为秒，结果会变成double格式，所以需要用cast转换

from_unixtime(cast(updata_time/1000 as bigint)) as updata_time

hive里不填写格式会默认转化为标准时间格式，所以也可以写成自己所需格式：

from_unixtime(cast(updata_time/1000 as bigint),'yyyy-MM-dd HH:mm:ss') as updata_time

///描述一种比较坑的情况：

时间戳转时间：数据中我们希望看到的是 yyyy-MM-dd HH:mm:ss，然而实际传来的数据可能是‘1234567890’，或者‘1234567890’

from_unixtime(cast(SUBSTR(updata_time, 1, 10) as bigint) , 'yyyy-MM-dd HH:mm:ss')

这行代码即可，槽点在于输入是10位的 bigint or double (有时候时间戳是按毫秒级，所以会有13位)：

1.如果输入 updata_time 是 int or double 类型：直接 updata_time/1000 即可；

2.如果输入是string，则： cast(SUBSTR(updata_time, 1, 10) as bigint

时间格式转换：

from_unixtime(unix_timestamp('2018-11-05','yyyy-MM-dd'),'yyyy/MM/dd')

date_format('2015-04-08 10:10:01', 'yyyy-MM')

concat(substr('2018-09-05',1,4),substr('2018-09-05',6,2),substr('2018-09-05',9,2))

（时间仓促，有时间了再补上这三种方法的说明）

常用日期格式转换part1:

固定日期转换成时间戳
select unix_timestamp('2016-08-16','yyyy-MM-dd') --1471276800
select unix_timestamp('20160816','yyyyMMdd') --1471276800
select unix_timestamp('2016-08-16T10:02:41Z', "yyyy-MM-dd'T'HH:mm:ss'Z'") --1471312961

16/Mar/2017:12:25:01 +0800 转成正常格式（yyyy-MM-dd hh:mm:ss）
select from_unixtime(to_unix_timestamp('16/Mar/2017:12:25:01 +0800', 'dd/MMM/yyy:HH:mm:ss Z'))

时间戳转换程固定日期
select from_unixtime(1471276800,'yyyy-MM-dd') --2016-08-16
select from_unixtime(1471276800,'yyyyMMdd') --20160816
select from_unixtime(1471312961) --    2016-08-16 10:02:41
select from_unixtime( unix_timestamp('20160816','yyyyMMdd'),'yyyy-MM-dd')  --2016-08-16
select date_format('2016-08-16','yyyyMMdd') --20160816

返回日期时间字段中的日期部分
select to_date('2016-08-16 10:03:01') --2016-08-16
取当前时间
select from_unixtime(unix_timestamp(),'yyyy-MM-dd HH:mm:ss')
select from_unixtime(unix_timestamp(),'yyyy-MM-dd') 
返回日期中的年
select year('2016-08-16 10:03:01') --2016
返回日期中的月
select month('2016-08-16 10:03:01') --8
返回日期中的日
select day('2016-08-16 10:03:01') --16
返回日期中的时
select hour('2016-08-16 10:03:01') --10
返回日期中的分
select minute('2016-08-16 10:03:01') --3
返回日期中的秒
select second('2016-08-16 10:03:01') --1

返回日期在当前的周数
select weekofyear('2016-08-16 10:03:01') --33

返回结束日期减去开始日期的天数
select datediff('2016-08-16','2016-08-11') 

返回开始日期startdate增加days天后的日期
select date_add('2016-08-16',10)

返回开始日期startdate减少days天后的日期
select date_sub('2016-08-16',10)

返回当天三种方式
SELECT CURRENT_DATE;
--2017-06-15
SELECT CURRENT_TIMESTAMP;--返回时分秒
--2017-06-15 19:54:44
SELECT from_unixtime(unix_timestamp());
--2017-06-15 19:55:04
返回当前时间戳
Select current_timestamp--2018-06-18 10:37:53.278

返回当月的第一天
select trunc('2016-08-16','MM') --2016-08-01
返回当年的第一天
select trunc('2016-08-16','YEAR') --2016-01-01

part2:

--把 20180123 转换成 2018-01-23，可以使用：
select from_unixtime(unix_timestamp('${p_date}','yyyymmdd'),'yyyy-mm-dd') from table;

1.日期函数UNIX时间戳转日期函数
from_unixtime语法:from_unixtime(bigint unixtime[, stringformat]) 
返回值: string 说明: 转化UNIX时间戳（从1970-01-0100:00:00 UTC到指定时间的秒数）到当前时区的时间格式
举例：
select from_unixtime(1323308943,'yyyyMMdd') from dual;
20111208

2.获取当前UNIX时间戳函数: unix_timestamp语法:   unix_timestamp() 
返回值: bigint