Common Hive Functions

License: CC 4.0 BY-SA

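This article covers the IF function, the CASE conditional expression, NVL, and the trim family of functions, followed by regular-expression replacement, string concatenation, datetime-to-date conversion, left and right padding, and the row_number ranking function.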


IF function: if

Syntax: if(testCondition, valueTrue, valueFalseOrNull)
Description: returns valueTrue when testCondition is true; otherwise returns valueFalseOrNull.
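A minimal sketch of if in use. The literal values are made up for illustration, and the FROM-less SELECT form assumes Hive 0.13 or later.

```sql
-- Condition is false, so the third argument is returned.
SELECT if(1 = 2, 100, 200);                      -- 200

-- Condition is true, so the second argument is returned.
SELECT if(1 = 1, 'yes', 'no');                   -- yes

-- A common use: turning a NULL check into a label.
SELECT if(NULL IS NULL, 'missing', 'present');   -- missing
```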


Conditional expression: CASE

Syntax: CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END
Description: if a equals b, returns c; if a equals d, returns e; otherwise returns f.

Example:
-- Derive the parent department id from prt_distribution_flag_name:
-- '直销' (direct sales) maps to -1, '分销' (distribution) maps to -2
select
    case t1.prt_distribution_flag_name
        when '直销' then '-1'
        when '分销' then '-2'
        else '未知'  -- 'unknown'
    end as parent_dept_id,
    t2.distribution_flag_key as dept_id,
    t1.*
from
    (select distinct
         lvl,
         full_name,
         prt_distribution_flag_name,
         distribution_flag_name
     from dw.ol_power_distribution_type_dim
     where lvl = 2 and dt = '20180813') t1
left join dw.kn1_dim_distribution t2
    on t1.distribution_flag_name = t2.distribution_flag_name
   and t1.prt_distribution_flag_name = t2.prt_distribution_flag_name

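Pitfall: Query 20180815_101245_23113_5iy2v failed: line 2:77: All CASE results must be the same type: bigint
Every branch of a CASE expression must return the same type. This is presumably why '-1' and '-2' are written as strings in the query above: they have to match the string fallback '未知'.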
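The type mismatch behind the error above can be sketched as follows. The inline one-row subquery and the column name flag are hypothetical, used only so the statement is self-contained. It assumes an engine that enforces a single result type for CASE, as the engine that produced the error message does.

```sql
-- Rejected by a strictly typed engine: two integer branches, one string branch.
--   CASE flag WHEN '直销' THEN -1 WHEN '分销' THEN -2 ELSE '未知' END

-- Accepted: every branch is a string.
SELECT CASE flag
           WHEN '直销' THEN '-1'
           WHEN '分销' THEN '-2'
           ELSE '未知'
       END AS parent_dept_id
FROM (SELECT '直销' AS flag) t;   -- hypothetical inline row; result: -1 (as a string)
```

If the column has to stay numeric, the alternative is to make the fallback numeric as well (or NULL) rather than a text label.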


NVL

Syntax: nvl(value, default_value)
Description: returns default_value if value is NULL; otherwise returns value.
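A short sketch of both cases, using made-up literals:

```sql
-- value is NULL, so the default is returned.
SELECT nvl(NULL, 'default');    -- default

-- value is not NULL, so it is returned unchanged.
SELECT nvl('abc', 'default');   -- abc
```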

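Whitespace-trimming functions: trim, ltrim, rtrim

Syntax:
trim(A): removes spaces from both ends of the string
ltrim(A): removes spaces from the left end of the string
rtrim(A): removes spaces from the right end of the string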
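To make the removed spaces visible, the sketch below wraps each result in square brackets with concat. The input string is an arbitrary example.

```sql
SELECT concat('[', trim('  abc  '),  ']');   -- [abc]
SELECT concat('[', ltrim('  abc  '), ']');   -- [abc  ]
SELECT concat('[', rtrim('  abc  '), ']');   -- [  abc]
```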


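Regular-expression replacement function: regexp_replace

Syntax: regexp_replace(A, B, C)
Description: replaces every part of string A that matches the Java regular expression B with C.
Note: some patterns require escape characters. The function is similar to regexp_replace in Oracle.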
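A few illustrative calls, including one that needs escaping. In a Hive string literal the backslash must itself be escaped, so a literal dot is matched with '\\.'. The input strings are examples only.

```sql
-- Alternation: remove every "oo" or "ar".
SELECT regexp_replace('foobar', 'oo|ar', '');    -- fb

-- Strip the dashes from a date string.
SELECT regexp_replace('2018-08-13', '-', '');    -- 20180813

-- "." is a regex metacharacter, so it is escaped to match a literal dot.
SELECT regexp_replace('a.b.c', '\\.', '-');      -- a-b-c
```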


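String concatenation function: concat

Syntax: concat(str1, str2, str3, ... strN)
Description: returns the input strings joined together; any number of input strings is supported.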
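A brief sketch with made-up literals. One thing to watch for: if any argument is NULL, concat returns NULL, so nullable inputs are often wrapped in nvl first.

```sql
SELECT concat('abc', 'def', 'gh');        -- abcdefgh

-- A single NULL argument makes the whole result NULL.
SELECT concat('abc', NULL, 'def');        -- NULL

-- Guarding a nullable argument with nvl keeps the rest of the string.
SELECT concat(nvl(NULL, ''), 'def');      -- def
```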

Datetime-to-date function: to_date

Syntax: to_date(expr)
Description: returns the date part of a datetime expression.
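A minimal example with an arbitrary timestamp string. Depending on the Hive version the result is typed as a string or as a date, but the displayed value is the same.

```sql
-- Drops the time-of-day portion and keeps yyyy-MM-dd.
SELECT to_date('2018-08-13 10:12:45');   -- 2018-08-13
```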


Left-padding function: lpad

Syntax: lpad(str, len, pad)
Description: pads str on the left with pad until it is len characters long.
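A sketch of typical calls with made-up inputs. The pad string is repeated as often as needed and cut off at len; if str is already longer than len, the result is shortened to len characters.

```sql
-- Zero-pad a number to three digits.
SELECT lpad('7', 3, '0');         -- 007

-- A multi-character pad is repeated and cut to fit.
SELECT lpad('abc', 10, 'td');     -- tdtdtdtabc

-- str longer than len: the result is shortened to len characters.
SELECT lpad('abcdef', 3, '0');    -- abc
```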


Right-padding function: rpad

Syntax: rpad(str, len, pad)
Description: pads str on the right with pad until it is len characters long.
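The same made-up inputs as one would use for lpad, padded on the right instead:

```sql
SELECT rpad('7', 3, '0');         -- 700

-- A multi-character pad is repeated and cut to fit.
SELECT rpad('abc', 10, 'td');     -- abctdtdtdt
```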


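Ranking function: row_number() over()

Syntax:
row_number() over (partition by xxx order by xxx) rank
(rank is only the alias of the generated column; any other name works, such as rn in the example below)
Description: numbers the rows within each partition sequentially, starting from 1, in the specified order.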

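Usage example

Goal: deduplicate the tmp_test table by col1, keeping for each col1 value the row with the largest clo2, and load the result into the tmp_test_c table.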

Create the table:
    create table tmp_test(col1 string,clo2 string);

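Insert test data (this assumes a one-row helper table named dual already exists; Hive does not provide one by default):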
    insert into table tmp_test
    select 1,'str1' from dual
    union all
    select 2,'str2' from dual
    union all
    select 3,'str3' from dual
    union all
    select 3,'str31' from dual
    union all
    select 3,'str33' from dual
    union all
    select 4,'str41' from dual
    union all
    select 4,'str42' from dual;

View the data:

[Figure: query output listing the rows of tmp_test]

Query the data with row_number(), numbering the rows within each col1 group by clo2 in descending order:

hive (default)> select t.*,row_number() over(partition by col1 order by clo2 desc) rn
              > from tmp_test t;

Extract the data according to the goal stated above:

hive (default)> select *
              > from
              > (
              > select t.*,row_number() over(distribute by col1 sort by clo2 desc) rn
              > from tmp_test t
              > )tt
              > where tt.rn=1;
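The query above only selects the deduplicated rows. One way to finish the stated goal of loading them into tmp_test_c is sketched below. It assumes tmp_test_c does not exist yet and uses CREATE TABLE ... AS SELECT. The helper column rn is dropped from the output by listing the columns explicitly.

```sql
-- Assumption: tmp_test_c has not been created beforehand.
CREATE TABLE tmp_test_c AS
SELECT tt.col1, tt.clo2
FROM (
    SELECT t.col1,
           t.clo2,
           -- rn = 1 marks the row with the largest clo2 within each col1
           row_number() OVER (PARTITION BY t.col1 ORDER BY t.clo2 DESC) AS rn
    FROM tmp_test t
) tt
WHERE tt.rn = 1;
```

Because clo2 is declared as string, "largest" means largest in string order. With the test data above this keeps str1, str2, str33 and str42 for col1 values 1 through 4.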