Hive支持以下权限:
开启配置:
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.security.authorization.createtable.owner.grants</name>
<value>ALL</value>
</property>
示例说明:
1) 创建角色/删除角色
hive> CREATE ROLE role_name;
hive> DROP ROLE role_name;
2) 创建角色/赋权
hive> CREATE ROLE asmp_role;
hive> GRANT ROLE asmp_role TO USER asmp;
hive> GRANT SELECT ON TABLE ods___omdp___omd.tm_model TO ROLE asmp_role;
hive> SELECT * FROM ods___omdp___omd.tm_model;
3) 查看用户对数据库权限/用户对表的权限
hive> SHOW GRANT USER asmp ON DATABASE default;
hive> SHOW GRANT USER asmp ON TABLE default.tm_model;
Hive删除和修复分区:
其中msck repair table是把手动新增的分区文件夹映射到hive表,手动删除的分区文件夹并不生效,需要alter table删除hive分区
#查询是否有数据
select * from tt_repair_part_tmp;
#查看是否有分区
show partitions tt_repair_part_tmp;
#删除分区
alter table tt_repair_part_tmp drop if exists partition(partition_brand='vw');
alter table tt_repair_part_tmp drop if exists partition(partition_brand='skd');
#查看是否有分区
show partitions tt_repair_part_tmp;
#手动新增分区文件夹后,然后修复分区(删除分区文件夹不生效)
msck repair table asmp.tt_repair_part_tmp;
#查看是否有分区
show partitions tt_repair_part_tmp;
Hive -e执行命令:
nohup hive -e "use db;select * from table limit 100;" > ~/log 2>&1 > ~/test &
nohup hive -e "use db;select * from table limit 100;" 1> ~/log 2> ~/errlog &
Hive自定义函数:
add jar /usr/local/hive/test.jar
create temporary function my_udf as 'com.lhx.hive.function.MYUDF';
一行变多行:(explode)
select name,single_alias from test lateral view explode(split(alias,' ')) test_alias as single_alias;
多行变一行:(collect_set)
select name,concat_ws(',',collect_set(subject)) from student group by name;
Hive更改配置文件执行命令:
hive --config /usr/local/hive/conf
或者直接更改临时变量:
export HIVE_CONF_DIR=/usr/local/hive/conf
Hive禁用笛卡尔积方法:set hive.mapred.mode=strict;
Hive设置多个reduce方法:set mapred.reduce.tasks = 20;
开启任务依赖并发执行:set hive.exec.parallel=true;
设置reduce内存大小:
set mapreduce.reduce.memory=4096;
set mapreduce.reduce.java.opts=-Xmx3584m;
map端join默认设置:
set hive.auto.convert.join=true;
set hive.ignore.mapjoin.hint=true;
set hive mapjoin.smalltable.filesize=2500000;
(1) order by/distribute by/sort by/cluster by区别
order by #全局排序
sort by #局部排序,单独reduce中进行排序
distribute by #分桶排序,相同KEY的记录被划分到一个Reduce
cluster by =distribute by+ sort by #分桶排序
cluster by id,name 默认是升序,且不可指定asc或desc
group by #单纯分组,一般和AVG()/COUNT()/MAX()组合
(2)窗口函数
序号函数:row_number() / rank() 111446 / dense_rank()111223
分布函数:percent_rank() / cume_dist()
前后函数:lag() / lead()
头尾函数:first_val() / last_val()
其他函数:nth_value() / nfile()
SELECT id,name,
RANK() OVER(PARTITION BY id ORDER BY dt desc) AS rn1,
DENSE_RANK() OVER(PARTITION BY id ORDER BY dt desc) AS rn2,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY dt DESC) AS rn3
FROM table_test;
Coalesce(expr1, expr2, expr3…… exprn)
–显示第一个非空值
NVL(expr1,expr2)
–如果第一个参数为空那么显示第二个参数的值
NVL2(COMM,0,1)
–如果COMM为空,返回0,如果COMM不为空,返回1
IF(a=a,1,0)
–当条件为TRUE时,返回1;否则返回0
NULLIF(exp1,expr2)
–如果exp1和exp2相等则返回空NULL
Hive建表语句的注释(comment)中文乱码(解决方法:修改hive的元数据信息)
//修改表字段注解和表注解
alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
Hive删除表中数据,保留表结构
truncate table table_name;
delete from table_name where 1=1;
(truncate不能删除外部表!外部表里的数据并不是存放在Hive Metastore中)
Hive删除整个表
drop table table_name;
drop table table_name purge;