Common Hive Commands (Worth Bookmarking)

偷捧时间煮雨

Article link: https://blog.youkuaiyun.com/weixin_40461486/article/details/126223584

1. Start the Hive CLI

hive

2. List databases

show databases;

3. Switch to a database

use database_name;

4. List all tables

show tables;

5. Show a table's structure

desc table_name;

6. Query table data

select * from table_name;

7. Show a table's partitions

show partitions table_name;

8. Create a database

CREATE SCHEMA userdb;

9. Drop a database

DROP DATABASE IF EXISTS userdb;
DROP SCHEMA userdb;

10. Create a table

use userdb;
create table tb_name (id int, name string);

Create a table with the same structure as an existing table

create table new_tb like existing_tb;

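Create a table with the same structure and data as an existing table, which effectively copies the table

create table new_tb as select * from existing_tb;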
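The difference between the two forms is easiest to see side by side. A minimal sketch, assuming hypothetical table names (`demo_src`, `demo_like`, `demo_copy`) that are not from the original article:

```sql
-- All table names here are hypothetical, for illustration only.
create table demo_src (id int, name string);

-- LIKE copies only the table definition; demo_like starts with no rows.
create table demo_like like demo_src;

-- CREATE TABLE ... AS SELECT copies the columns and the rows the query returns.
create table demo_copy as select * from demo_src;
```

Use `like` when you only need an empty table with the same schema, and `as select` when you also want the data.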

Create a managed (internal) table, with tab as the field delimiter

create table tb_name(name1 int,name2 string) row format delimited fields terminated by '\t';

Create an external table, with tab as the field delimiter

create external table tb_name(name1 int,name2 string) row format delimited fields terminated by '\t';

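Create a partitioned table (partitioned by id int). A partition column must not also appear in the regular column list, so id is declared only in the partitioned by clause.

create table tb_name(
	name string
) partitioned by (id int) 
	row format delimited fields terminated by '\t';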

Regular tables vs. partitioned tables: use a partitioned table when large volumes of data are added continually.

Converting between managed and external tables
Managed table to external table

alter table table_name set tblproperties('EXTERNAL'='TRUE');

External table to managed table

alter table table_name set tblproperties('EXTERNAL'='FALSE');
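To confirm whether a table is currently managed or external, you can inspect its metadata before and after the conversion. A minimal sketch, assuming a hypothetical table named `demo_logs`; the exact behavior may vary with the Hive version and its configuration:

```sql
-- demo_logs is a hypothetical table used only for illustration.
create table demo_logs (id int, msg string);

-- The "Table Type" row of the output shows MANAGED_TABLE or EXTERNAL_TABLE.
describe formatted demo_logs;

-- Convert to an external table, then check the type again.
alter table demo_logs set tblproperties('EXTERNAL'='TRUE');
describe formatted demo_logs;
```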

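Drop a partition
Note: dropping a partition of an external table removes only the metadata, so you also need to delete the files (hadoop fs -rm -r -f hdfspath)

alter table table_name drop if exists partition (d='2016-07-01');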

11. Load data

Load data from the local file system into a table (the file is copied into the table's storage directory)

load data local inpath '/root/a.txt' into table tb_name;

Load data from HDFS into a table (the file is moved, not copied)

load data inpath '/target.txt' into table tb_name;

When loading data into a partitioned table, you must specify the target partition

load data local inpath './book.txt' overwrite into table tb_name partition (Id = 10);
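The partitioned load can be traced end to end as follows. This is a sketch under stated assumptions: the table name `book_by_id` is hypothetical, and `./book.txt` is assumed to be a local text file with one name per line.

```sql
-- book_by_id is a hypothetical table; the partition column id is declared
-- only in the partitioned by clause, not in the regular column list.
create table book_by_id (name string)
partitioned by (id int)
row format delimited fields terminated by '\t';

-- Load the local file into partition id=10, replacing any existing data there.
load data local inpath './book.txt' overwrite into table book_by_id partition (id = 10);

-- The new partition should now be listed.
show partitions book_by_id;

-- Filtering on the partition column lets Hive read only that partition.
select name from book_by_id where id = 10;
```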

12. Rename a table

ALTER TABLE old_table_name RENAME TO new_table_name;

13. Drop a table

drop table table_name;
drop table if exists table_name;

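14. Insert data into a table

Inserting into a partitioned table
(1) Overwrite the data in an existing partition; if the specified partition does not exist, it is created and the data is inserted

INSERT OVERWRITE TABLE db_name.target_table PARTITION(dt='2018-09-12',name='Tom', ...)
SELECT ... FROM db_name.source_table WHERE ...

(2) Append data to an existing partition (existing data is not overwritten)

INSERT INTO TABLE db_name.target_table PARTITION(dt='2018-09-12',name='Tom',...)
SELECT ... FROM db_name.source_table WHERE ...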

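Inserting into a non-partitioned table
(1) Overwrite the existing data in the table. The command is the same as for a partitioned table, minus the trailing PARTITION(dt='...', name='...') clause

INSERT OVERWRITE TABLE db_name.target_table 
SELECT ... FROM db_name.source_table WHERE ...

(2) Append data to the existing table (existing data is not overwritten)

INSERT INTO TABLE db_name.target_table 
SELECT ... FROM db_name.source_table WHERE ...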
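A concrete version of the partitioned inserts may make the templates above easier to follow. This is only a sketch: the database `demo`, the tables `events_raw` and `events`, and their columns are all hypothetical. With a static partition value, the partition column is left out of the SELECT list.

```sql
-- Hypothetical database and tables, for illustration only.
create database if not exists demo;

create table if not exists demo.events_raw (user_name string, event_type string, dt string);

create table if not exists demo.events (user_name string, event_type string)
partitioned by (dt string);

-- Replace whatever is in partition dt='2018-09-12' with the query result;
-- the partition is created if it does not exist yet.
insert overwrite table demo.events partition (dt = '2018-09-12')
select user_name, event_type
from demo.events_raw
where dt = '2018-09-12';

-- Append more rows to the same partition; existing rows are kept.
insert into table demo.events partition (dt = '2018-09-12')
select user_name, event_type
from demo.events_raw
where dt = '2018-09-12' and event_type = 'login';
```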

15. Alter a table's structure

Add a column

alter table table_name add columns(newscol1 int comment 'newly added');

Modify a column

alter table table_name change col_name new_col_name new_type;

Drop a column (list only the columns you want to keep in COLUMNS)

alter table table_name replace columns(col1 int,col2 string,col3 string);
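The three operations can be tried in sequence on a scratch table. A minimal sketch, assuming a hypothetical table `demo_alter` with a deliberately misspelled column; note that these statements change only the table's metadata, not the data files already stored.

```sql
-- demo_alter and its columns are hypothetical.
create table demo_alter (id int, nmae string, tmp string);

-- Add a column at the end of the column list.
alter table demo_alter add columns (created_at string comment 'creation time');

-- Rename the misspelled column; the type must be restated.
alter table demo_alter change nmae name string;

-- "Drop" tmp by listing only the columns to keep.
alter table demo_alter replace columns (id int, name string, created_at string);

-- Check the resulting column list.
describe demo_alter;
```

Because text-format data is read by position, dropping a column in the middle of a table that already holds data can shift which stored field each remaining column reads, so it is safest on empty tables or on trailing columns.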

16. Column types

tinyint, smallint, int, bigint, float, decimal, boolean, string

17. Complex data types

struct, array, map
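The complex types are declared with angle-bracket syntax and each has its own access notation. A minimal sketch, assuming a hypothetical table `demo_people` and illustrative delimiters for a text file:

```sql
-- demo_people and its columns are hypothetical.
create table demo_people (
	name string,
	address struct<city:string, zip:string>,
	phones array<string>,
	scores map<string, int>
)
row format delimited
	fields terminated by '\t'
	collection items terminated by ','
	map keys terminated by ':';

select
	name,
	address.city,     -- struct field, accessed with dot notation
	phones[0],        -- array element, accessed by 0-based index
	scores['math']    -- map value, accessed by key
from demo_people;
```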

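18. Bucketed tables

Within each table or partition, Hive can further organize the data into buckets; buckets divide the data at a finer granularity.
On Hive versions before 2.0, the following property must be set to use buckets (the property was removed in Hive 2.0, where bucketing is always enforced):

set hive.enforce.bucketing = true;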

Create a bucketed table:

-- Split into 4 buckets by id
create table tb_name ( 
	id int, 
	name string 
) clustered by (id) into 4 buckets 
	row format delimited fields terminated by ',';

Insert data with a query

insert into tb_name1 select * from tb_name;
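Putting the bucketing steps together, and reading back a single bucket, might look like the sketch below. The table names `demo_users_src` and `demo_users_bucketed` are hypothetical, and the `tablesample` query is an added illustration of one use of buckets rather than something from the original article.

```sql
-- Hypothetical tables, for illustration only.
-- On Hive versions before 2.0, uncomment the next line first:
-- set hive.enforce.bucketing = true;

create table demo_users_src (id int, name string)
row format delimited fields terminated by ',';

create table demo_users_bucketed (id int, name string)
clustered by (id) into 4 buckets
row format delimited fields terminated by ',';

-- Populate the bucketed table with a query so that Hive assigns rows to buckets.
insert into table demo_users_bucketed
select id, name from demo_users_src;

-- Read only the first of the 4 buckets.
select * from demo_users_bucketed tablesample (bucket 1 out of 4 on id);
```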

19. Create a view

create view v_name as 
	select table1.column1, table2.column2 
	from table1 join table2 
	on table1.column1 = table2.column2;
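A view stores a query, not data, so it is queried like a table and dropped without affecting the underlying tables. A minimal sketch, assuming hypothetical tables `demo_orders` and `demo_customers`:

```sql
-- Hypothetical tables and view, for illustration only.
create table demo_orders (order_id int, customer_id int, amount double);
create table demo_customers (customer_id int, customer_name string);

-- The view joins the two tables on their shared key.
create view v_order_customers as
select o.order_id, c.customer_name, o.amount
from demo_orders o
join demo_customers c on o.customer_id = c.customer_id;

-- Query the view like a regular table.
select * from v_order_customers where amount > 100;

-- Dropping the view leaves demo_orders and demo_customers untouched.
drop view if exists v_order_customers;
```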

Summary

Those are the Hive commands covered in this article. Happy learning, and let's grow together!