显示表信息
SHOW TABLES;
SHOW TABLES 'page.*';
SHOW PARTITIONS page_view;
DESCRIBE page_view;
DESCRIBE EXTENDED page_view;
DESCRIBE EXTENDED page_view PARTITION (ds='2008-08-08');
导入数据
本地导入: LOAD DATA LOCAL INPATH `/tmp/pv_2008-06-08_us.txt` INTO TABLE page_view PARTITION(date='2008-06-08', country='US')
hdfs导入:LOAD DATA INPATH '/user/data/pv_2008-06-08_us.txt' INTO TABLE page_view PARTITION(date='2008-06-08', country='US')
格式:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
Standard syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement Hive extension (multiple inserts):
FROM from_statementINSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1[INSERT OVERWRITE TABLE tablename2 [PARTITION ...] select_statement2] ...
Hive extension (dynamic partition inserts):
INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement
导出
INSERT OVERWRITE TABLE pv_gender_agg SELECT pv_users.gender, count(DISTINCT pv_users.userid), count(DISTINCT pv_users.ip) FROM pv_users GROUP BY pv_users.gender;
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/pv_gender_sum' SELECT pv_gender_sum.* FROM pv_gender_sum;
导出至文件
Standard syntax:
INSERT OVERWRITE [LOCAL] DIRECTORY directory1 SELECT ... FROM ...
Hive extension (multiple inserts):
FROM from_statementINSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1[INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...
修改表
ALTER TABLE old_table_name RENAME TO new_table_name;
ALTER TABLE old_table_name REPLACE COLUMNS (col1 TYPE, ...);
ALTER TABLE tab1 ADD COLUMNS (c1 INT COMMENT 'a new int column', c2 STRING DEFAULT 'def val');
删除表
DROP TABLE pv_users;DROP TABLE [IF EXISTS] table_name
ALTER TABLE pv_users DROP PARTITION (ds='2008-08-08')
创建表时如果使用了EXTERNAL,drop table时不会从文件系统中把文件删除
Hive SHELL命令
Command
Description
quit
Use quit or exit to come out of interactive shell.
set <key>=<value>
Use this to set value of particular configuration variable. One thing to note here is that if you misspell the variable name, cli will not show an error.
set
This will print list of configuration variables that overridden by user or hive.
set -v
This will give all possible hadoop/hive configuration variables.
add FILE <value> <value>*
Adds a file to the list of resources.
list FILE
list all the resources already added
list FILE <value>*
Check given resources are already added or not.
! <cmd>
execute a shell command from hive shell
dfs <dfs command>
execute dfs command command from hive shell
<query string>
executes hive query and prints results to stdout