HIVE关键字清单

这篇博客汇总了Hive的关键字,包括非保留和保留关键字,并指出在Hive 2.0.0之后,REGEXP和RLIKE变为保留关键字。同时,文章提供了在使用这些关键字时的解决方案,如使用引用标识符或设置hive.support.sql11.reserved.keywords=false。

因项目需要识别出Hive的关键字,特从Hive官网中扣下来的。如坚持使用关键字,文章最末有解决方案。

Keywords, Non-reserved Keywords and Reserved Keywords

Keywords

ADD,ADMIN,AFTER,ALL,ALTER,ANALYZE,AND,ARCHIVE,ARRAY,AS,ASC,AUTHORIZATION,BEFORE,BETWEEN,
BIGINT,BINARY,BOOLEAN,BOTH,BUCKET,BUCKETS,BY,CASCADE,CASE,CAST,CHANGE,CHAR,CLUSTER,
CLUSTERED,CLUSTERSTATUS,COLLECTION,COLUMN,COLUMNS,COMMENT,COMPACT,COMPACTIONS,COMPUTE,
CONCATENATE,CONF,CONTINUE,CREATE,CROSS,CUBE,CURRENT,CURRENT_DATE,CURRENT_TIMESTAMP,CURSOR,
DATA,DATABASE,DATABASES,DATE,DATETIME,DAY,DBPROPERTIES,DECIMAL,DEFERRED,DEFINED,DELETE,
DELIMITED,DEPENDENCY,DESC,DESCRIBE,DIRECTORIES,DIRECTORY,DISABLE,DISTINCT,DISTRIBUTE,
DOUBLE,DROP,ELEM_TYPE,ELSE,ENABLE,END,ESCAPED,EXCHANGE,EXCLUSIVE,EXISTS,EXPLAIN,EXPORT,
EXTENDED,EXTERNAL,FALSE,FETCH,FIELDS,FILE,FILEFORMAT,FIRST,FLOAT,FOLLOWING,FOR,FORMAT,
FORMATTED,FROM,FULL,FUNCTION,FUNCTIONS,GRANT,GROUP,GROUPING,HAVING,HOLD_DDLTIME,HOUR,
IDXPROPERTIES,IF,IGNORE,IMPORT,IN,INDEX,INDEXES,INNER,INPATH,INPUTDRIVER,INPUTFORMAT,
INSERT,INT,INTERSECT,INTERVAL,INTO,IS,ITEMS,JAR,JOIN,KEYS,KEY_TYPE,LATERAL,LEFT,LESS,
LIKE,LIMIT,LINES,LOAD,LOCAL,LOCATION,LOCK,LOCKS,LOGICAL,LONG,MACRO,MAP,MAPJOIN,
MATERIALIZED,MINUS,MINUTE,MONTH,MORE,MSCK,NONE,NOSCAN,NOT,NO_DROP,NULL,OF,OFFLINE,ON,
OPTION,OR,ORDER,OUT,OUTER,OUTPUTDRIVER,OUTPUTFORMAT,OVER,OVERWRITE,OWNER,PARTIALSCAN,
PARTITION,PARTITIONED,PARTITIONS,PERCENT,PLUS,PRECEDING,PRESERVE,PRETTY,PRINCIPALS,
PROCEDURE,PROTECTION,PURGE,RANGE,READ,READONLY,READS,REBUILD,RECORDREADER,RECORDWRITER,
REDUCE,REGEXP,RELOAD,RENAME,REPAIR,REPLACE,RESTRICT,REVOKE,REWRITE,RIGHT,RLIKE,ROLE,ROLES,
ROLLUP,ROW,ROWS,SCHEMA,SCHEMAS,SECOND,SELECT,SEMI,SERDE,SERDEPROPERTIES,SERVER,SET,SETS,
SHARED,SHOW,SHOW_DATABASE,SKEWED,SMALLINT,SORT,SORTED,SSL,STATISTICS,STORED,STREAMTABLE,
STRING,STRUCT,TABLE,TABLES,TABLESAMPLE,TBLPROPERTIES,TEMPORARY,TERMINATED,THEN,TIMESTAMP,
TINYINT,TO,TOUCH,TRANSACTIONS,TRANSFORM,TRIGGER,TRUE,TRUNCATE,UNARCHIVE,UNBOUNDED,UNDO,
UNION,UNIONTYPE,UNIQUEJOIN,UNLOCK,UNSET,UNSIGNED,UPDATE,URI,USE,USER,USING,UTC,
UTCTIMESTAMP,VALUES,VALUE_TYPE,VARCHAR,VIEW,WHEN,WHERE,WHILE,WINDOW,WITH,YEAR

Non-reserved Keywords

ADD,ADMIN,AFTER,ANALYZE,ARCHIVE,ASC,BEFORE,BUCKET,BUCKETS,CASCADE,CHANGE,CLUSTER,
CLUSTERED,CLUSTERSTATUS,COLLECTION,COLUMNS,COMMENT,COMPACT,COMPACTIONS,COMPUTE,
CONCATENATE,CONTINUE,DATA,DATABASES,DATETIME,DAY,DBPROPERTIES,DEFERRED,DEFINED,
DELIMITED,DEPENDENCY,DESC,DIRECTORIES,DIRECTORY,DISABLE,DISTRIBUTE,ELEM_TYPE,ENABLE,
ESCAPED,EXCLUSIVE,EXPLAIN,EXPORT,FIELDS,FILE,FILEFORMAT,FIRST,FORMAT,FORMATTED,
FUNCTIONS,HOLD_DDLTIME,HOUR,IDXPROPERTIES,IGNORE,INDEX,INDEXES,INPATH,INPUTDRIVER,
INPUTFORMAT,ITEMS,JAR,KEYS,KEY_TYPE,LIMIT,LINES,LOAD,LOCATION,LOCK,LOCKS,LOGICAL,LONG,
MAPJOIN,MATERIALIZED,MINUS,MINUTE,MONTH,MSCK,NOSCAN,NO_DROP,OFFLINE,OPTION,OUTPUTDRIVER,
OUTPUTFORMAT,OVERWRITE,OWNER,PARTITIONED,PARTITIONS,PLUS,PRETTY,PRINCIPALS,PROTECTION,
PURGE,READ,READONLY,REBUILD,RECORDREADER,RECORDWRITER,REGEXP (Hive 0.x.x and 1.x.x),
RELOAD,RENAME,REPAIR,REPLACE,RESTRICT,REWRITE,RLIKE (Hive 0.x.x and 1.x.x),ROLE,ROLES,
SCHEMA,SCHEMAS,SECOND,SEMI,SERDE,SERDEPROPERTIES,SERVER,SETS,SHARED,SHOW,SHOW_DATABASE,
SKEWED,SORT,SORTED,SSL,STATISTICS,STORED,STREAMTABLE,STRING,STRUCT,TABLES,TBLPROPERTIES,
TEMPORARY,TERMINATED,TINYINT,TOUCH,TRANSACTIONS,UNARCHIVE,UNDO,UNIONTYPE,UNLOCK,UNSET,
UNSIGNED,URI,USE,UTC,UTCTIMESTAMP,VALUE_TYPE,VIEW,WHILE,YEAR

Version information

REGEXP and RLIKE are non-reserved keywords prior to Hive 2.0.0 and reserved keywords starting in Hive 2.0.0 (HIVE-11703).

Reserved Keywords

ALL,ALTER,AND,ARRAY,AS,AUTHORIZATION,BETWEEN,BIGINT,BINARY,BOOLEAN,BOTH,BY,CASE,CAST,
CHAR,COLUMN,CONF,CREATE,CROSS,CUBE,CURRENT,CURRENT_DATE,CURRENT_TIMESTAMP,CURSOR,
DATABASE,DATE,DECIMAL,DELETE,DESCRIBE,DISTINCT,DOUBLE,DROP,ELSE,END,EXCHANGE,EXISTS,
EXTENDED,EXTERNAL,FALSE,FETCH,FLOAT,FOLLOWING,FOR,FROM,FULL,FUNCTION,GRANT,GROUP,
GROUPING,HAVING,IF,IMPORT,IN,INNER,INSERT,INT,INTERSECT,INTERVAL,INTO,IS,JOIN,LATERAL,
LEFT,LESS,LIKE,LOCAL,MACRO,MAP,MORE,NONE,NOT,NULL,OF,ON,OR,ORDER,OUT,OUTER,OVER,
PARTIALSCAN,PARTITION,PERCENT,PRECEDING,PRESERVE,PROCEDURE,RANGE,READS,REDUCE,
REGEXP (Hive 2.0.0 onward),REVOKE,RIGHT,RLIKE (Hive 2.0.0 onward),ROLLUP,ROW,ROWS,
SELECT,SET,SMALLINT,TABLE,TABLESAMPLE,THEN,TIMESTAMP,TO,TRANSFORM,TRIGGER,TRUE,
TRUNCATE,UNBOUNDED,UNION,UNIQUEJOIN,UPDATE,USER,USING,VALUES,VARCHAR,WHEN,WHERE,
WINDOW,WITH

Reserved keywords are permitted as identifiers if you quote them as described in Supporting Quoted Identifiers in Column Names (version 0.13.0 and later, see HIVE-6013). Most of the keywords are reserved through HIVE-6617 in order to reduce the ambiguity in grammar (version 1.2.0 and later). There are two ways if the user still would like to use those reserved keywords as identifiers: (1) use quoted identifiers, (2) set hive.support.sql11.reserved.keywords=false.

### Hive正则表达式字符串校验方法 在Hive中,可以使用`RLIKE`或`REGEXP`关键字来实现基于正则表达式的字符串校验功能。这些操作符能够帮助用户判断某个字段是否符合特定的正则表达式模式,并返回布尔值(TRUE/FALSE),从而方便地完成数据筛选和验证。 #### 1. `RLIKE`/`REGEXP` 的基本语法 以下是`RLIKE`的操作符语法: ```sql SELECT * FROM table_name WHERE column_name RLIKE 'pattern'; ``` 或者也可以写作: ```sql SELECT * FROM table_name WHERE column_name REGEXP 'pattern'; ``` 其中: - **table_name**: 数据表名称。 - **column_name**: 需要进行正则匹配的列名。 - **pattern**: 表达式模式,遵循Java标准的正则表达式规则[^3]。 例如,假设有一个名为`users`的数据表,其结构如下: | id | name | |----|----------| | 1 | Alice | | 2 | Bob | | 3 | Charlie_ | 如果希望查询名字以字母"A"开头的所有记录,则可执行以下SQL语句: ```sql SELECT * FROM users WHERE name RLIKE '^A.*'; ``` 此查询会返回所有满足条件的结果,即只包含一行:"Alice"[^3]。 #### 2. 使用`regexp()`函数 除了通过`RLIKE`或`REGEXP`作为WHERE子句的一部分外,还可以调用内置UDF——`regexp()`来进行更灵活的处理。该函数接受两个参数并返回布尔类型的判定结果。 具体形式为: ```sql regexp(string source, string pattern) ``` 实例演示: 假设有另一张表格叫做`logs`,存储日志消息及其时间戳信息。现在想找出那些含有错误标志的日志条目,可以用下面这段脚本做到这一点: ```sql SELECT timestamp, message FROM logs WHERE regexp(message, 'ERROR'); ``` 这将会提取出任何包含单词“ERROR”的行项[^1]。 #### 3. 替换不符合预期的内容 - `regexp_replace()` 当不仅仅需要检测是否存在某种模式,而且还需要修正原始输入时,那么就可以考虑采用`regexp_replace()`这个工具了。它可以按照指定的方式把符合条件的部分替换成新的片段。 定义方式如下所示: ```sql regexp_replace(string A, string B, string C) ``` 实际应用案例展示: 设想存在一张产品清单列表`products`,里面有一部分商品编号前缀被误写成了'prod_'而非正确的'product_'。为了统一格式化所有的项目编码,应该这样编写修复命令: ```sql UPDATE products SET product_id = regexp_replace(product_id,'^prod_', 'product_') ; ``` 当然,在某些版本里可能不支持直接update操作,所以通常我们会创建视图或者是重新导入修改后的文件到新位置去保存更改效果[^2]。 ### 总结 综上所述,无论是简单的真伪测试还是复杂的文本转换需求,Hive都提供了丰富的选项供开发者选择适合自己的解决方案路径。利用好这几个围绕着正则表达式的强大特性,可以让数据分析变得更加精准有效率!
评论 1
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值