Transaction Operations in Hive

This article describes how to enable and use transactions in Hive. By modifying the configuration file and creating a bucketed table, a Hive table can support update, insert, and delete operations. The examples walk through creating a transactional table and performing an update.


Background

    Hive does not support transactions by default; the typical Hive workload is write-once, read-many. With transactions enabled, update, insert, and delete operations become available. Three requirements must be met:

  1. The table must be stored in the ORC file format
  2. Transaction support must be enabled in the configuration file
  3. The table must be bucketed

    Open the hive-site.xml file in the conf folder under the Hive installation directory. The properties to modify and their corresponding values are:

These configuration parameters must be set appropriately to turn on transaction support in Hive:

hive.support.concurrency – true
// this property does not need to be changed
hive.enforce.bucketing – true (Not required as of Hive 2.0)
hive.exec.dynamic.partition.mode – nonstrict
hive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on – true (for exactly one instance of the Thrift metastore service)
hive.compactor.worker.threads – a positive number on at least one instance of the Thrift metastore service
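As a sketch, most of these can also be set for the current session from the Hive CLI before running any DML. The two compactor properties are an exception: they take effect only on the metastore service, so they still belong in hive-site.xml.

```sql
set hive.support.concurrency=true;
set hive.enforce.bucketing=true;  -- not required as of Hive 2.0
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
```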
Create a transactional table:
create table tx
(id int,name String,age int) 
clustered by (id) into 3 buckets 
stored as orc TBLPROPERTIES ('transactional'='true');

Because Hive's LOAD DATA command only moves files into the table's HDFS directory and does not bucket the data, we instead insert data from table t2 into the transactional table tx. The data in t2 is as follows:

hive> select * from t2;
OK
1	tom1	14
2	tom2	16
3	tom3	18
4	tom4	19
5	jenny	22
1	tom1	14
2	tom2	16
3	tom3	18
4	tom4	19
5	jenny	22
Time taken: 2.535 seconds, Fetched: 10 row(s)

    Insert the data into tx:

hive> insert into tx select * from t2;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. 
Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20180622103250_b82eb770-deeb-4ddd-b501-c8c00e6578ed
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2018-06-22 10:32:55,488 Stage-1 map = 100%,  reduce = 0%
2018-06-22 10:32:56,499 Stage-1 map = 100%,  reduce = 67%
2018-06-22 10:32:57,524 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_local1300705513_0001
Loading data to table default.tx
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 408 HDFS Write: 4741 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 6.86 seconds

    Run an update statement, then query tx again. The matching rows have been updated:

update tx set name='tom3new' where id = 3;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. 
Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20180622103525_a8545f32-982d-4f53-8d3f-cd142806a48d
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2018-06-22 10:35:26,809 Stage-1 map = 100%,  reduce = 0%
2018-06-22 10:35:27,825 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_local433693055_0002
Loading data to table default.tx
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 30256 HDFS Write: 16836 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 2.933 seconds
hive> select * from tx;
OK
3	tom3new	18
3	tom3new	18
4	tom4	19
1	tom1	14
4	tom4	19
1	tom1	14
5	jenny	22
2	tom2	16
5	jenny	22
2	tom2	16
Time taken: 0.101 seconds, Fetched: 10 row(s)
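The same table also supports DELETE under the same transaction settings. A minimal sketch (not captured from the original session): each update or delete writes small delta files alongside the base ORC files, so a manual major compaction can be triggered to rewrite them.

```sql
-- delete rows from the transactional table
delete from tx where id = 3;

-- rewrite accumulated delta files into base files
alter table tx compact 'major';
```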


