Greenplum merge insert 用法与性能 (insert on conflict)

标签

PostgreSQL , Greenplum , merge insert , insert on conflict , 合并插入 , 有则更新 , 无则插入


背景

PostgreSQL insert on conflict语法非常强大,支持合并写入(当违反某唯一约束时,冲突则更新,不冲突则写入),同时支持流式计算。

流计算例子链接:

《PostgreSQL 流式统计 - insert on conflict 实现 流式 UV(distinct), min, max, avg, sum, count ...》

《HTAP数据库 PostgreSQL 场景与性能测试之 22 - (OLTP) merge insert|upsert|insert on conflict|合并写入》

《PostgreSQL upsert功能(insert on conflict do)的用法》

《PostgreSQL 10.0 preview 功能增强 - 支持分区表ON CONFLICT .. DO NOTHING》

PostgreSQL insert on conflict语法如下:

Command:     INSERT    
Description: create new rows in a table    
Syntax:    
[ WITH [ RECURSIVE ] with_query [, ...] ]    
INSERT INTO table_name [ AS alias ] [ ( column_name [, ...] ) ]    
    [ OVERRIDING { SYSTEM | USER} VALUE ]    
    { DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }    
    [ ON CONFLICT [ conflict_target ] conflict_action ]    
    [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]    
    
where conflict_target can be one of:    
    
    ( { index_column_name | ( index_expression ) } [ COLLATE collation ] [ opclass ] [, ...] ) [ WHERE index_predicate ]    
    ON CONSTRAINT constraint_name    
    
and conflict_action is one of:    
    
    DO NOTHING    
    DO UPDATE SET { column_name = { expression | DEFAULT } |    
                    ( column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT } [, ...] ) |    
                    ( column_name [, ...] ) = ( sub-SELECT )    
                  } [, ...]    
              [ WHERE condition ]    

Greenplum的版本较低,还不支持insert on conflict的语法。

如果需要在Greenplum中实现类似的功能该如何操作?

DEMO

ID为PK,以它为合并列,举例。

1、目标表,也就是需要合并写入的目标:

create table t( id int primary key, c1 int , c2 int, c3 int, c4 int, c5 int, crt_time timestamp);    

2、中间表,也就是用户只管插入的表:

create table t_tmp(like t);    

写入一些中间记录。

insert into t_tmp values(1,1,2,3,null,null,now());    
insert into t_tmp values(1,1,2,4,null,null,now());    
insert into t_tmp values(1,1,2,3,null,7,now());    
insert into t_tmp values(1,1,null,3,5,6,now());    
postgres=# select * from t_tmp;    
 id | c1 | c2 | c3 | c4 | c5 |          crt_time              
----+----+----+----+----+----+----------------------------    
  1 |  1 |  2 |  3 |    |    | 2017-12-13 17:03:16.28482    
  1 |  1 |  2 |  4 |    |    | 2017-12-13 17:03:16.286302    
  1 |  1 |  2 |  3 |    |  7 | 2017-12-13 17:03:16.635121    
  1 |  1 |    |  3 |  5 |  6 | 2017-12-13 17:03:25.434191    
(4 rows)    

3、窗口合并,按唯一值约束,仅提取一条(可能存在窗口内合并的需求,例如按时间取最新,比如以最后一条为准,又或者以有值,且最新的为准)。

我有几张阿里云幸运券分享给你,用券购买或者升级阿里云相应产品会有特惠惊喜哦!把想要买的产品的幸运券都领走吧!快下手,马上就要抢光了。

以有值切最新为准例子:

select distinct on (id)     
  id,    
  first_value(c1) over (partition by id order by (case when c1 is null then null else crt_time end) desc nulls last) as c1,    
  first_value(c2) over (partition by id order by (case when c2 is null then null else crt_time end) desc nulls last) as c2,    
  first_value(c3) over (partition by id order by (case when c3 is null then null else crt_time end) desc nulls last) as c3,    
  first_value(c4) over (partition by id order by (case when c4 is null then null else crt_time end) desc nulls last) as c4,    
  first_value(c5) over (partition by id order by (case when c5 is null then null else crt_time end) desc nulls last) as c5,    
  first_value(crt_time) over (partition by id order by crt_time desc) as crt_time    
  from t_tmp ;    
    
 id | c1 | c2 | c3 | c4 | c5 |          crt_time              
----+----+----+----+----+----+----------------------------    
  1 |  1 |  2 |  3 |  5 |  6 | 2017-12-13 17:03:25.434191    
(1 row)    

存储中间结果:

create table t_tmp1 (like t) ;    
    
insert into t_tmp1     
select distinct on (id)     
  id,    
  first_value(c1) over (partition by id order by (case when c1 is null then null else crt_time end) desc nulls last) as c1,    
  first_value(c2) over (partition by id order by (case when c2 is null then null else crt_time end) desc nulls last) as c2,    
  first_value(c3) over (partition by id order by (case when c3 is null then null else crt_time end) desc nulls last) as c3,    
  first_value(c4) over (partition by id order by (case when c4 is null then null else crt_time end) desc nulls last) as c4,    
  first_value(c5) over (partition by id order by (case when c5 is null then null else crt_time end) desc nulls last) as c5,    
  first_value(crt_time) over (partition by id order by crt_time desc) as crt_time    
  from t_tmp ;    

4、合并写入:

将窗口提取的结果,合并写入目标表。

4.1、INNER JOIN,覆盖旧记录,同时补齐旧的字段(以NULL为判断条件。如果新的记录没有值,则取旧记录的值。)提取。

create table t_tmp2 (like t);    
    
insert into t_tmp2    
select     
t_tmp.id,     
coalesce(t_tmp.c1, t.c1),     
coalesce(t_tmp.c2, t.c2),     
coalesce(t_tmp.c3, t.c3),     
coalesce(t_tmp.c4, t.c4),     
coalesce(t_tmp.c5, t.c5),     
coalesce(t_tmp.crt_time, t.crt_time)     
from    
t_tmp1 as t_tmp    
inner join     
t    
using (id);    

4.2、DELETE USING,删除全量表的符合条件的记录。

delete from t using t_tmp2 where t.id=t_tmp2.id;    

4.3、INSERT

insert into t    
select t_tmp1.* from t_tmp1 left join t_tmp2 using (id) where t_tmp2.* is null    
union all    
select * from t_tmp2;    
原文链接
Greenplum数据库中没有直接支持MERGE INTO语句的功能。然而,你可以使用其他方法来实现类似的效果。一种常见的方法是使用INSERT INTO和UPDATE语句结合使用。 首先,你可以使用INSERT INTO语句将数据插入到目标表中。如果目标表中已经存在相同的记录,则会引发唯一键冲突错误。在这种情况下,你可以使用UPDATE语句执行更新操作。 以下是一个示例,演示如何在Greenplum数据库中模拟MERGE INTO语句的功能: ```sql -- 创建目标表 CREATE TABLE target_table ( id INT PRIMARY KEY, value VARCHAR(255) ); -- 创建临时表存储要插入的数据 CREATE TEMPORARY TABLE temp_table ( id INT, value VARCHAR(255) ); -- 插入数据到临时表 INSERT INTO temp_table (id, value) VALUES (1, 'Value 1'), (2, 'Value 2'), (3, 'Value 3'); -- 将临时表中的数据插入到目标表中 INSERT INTO target_table (id, value) SELECT id, value FROM temp_table ON CONFLICT (id) DO UPDATE SET value = excluded.value; -- 删除临时表 DROP TABLE temp_table; ``` 在上面的示例中,我们首先创建了目标表和临时表。然后,我们将要插入的数据插入到临时表中。最后,我们使用INSERT INTO和SELECT语句将临时表中的数据插入到目标表中。如果插入过程中发生唯一键冲突,我们使用ON CONFLICT子句执行更新操作。 请注意,这只是一种模拟MERGE INTO语句的方法,并不是Greenplum数据库的官方支持。你可以根据自己的需求和具体的数据操作场景进行调整和优化。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值