作者:瀚高PG实验室 (Highgo PG Lab)- 波罗
数据分区的好处
1)分区后,单个分区表的索引和表都变小了,可以保持在内存里面,适合把热数据从大表拆分出来的场景。
2)对于大范围的查询,大表可以通过索引来避免全表扫描。但是如果分区了的话,可以使用分区的全表扫描。适合经常要做大范围扫描的场景,按照范围分区(分区后采用全表扫描),减少索引带来的随机BLOCK扫描。
3)大批量的数据导入或删除,对于大表来说,删除大量的数据使用DELETE的话会带来大量的VACUUM操作负担。而使用分区表的话可以直接DROP分区,或者脱离子表和父表的继承关系。
4)使用分区表,还有一个好处是,可以把不常用的分区放到便宜的存储上。
5)因为每个表只能放在一个表空间上,表空间和目录对应,表的大小受到表空间大小的限制,使用分区表则更加灵活。
数据库分区表举例
1)范围分区
根据字段存储的值的取值范围进行分区,例如日志表的时间字段,用户表的ID范围等等。
2)哈希分区
根据字段存储值HASH和分区数做比特运算得到一个唯一的分区ID。(对于版本的升级、软件的升级一定要确保哈希算法的一致)
或者取模也行
例如mod(hashtext(name),16),对16个分区的场景
3)list分区
与哈希分区类似,但是直接使用字段值作为分区条件。适合KEY值比较少并且比较均匀的场景。
例如按性别字段作为分区字段,那么就分成了2个区。
分区和表继承的概念
继承表自动继承父表的约束,非空约束。
但是不自动继承的是(uk,pk,fk,索引,存储参数等)。
示例:
postgres=# create table p1(id int primary key, info text unique, c1 int check(c1>0), c2 int not null, c3 int unique);
CREATE TABLE
postgres=# create table c1(like p1) inherits(p1);
NOTICE: merging column "id" with inherited definition
NOTICE: merging column "info" with inherited definition
NOTICE: merging column "c1" with inherited definition
NOTICE: merging column "c2" with inherited definition
NOTICE: merging column "c3" with inherited definition
CREATE TABLE
postgres=# \d+ p1
Table "public.p1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
------------+---------+-------------+-----------+----------+-------------+----------------+-------------
id | integer | | not null | | plain | |
info | text | | | | extended | |
c1 | integer | | | | plain | |
c2 | integer | | not null | | plain | |
c3 | integer | | | | plain | |
Indexes:
"p1_pkey" PRIMARY KEY, btree (id)
"p1_c3_key" UNIQUE CONSTRAINT, btree (c3)
"p1_info_key" UNIQUE CONSTRAINT, btree (info)
Check constraints:
"p1_c1_check" CHECK (c1 > 0)
Child tables: c1
子表自动继承了父表的非空约束,自定义约束。未自动继承PK和UK。
postgres=# \d+ c1
Table "public.c1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-----------+----------+-------------+-----------+----------+------------+----------------+-------------
id | integer | | not null | | plain | |
info | text | | | | extended | |
c1 | integer | | | | plain | |
c2 | integer | | not null | | plain | |
c3 | integer | | | | plain | |
Check constraints:
"p1_c1_check" CHECK (c1 > 0)
Inherits: p1
如果要继承UK,PK,索引,存储结构等,创建时加including all;
postgres=# drop table c1;
DROP TABLE
postgres=# create table c1(like p1 including all) inherits(p1);
NOTICE: merging column "id" with inherited definition
NOTICE: merging column "info" with inherited definition
NOTICE: merging column "c1" with inherited definition
NOTICE: merging column "c2" with inherited definition
NOTICE: merging column "c3" with inherited definition
NOTICE: merging constraint "p1_c1_check" with inherited definition
CREATE TABLE
postgres=# \d c1
Table "public.c1"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id | integer | | not null |
info | text | | |
c1 | integer | | |
c2 | integer | | not null |
c3 | integer | | |
Indexes:
"c1_pkey" PRIMARY KEY, btree (id)
"c1_c3_key" UNIQUE CONSTRAINT, btree (c3)
"c1_info_key" UNIQUE CONSTRAINT, btree (info)
Check constraints:
"p1_c1_check" CHECK (c1 > 0)
Inherits: p1
可以使用alter table解除和加入继承关系:
postgres=# alter table c1 no inherit p1;
ALTER TABLE
postgres=# alter table c1 drop constraint p1_c1_check;
ALTER TABLE
当主表和继承表的字段个数,顺序,名字,以及类型或者约束条件有任何不一致时,就无法自动添加继承。
这里指的是默认继承的约束,非空约束等。不包含不会默认继承的约束UK,PK,FK等。
postgres=# alter table c1 inherit p1;
ERROR: child table is missing constraint "p1_c1_check"
加上约束后就可以了:
postgres=# alter table c1 add constraint p1_c1_check check(c1>0);
ALTER TABLE
postgres=# alter table c1 inherit p1;
ALTER TABLE
一个表可以同时继承多个父表,一个父表可以被多个子表继承。
但是必须注意,一个表继承了多个主表的情况,共有字段上,所有的父表的约束包括not null的定义都必须继承过来(同样不包括pk,uk,fk等)。
查主表默认情况下是会连带查询所有的子表的,包括更深的子表(子表的子表)。
例如select * from p1;默认会查询所有子表和自身。
postgres=# explain select * from p1;
QUERY PLAN
-------------------------------------------------------------
Append (cost=0.00..20.70 rows=1071 width=48)
-> Seq Scan on p1 (cost=0.00..0.00 rows=1 width=48)
-> Seq Scan on c1 (cost=0.00..20.70 rows=1070 width=48)
(3 rows)
除非使用only或者修改sql_inheritance(默认为on,pg10已经删除此配置选项,从默认值更改此设置会导致引用父表的查询不包括子表。然而,SQL标准要求它们被包含在内)。
postgres=# explain select * from only p1;
QUERY PLAN
---------------------------------------------------
Seq Scan on p1 (cost=0.00..0.00 rows=1 width=48)
(1 row)
SELECT,UPDATE,DELETE,TRUNCATE,DROP命令在主表上操作默认都会影响到子表。
这些操作必须注意了,小心谨慎。
INSERT和COPY命令,这两条命令只是对当前表操作的,不会扩展到子表。
插入到指定分区的实现,使用触发器(由于insert、copy插入到父表,跟子表没有关系)
postgres=# create table p(id int, info text, crt_time timestamp);
CREATE TABLE
postgres=# create table c1(like p) inherits(p);
NOTICE: merging column "id" with inherited definition
NOTICE: merging column "info" with inherited definition
NOTICE: merging column "crt_time" with inherited definition
CREATE TABLE
postgres=# create table c2(like p) inherits(p);
NOTICE: merging column "id" with inherited definition
NOTICE: merging column "info" with inherited definition
NOTICE: merging column "crt_time" with inherited definition
CREATE TABLE
postgres=# create table c3(like p) inherits(p);
NOTICE: merging column "id" with inherited definition
NOTICE: merging column "info" with inherited definition
NOTICE: merging column "crt_time" with inherited definition
CREATE TABLE
postgres=# create table c4(like p) inherits(p);
NOTICE: merging column "id" with inherited definition
NOTICE: merging column "info" with inherited definition
NOTICE: merging column "crt_time" with inherited definition
CREATE TABLE
postgres=# alter table c1 add constraint ck check (crt_time>='2018-04-04' and crt_time<'2018-05-04’);
ALTER TABLE
postgres=# alter table c2 add constraint ck check (crt_time>='2018-05-04' and crt_time<'2018-06-04’);
ALTER TABLE
postgres=# alter table c3 add constraint ck check (crt_time>='2018-06-04' and crt_time<'2018-07-04’);
ALTER TABLE
postgres=# alter table c4 add constraint ck check (crt_time>='2018-07-04' and crt_time<'2018-08-04’);
ALTER TABLE
postgres=# create or replace function ins_tg() returns trigger as $$
declare
begin
if NEW.crt_time>='2018-04-04'and NEW.crt_time<'2018-05-04' then
insert into c1(id, info, crt_time) values(NEW.*);
elseif NEW.crt_time>='2018-05-04' and NEW.crt_time<'2018-06-04' then
insert into c2(id, info, crt_time) values(NEW.*);
elseif NEW.crt_time>='2018-06-04' and NEW.crt_time<'2018-07-04' then
insert into c3(id, info, crt_time) values(NEW.*);
elseif NEW.crt_time>='2018-07-04' and NEW.crt_time<'2018-08-04' then
insert into c4(id, info, crt_time) values(NEW.*);
else
raise exception 'crt_time overflow.';
end if;
return null;
end;
$$language plpgsql strict;
CREATE FUNCTION
postgres=# create trigger tg1 before insert on p for each row execute procedure ins_tg();
CREATE TRIGGER
postgres=# insert into p(id, info, crt_time) values(1, 'test', now());
INSERT 0 0
postgres=# select * from c1;
id | info | crt_time
----+------+----------
(0 rows)
postgres=# select * from c2;
id | info | crt_time
----+------+----------------------------
1 | test | 2018-05-17 10:48:44.586036
(1 row)
postgres=# select * from c3;
id | info | crt_time
----+------+----------
(0 rows)
postgres=# select * from c4;
id | info | crt_time
----+------+----------
(0 rows)
postgres=# select * from only p;
id | info | crt_time
----+------+----------
(0 rows)
postgres=# insert into p(id, info, crt_time) values(1, 'test', '2018-06-04');
INSERT 0 0
postgres=# select * from c2;
id | info | crt_time
----+------+----------------------------
1 | test | 2018-05-17 10:48:44.586036
(1 row)