POSTGRESQL ERROR

Reposted · Latest recommended article published 2023-12-14 10:58:40 · 2.7k views

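This article explains how PostgreSQL's TOAST storage mechanism works: how oversized fields are handled, how to choose among the storage strategies, and how that choice affects performance. It also shows how to diagnose and fix common TOAST-related problems.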

missing chunk number x for toast value x in pg_toast_x

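Background

TOAST stands for The Oversized-Attribute Storage Technique; it is how PostgreSQL stores very large field values.
PostgreSQL uses a fixed page size (8 kB by default) and does not allow a tuple (row) to span multiple pages, so a very large field value cannot be stored directly.
TOAST therefore compresses large field values and/or splits them into multiple physical rows.
Only some data types support TOAST: types that never produce large values (date, time, boolean and so on) have no need for it.
A data type that supports TOAST must be variable-length.
If any column of a table is TOAST-able, the table has an associated TOAST table,
whose OID is stored in pg_class.reltoastrelid.
Out-of-line values are split into chunks of at most TOAST_MAX_CHUNK_SIZE bytes each (about 2 kB by default).
TOAST is triggered when a row to be stored is wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB); it then compresses field values and/or moves them out of line until the row is shorter than TOAST_TUPLE_TARGET bytes (also normally 2 kB).
Unlike an ordinary (main) table, a TOAST table always has the same three columns: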

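chunk_id: an OID identifying the particular TOASTed value
chunk_seq: the sequence number of the chunk within its value; a unique index on (chunk_id, chunk_seq) speeds up retrieval
chunk_data: the actual data of the chunk
TOAST has four storage strategies: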

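PLAIN: prevents both compression and out-of-line storage. It is the only strategy available to data types that do not support TOAST (for example int), and it is unsuitable for types such as text whose values may grow beyond the page size.

EXTENDED: allows both compression and out-of-line storage. Compression is attempted first; if the value is still too large, it is stored out of line.

EXTERNAL: allows out-of-line storage but not compression. For columns such as strings that are often accessed piecewise (substring operations), this strategy can perform better, because the whole value does not have to be read and decompressed first.

MAIN: allows compression but not out-of-line storage. In practice, out-of-line storage is still used as a last resort when nothing else (such as compression) makes the row small enough to store. "Avoid out-of-line storage whenever possible" is therefore the more accurate description.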
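
The four strategies can be summarized in a small lookup table. The following Python sketch is only a simplified model of the descriptions above, not PostgreSQL code; the names STRATEGIES and toast_actions are made up for illustration.

```python
# Simplified model of the four TOAST storage strategies described above.
# Each strategy maps to (compression allowed, out-of-line storage allowed).
# These names are illustrative only; they are not part of PostgreSQL.
STRATEGIES = {
    "PLAIN":    (False, False),
    "EXTENDED": (True,  True),
    "EXTERNAL": (False, True),
    "MAIN":     (True,  True),   # out-of-line storage only as a last resort
}

def toast_actions(strategy):
    """Return, in order, the actions TOAST may try for a column."""
    name = strategy.upper()
    compress, out_of_line = STRATEGIES[name]
    actions = []
    if compress:
        actions.append("compress")
    if out_of_line:
        suffix = " (last resort)" if name == "MAIN" else ""
        actions.append("move out of line" + suffix)
    return actions

for name in STRATEGIES:
    print(name, "->", toast_actions(name) or ["store inline as-is"])
```

Reading the table row by row gives the same picture as the prose: only EXTENDED and MAIN ever compress, and only PLAIN never leaves the main table.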
Example
Create a table:

postgres=# create table blog(id int, title text, content text);
CREATE TABLE
postgres=# \d+ blog;
                          Table "public.blog"
 Column  |  Type   | Modifiers | Storage  | Stats target | Description 
---------+---------+-----------+----------+--------------+-------------
 id      | integer |           | plain    |              | 
 title   | text    |           | extended |              | 
 content | text    |           | extended |              |

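As shown, the default TOAST strategy is plain for integer and extended for text. The PostgreSQL documentation says that when a table has a column that needs TOAST, the system automatically creates a TOAST table for out-of-line storage. So where is that table?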

postgres=# select relname,relfilenode,reltoastrelid from pg_class where relname='blog';
 relname | relfilenode | reltoastrelid 
---------+-------------+---------------
 blog    |       16441 |         16444
(1 row)

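From this query we learn that the OID of the blog table is 16441 (the query returns relfilenode, which equals the OID for a newly created table that has not been rewritten) and that the OID of its TOAST table is 16444 (see the official PostgreSQL documentation for OIDs and pg_class). The TOAST table is therefore named pg_toast.pg_toast_16441 (note that the suffix is the OID of blog, not of the TOAST table). Here is its definition: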

postgres=# \d+ pg_toast.pg_toast_16441;
TOAST table "pg_toast.pg_toast_16441"
   Column   |  Type   | Storage 
------------+---------+---------
 chunk_id   | oid     | plain
 chunk_seq  | integer | plain
 chunk_data | bytea   | plain

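The TOAST table has three columns:

chunk_id: the OID of one particular TOASTed value. All rows with the same chunk_id together make up one TOASTed field value of one row of the original table (here, blog).
chunk_seq: the position of this chunk within the whole value.
chunk_data: the data actually stored.

Now let's verify this in practice: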

postgres=# insert into blog values(1, 'title', '0123456789');
INSERT 0 1
postgres=# select * from blog;
 id | title |  content   
----+-------+------------
  1 | title | 0123456789
(1 row)

postgres=# select * from pg_toast.pg_toast_16441;
 chunk_id | chunk_seq | chunk_data 
----------+-----------+------------
(0 rows)

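Because content is only 10 characters long, it was neither compressed nor stored out of line. Next we double the length of content repeatedly with the SQL below, checking its length each time, to see what happens.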

postgres=# update blog set content=content||content where id=1;
UPDATE 1
postgres=# select id,title,length(content) from blog;
 id | title | length 
----+-------+--------
  1 | title |     20
(1 row)

postgres=# select * from pg_toast.pg_toast_16441;
 chunk_id | chunk_seq | chunk_data 
----------+-----------+------------
(0 rows)

Repeat the steps above until the pg_toast_16441 table contains data:

postgres=# select id,title,length(content) from blog;
 id | title | length 
----+-------+--------
  1 | title | 327680
(1 row)

postgres=# select chunk_id,chunk_seq,length(chunk_data) from pg_toast.pg_toast_16441;
 chunk_id | chunk_seq | length 
----------+-----------+--------
    16439 |         0 |   1996
    16439 |         1 |   1773
(2 rows)

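Only when content reaches 327680 characters (far more than the 8 kB page size) do two rows appear in the TOAST table, neither longer than 2 kB and together only 3769 bytes. This is because the extended strategy compresses the value first and only then moves it out of line.

Next we change the TOAST strategy of content to EXTERNAL, which disables compression.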

postgres=# alter table blog alter content set storage external;
ALTER TABLE

postgres=# \d+ blog;
                          Table "public.blog"
 Column  |  Type   | Modifiers | Storage  | Stats target | Description 
---------+---------+-----------+----------+--------------+-------------
 id      | integer |           | plain    |              | 
 title   | text    |           | extended |              | 
 content | text    |           | external |              |

Then we insert another row:

postgres=# insert into blog values(2, 'title', '0123456789');
INSERT 0 1
postgres=# select id,title,length(content) from blog;
 id | title | length 
----+-------+--------
  1 | title | 327680
  2 | title |     10
(2 rows)

Then repeat the steps above until new rows appear in the TOAST table:

postgres=# update blog set content=content||content where id=2;
UPDATE 1
postgres=# select id,title,length(content) from blog;
 id | title | length 
----+-------+--------
  2 | title |   2560
  1 | title | 327680
(2 rows)

postgres=# select chunk_id,chunk_seq,length(chunk_data) from pg_toast.pg_toast_16441;
 chunk_id | chunk_seq | length 
----------+-----------+--------
    16447 |         0 |   1996
    16447 |         1 |   1773
    16448 |         0 |   1996
    16448 |         1 |    564
(4 rows)

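This time, as soon as content reaches 2560 characters (according to the official documentation, the threshold is about 2 kB), two new rows with chunk_id 16448 appear in the TOAST table, and their chunk_data lengths add up to exactly 2560 (1996 + 564).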
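
The chunk lengths in both experiments can be checked with a little arithmetic. The Python sketch below assumes a chunk size of 1996 bytes, which is simply the largest chunk seen in the output above, and assumes one byte per character (the test data is ASCII digits). The helper name chunk_lengths is made up for illustration.

```python
# Assumed chunk size: 1996 bytes, the largest chunk seen in the output above.
# Builds with a different page size would use a different value.
TOAST_MAX_CHUNK_SIZE = 1996

def chunk_lengths(stored_size, chunk_size=TOAST_MAX_CHUNK_SIZE):
    """Lengths of the chunk_data pieces for a value of stored_size bytes."""
    full, rest = divmod(stored_size, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])

# Row 2 (EXTERNAL, no compression): the 2560 bytes are stored as they are.
print(chunk_lengths(2560))                    # [1996, 564]

# Row 1 (EXTENDED): the chunks hold the compressed value.
compressed = 1996 + 1773
print(compressed, chunk_lengths(compressed))  # 3769 [1996, 1773]
print(round(327680 / compressed, 1))          # 86.9
```

So the highly repetitive 327680-character value shrank by a factor of roughly 87 before it was moved out of line, while the uncompressed 2560-character value needed a second chunk right away.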

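Advantages and disadvantages of TOAST
Advantages of TOAST:
1. Very long field values can be stored, which could not be stored directly otherwise.
2. TOAST data is physically separate from the main table, so queries that do not touch the large field run much faster.
3. When a row of the main table is updated and its TOASTed values are unchanged, the TOAST table does not need to be updated.
Disadvantages of TOAST:
1. Creating an index on a large field is problematic and may fail. Indexing large fields is generally not recommended anyway; full-text search is one alternative.
2. Updating large fields is somewhat slow. Other databases have the same problem.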

Cause

The data in the TOAST table associated with some table is corrupted.
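
To see what the message means, here is a simplified Python model of reading a TOASTed value back. It is only a sketch and not PostgreSQL's actual implementation: detoast, ToastError and the sample data are made up for illustration, reusing the chunk layout of the 2560-byte value from the example above.

```python
class ToastError(Exception):
    """Raised when a chunk needed to rebuild a value cannot be found."""

def detoast(rows, value_id, n_chunks, toast_table="pg_toast_16441"):
    """Rebuild one value from (chunk_id, chunk_seq, chunk_data) rows."""
    by_seq = {seq: data for cid, seq, data in rows if cid == value_id}
    parts = []
    for seq in range(n_chunks):
        if seq not in by_seq:
            raise ToastError(
                f"missing chunk number {seq} for toast value {value_id} "
                f"in {toast_table}")
        parts.append(by_seq[seq])
    return b"".join(parts)

# A healthy TOAST table: the 2560-byte value split into 1996 + 564 bytes.
value = b"0123456789" * 256
rows = [(16448, 0, value[:1996]), (16448, 1, value[1996:])]
assert detoast(rows, 16448, 2) == value

# A damaged TOAST table: the row holding chunk 1 is gone.
damaged = rows[:1]
try:
    detoast(damaged, 16448, 2)
except ToastError as err:
    print(err)
```

In this model the main table still points at value 16448 and expects two chunks, but one of them can no longer be found, which is the situation the error reports.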

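Solution

1. Identify which table's TOAST table is affected. The number in pg_toast_x is the OID of the owning table, so cast it to regclass (in this case the error named pg_toast_2619):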

select 2619::regclass; 
pg_statistic

2. Once the affected table is known, first try a simple repair on it:

REINDEX table pg_toast.pg_toast_2619;
REINDEX table pg_statistic;
VACUUM ANALYZE pg_statistic;

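3. Locate the damaged rows in the table. The block below reads every row by its ctid and reports each ctid whose row cannot be read:

DO $$
DECLARE
    v_ctid tid;
    v_row  text;
BEGIN
    -- Visit every row by ctid. Casting the whole row to text forces its
    -- TOASTed values to be read, which fails on a damaged row.
    FOR v_ctid IN SELECT ctid FROM pg_statistic LOOP
        BEGIN
            SELECT s::text INTO v_row FROM pg_statistic s WHERE s.ctid = v_ctid;
        EXCEPTION WHEN OTHERS THEN
            -- Report the damaged row and keep scanning.
            RAISE NOTICE 'damaged row at ctid %: %', v_ctid, SQLERRM;
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;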

4. Delete the row located in step 3:

delete from pg_statistic where ctid ='(50,3)';

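5. Repeat steps 3 and 4 until all damaged rows have been removed.
6. The TOAST problem is now resolved. Afterwards, run a full maintenance pass on the database or rebuild its indexes.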
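
Step 1 can be sketched as a small Python helper that pulls the numbers out of the error text and builds the regclass lookup. The sample message and the name parse_toast_error are made up for illustration; only the message layout is taken from the error shown in this section.

```python
import re

# Layout taken from the error text:
#   missing chunk number x for toast value x in pg_toast_x
PATTERN = re.compile(
    r"missing chunk number (\d+) for toast value (\d+) in pg_toast_(\d+)")

def parse_toast_error(message):
    """Extract chunk number, toast value and owning table OID, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    chunk_no, value_id, owner_oid = map(int, match.groups())
    return {
        "chunk": chunk_no,
        "toast_value": value_id,
        "owner_oid": owner_oid,
        # The suffix of pg_toast_x is the OID of the owning table.
        "lookup_sql": f"select {owner_oid}::regclass;",
    }

# Hypothetical sample message; the toast value number is invented.
sample = "ERROR:  missing chunk number 0 for toast value 16448 in pg_toast_2619"
info = parse_toast_error(sample)
print(info["lookup_sql"])   # select 2619::regclass;
```

Running the printed statement in psql gives the name of the table to repair in step 2.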

invalid page header in block x of relation base/x/x

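Cause

The system detected a damaged disk page, which makes PostgreSQL abort the current transaction and report an error.

Solution

1. Shut down the database server.
2. Edit postgresql.conf and add this as the last line: zero_damaged_pages = on. Save the file and exit.
3. Start the database server and confirm that the database service is running again.
4. Shut down the database server.
5. Edit postgresql.conf and remove the last line, zero_damaged_pages = on. Save the file and exit.
6. Restart the database.

Note that zero_damaged_pages makes PostgreSQL zero out each damaged page in memory, so every row on such a page is lost. The zeroed pages are not forced to disk, so the PostgreSQL documentation recommends recreating the affected table or index before turning the setting off again.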

missing chunk number x for toast value x in pg_toast_x

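Cause

Duplicate primary key values cause the index rebuild to fail.

Solution

1. Locate the problem records from the error message.
2. Copy the problem records out and decide which row is in the correct state.
3. Copy the correct record back into the table.
4. Rebuild the indexes of the database.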