How to handle large volumes of data in PostgreSQL?

This article shares an experience of handling nearly 5 billion records in a PostgreSQL database, covering data loading, index building, data migration, and deletion. It discusses the problems encountered when processing data at this scale on a high-spec server, and the solutions that were suggested, including table partitioning, using RAID0 or RAID10, and tuning PostgreSQL parameters.

mailing list: pgsql-admin.postgresql.org

from: Johann Spies

..loaded about 4,900,000,000 records into one of two tables (with 7,200,684 in the second table) in a database named 'firewall', built one index on a date field (which took a few days), used that index to copy about 3,800,000,000 of those records from the first table to a third table, deleted the copied records from the first table, and dropped the third table.
This took about a week on a 2-CPU quad-core server with 8 GB RAM..
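The workflow described above can be sketched in SQL. The table and column names here are hypothetical, as the original post does not show its schema, and the cutoff date is purely illustrative:

```sql
-- Build the index on the date field first (this was the multi-day step).
CREATE INDEX idx_firewall_log_date ON firewall_log (log_date);

-- Use the index to copy the old rows into a third table...
CREATE TABLE firewall_log_old AS
    SELECT * FROM firewall_log
    WHERE log_date < DATE '2008-01-01';  -- hypothetical cutoff

-- ...then delete the copied rows from the source and drop the scratch table.
DELETE FROM firewall_log WHERE log_date < DATE '2008-01-01';
DROP TABLE firewall_log_old;
```

Note that a large `DELETE` like this leaves dead tuples behind; a `VACUUM` (or, at this scale, partitioning, as suggested below in the thread) avoids paying that cost at all.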

Table partitioning is needed.
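A minimal sketch of partitioning such a log table by date range, using declarative partitioning (available since PostgreSQL 10; at the time of the original thread the same effect required inheritance plus CHECK constraints). All names and columns are hypothetical:

```sql
CREATE TABLE firewall_log (
    log_date  date NOT NULL,
    src_ip    inet,
    message   text
) PARTITION BY RANGE (log_date);

CREATE TABLE firewall_log_2008_q1 PARTITION OF firewall_log
    FOR VALUES FROM ('2008-01-01') TO ('2008-04-01');

CREATE TABLE firewall_log_2008_q2 PARTITION OF firewall_log
    FOR VALUES FROM ('2008-04-01') TO ('2008-07-01');
```

The payoff for the scenario in the post: removing billions of old rows becomes `DROP TABLE firewall_log_2008_q1;`, a near-instant metadata operation instead of a week-long copy-and-delete.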

Distribute tables across different disks through tablespaces. Tweak the shared_buffers and work_mem settings.
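For example (the path and the values below are illustrative, not recommendations from the thread):

```sql
-- Put indexes on a separate physical disk via a tablespace.
CREATE TABLESPACE fastdisk LOCATION '/mnt/disk2/pgdata';  -- hypothetical path
CREATE INDEX idx_log_date ON firewall_log (log_date) TABLESPACE fastdisk;

-- In postgresql.conf (shared_buffers requires a restart):
--   shared_buffers = 2GB    -- commonly sized around 25% of RAM
--   work_mem = 64MB         -- per sort/hash node, per backend: be conservative
```

`work_mem` is allocated per sort or hash operation, so a single complex query on a busy server can consume many multiples of it; on an 8 GB machine like the one in the post, a cautious value matters.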

RAID5/6 are very, very slow when it comes to small disk *writes*.

At least a hardware RAID controller with RAID 0 or 10 should be used, with 10krpm or 15krpm drives. SAS preferred.

As for SATA, the only quick disks are the Western Digital Raptors.

Look at the pg_stat_activity view to see what the server is doing. Do: select * from pg_stat_activity;
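For example, to spot long-running statements during a bulk load (this uses the `state` and `query` columns of modern releases; PostgreSQL versions before 9.2 expose `current_query` instead):

```sql
SELECT pid,
       state,
       now() - query_start AS runtime,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC;
```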
