How to handle large volumes of data in PostgreSQL?

This article shares an experience of handling nearly 5 billion records in a PostgreSQL database, covering data loading, index building, data migration, and deletion. It discusses the problems encountered when processing data at this scale on a high-spec server, and the solutions that were suggested, including table partitioning, using RAID0 or RAID10, and tuning PostgreSQL parameters.

mailing list: pgsql-admin.postgresql.org

from: Johann Spies

..loaded about 4,900,000,000 records into one of two tables (with 7,200,684 in the second table) in a database named 'firewall', built one index on a date field (which took a few days), used that index to copy about 3,800,000,000 of those records from the first table to a third table, deleted the copied records from the first table, and dropped the third table.
This took about a week on a 2-CPU quad-core server with 8 GB RAM..
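The workflow described above can be sketched in SQL. The table and column names here are hypothetical, as the original post does not show its schema, and the cutoff date is purely illustrative:

```sql
-- Build the index on the date field first (this was the multi-day step).
CREATE INDEX idx_firewall_log_date ON firewall_log (log_date);

-- Use the index to copy the old rows into a third table...
CREATE TABLE firewall_log_old AS
    SELECT * FROM firewall_log
    WHERE log_date < DATE '2008-01-01';  -- hypothetical cutoff

-- ...then delete the copied rows from the source and drop the scratch table.
DELETE FROM firewall_log WHERE log_date < DATE '2008-01-01';
DROP TABLE firewall_log_old;
```

Note that a large `DELETE` like this leaves dead tuples behind; a `VACUUM` (or, at this scale, partitioning, as suggested below in the thread) avoids paying that cost at all.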

Table partitioning is needed.
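A minimal sketch of partitioning such a log table by date range, using declarative partitioning (available since PostgreSQL 10; at the time of the original thread the same effect required inheritance plus CHECK constraints). All names and columns are hypothetical:

```sql
CREATE TABLE firewall_log (
    log_date  date NOT NULL,
    src_ip    inet,
    message   text
) PARTITION BY RANGE (log_date);

CREATE TABLE firewall_log_2008_q1 PARTITION OF firewall_log
    FOR VALUES FROM ('2008-01-01') TO ('2008-04-01');

CREATE TABLE firewall_log_2008_q2 PARTITION OF firewall_log
    FOR VALUES FROM ('2008-04-01') TO ('2008-07-01');
```

The payoff for the scenario in the post: removing billions of old rows becomes `DROP TABLE firewall_log_2008_q1;`, a near-instant metadata operation instead of a week-long copy-and-delete.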

Distribute tables across different disks through tablespaces. Tweak the shared_buffers and work_mem settings.
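For example (the path and the values below are illustrative, not recommendations from the thread):

```sql
-- Put indexes on a separate physical disk via a tablespace.
CREATE TABLESPACE fastdisk LOCATION '/mnt/disk2/pgdata';  -- hypothetical path
CREATE INDEX idx_log_date ON firewall_log (log_date) TABLESPACE fastdisk;

-- In postgresql.conf (shared_buffers requires a restart):
--   shared_buffers = 2GB    -- commonly sized around 25% of RAM
--   work_mem = 64MB         -- per sort/hash node, per backend: be conservative
```

`work_mem` is allocated per sort or hash operation, so a single complex query on a busy server can consume many multiples of it; on an 8 GB machine like the one in the post, a cautious value matters.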

RAID5/6 are very, very slow when it comes to small disk *writes*.

At least a hardware RAID controller with RAID 0 or 10 should be used, with 10krpm or 15krpm drives. SAS preferred.

As for SATA, the only quick disks are the Western Digital Raptors.

Look at the pg_stat_activity view to see what the server is doing. Do: select * from pg_stat_activity;
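For example, to spot long-running statements during a bulk load (this uses the `state` and `query` columns of modern releases; PostgreSQL versions before 9.2 expose `current_query` instead):

```sql
SELECT pid,
       state,
       now() - query_start AS runtime,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC;
```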
