开源数据库PostgreSQL攻克并行计算难题

本文介绍了PostgreSQL中并行计算功能的应用与优化技巧,通过实际案例展示了不同并行度下性能的变化情况,揭示了并行度并非越高越好,需要根据具体场景调整。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

经过多年的酝酿(从支持work process到支持动态fork共享内存,再到内核层面支持并行计算),PostgreSQL 的并行计算功能终于来了,为PG的scale up能力再次拔高一个台阶,标志着开源数据库已经攻克了并行计算的难题。


相信有很多小伙伴已经开始测试了,我也测试了一个场景是标签系统类应用的比特位运算,昨天测试发现性能相比非并行已经提升了7倍。

调整并行度,在32个核的虚拟机上测试,性能提升了约10多倍。
但是实际上并没有到32倍,不考虑内存和IO的瓶颈,是有优化空间。
注意不同的并行度,效果不一样,目前来看并不是最大并行度就能发挥最好的性能,还需要考虑锁竞争的问题。
把测试表的数据量加载到16亿,共90GB。

postgres=# \dt+
                    List of relations
 Schema |  Name  | Type  |  Owner   | Size  | Description 
--------+--------+-------+----------+-------+-------------
 public | t_bit2 | table | postgres | 90 GB | 
(1 row)

不使用并行的性能如下,耗时 141377.100 毫秒。

postgres=# alter table t_bit2 set (parallel_degree=0);
ALTER TABLE
Time: 0.335 ms
postgres=# select count(*) from t_bit2 ;
   count    
------------
 1600000000
(1 row)
Time: 141377.100 ms

使用17个并行,获得了最好的性能, 耗时9423.257 毫秒。

postgres=# alter table t_bit2 set (parallel_degree=17);
ALTER TABLE
Time: 0.287 ms
postgres=# select count(*) from t_bit2 ;
   count    
------------
 1600000000
(1 row)

Time: 9423.257 ms

并行度为17时,每秒处理的数据量已经达到9.55GB。
与非并行相比,性能达到了15倍,基本上是线性的。
但是可能由于NUMA的原因(并行度增加时, 读数据操作可能会引入较多的__mutex_lock_slowpath, _spin_lock),并行度再加上来性能并不能再线性提升,而是会往下走。
_
另一组测试数据,加入了BIT计算。
32个并行度时,可以获得最好的性能提升,同样也和NUMA有关,为什么并行度能更高呢,因为计算量更大了,扫描冲突可以分担掉。
同样性能比达到了30.9倍,也基本上是线性的。
_

postgres=# alter table t_bit2 set (parallel_degree=32);
ALTER TABLE
Time: 0.341 ms
postgres=# select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
   count    
------------
 1600000000
(1 row)

Time: 15836.064 ms
postgres=# alter table t_bit2 set (parallel_degree=0);
ALTER TABLE
Time: 0.368 ms
postgres=# select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
   count    
------------
 1600000000
(1 row)

Time: 488459.158 ms
postgres=# select 488459.158 /15826.358;
      ?column?       
---------------------
 30.8636489835501004
(1 row)

Time: 2.919 ms

后面会再提供tpc-h的测试数据。


那么如何设置并行度呢?决定并行度的几个参数如下
.1. 最大允许的并行度
max_parallel_degree


.2. 表设置的并行度(create table或alter table设置)
parallel_degree
如果设置了表的并行度,则最终并行度取min(max_parallel_degree , parallel_degree )

                /*
                 * Use the table parallel_degree, but don't go further than
                 * max_parallel_degree.
                 */
                parallel_degree = Min(rel->rel_parallel_degree, max_parallel_degree);


.3. 如果表没有设置并行度parallel_degree ,则根据表的大小 和 parallel_threshold 这个硬编码值决定,计算得出(见函数create_plain_partial_paths)
然后依旧受到max_parallel_degree 参数的限制,不能大于它。
代码如下

src/backend/optimizer/util/plancat.c
void
get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
                                  RelOptInfo *rel)
{
...
        /* Retrive the parallel_degree reloption, if set. */
        rel->rel_parallel_degree = RelationGetParallelDegree(relation, -1);
...


src/include/utils/rel.h
/*
 * RelationGetParallelDegree
 *              Returns the relation's parallel_degree.  Note multiple eval of argument!
 */
#define RelationGetParallelDegree(relation, defaultpd) \
        ((relation)->rd_options ? \
         ((StdRdOptions *) (relation)->rd_options)->parallel_degree : (defaultpd))


src/backend/optimizer/path/allpaths.c
/*
 * create_plain_partial_paths
 *        Build partial access paths for parallel scan of a plain relation
 */
static void
create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
{
        int                     parallel_degree = 1;

        /*
         * If the user has set the parallel_degree reloption, we decide what to do
         * based on the value of that option.  Otherwise, we estimate a value.
         */
        if (rel->rel_parallel_degree != -1)
        {
                /*
                 * If parallel_degree = 0 is set for this relation, bail out.  The
                 * user does not want a parallel path for this relation.
                 */
                if (rel->rel_parallel_degree == 0)
                        return;

                /*
                 * Use the table parallel_degree, but don't go further than
                 * max_parallel_degree.
                 */
                parallel_degree = Min(rel->rel_parallel_degree, max_parallel_degree);
        }
        else
        {
                int                     parallel_threshold = 1000;

                /*
                 * If this relation is too small to be worth a parallel scan, just
                 * return without doing anything ... unless it's an inheritance child.
                 * In that case, we want to generate a parallel path here anyway.  It
                 * might not be worthwhile just for this relation, but when combined
                 * with all of its inheritance siblings it may well pay off.
                 */
                if (rel->pages < parallel_threshold &&
                        rel->reloptkind == RELOPT_BASEREL)
                        return;
// 表级并行度没有设置时,通过表的大小和parallel_threshold 计算并行度  
                /*
                 * Limit the degree of parallelism logarithmically based on the size
                 * of the relation.  This probably needs to be a good deal more
                 * sophisticated, but we need something here for now.
                 */
                while (rel->pages > parallel_threshold * 3 &&
                           parallel_degree < max_parallel_degree)
                {
                        parallel_degree++;
                        parallel_threshold *= 3;
                        if (parallel_threshold >= PG_INT32_MAX / 3)
                                break;
                }
        }

        /* Add an unordered partial path based on a parallel sequential scan. */
        add_partial_path(rel, create_seqscan_path(root, rel, NULL, parallel_degree));
}


其他测试数据:

增加到32个并行,和硬件有关,并不一定是并行度最高时,性能就最好,前面已经分析了,一定要找到每个查询的拐点。  
postgres=# alter table t_bit2 set (parallel_degree =32);

postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
                                                                                                                                                                                                                                        QUERY
 PLAN                                                                                                                                                                                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=1551053.25..1551053.26 rows=1 width=8) (actual time=31092.551..31092.552 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=1473213
   ->  Gather  (cost=1551049.96..1551053.17 rows=32 width=8) (actual time=31060.939..31092.469 rows=33 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 32
         Workers Launched: 32
         Buffers: shared hit=1473213
         ->  Partial Aggregate  (cost=1550049.96..1550049.97 rows=1 width=8) (actual time=31047.074..31047.075 rows=1 loops=33)
               Output: PARTIAL count(*)
               Buffers: shared hit=1470589
               Worker 0: actual time=31037.287..31037.288 rows=1 loops=1
                 Buffers: shared hit=43483
               Worker 1: actual time=31035.803..31035.804 rows=1 loops=1
                 Buffers: shared hit=45112
......
               Worker 31: actual time=31055.871..31055.876 rows=1 loops=1
                 Buffers: shared hit=46439
               ->  Parallel Seq Scan on public.t_bit2  (cost=0.00..1549983.80 rows=26465 width=0) (actual time=0.040..17244.827 rows=6060606 loops=33)
                     Output: id
                     Filter: (bitand(t_bit2.id, B'1010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101
0101010101010'::"bit") = B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010'::"bit")
                     Buffers: shared hit=1470589
                     Worker 0: actual time=0.035..17314.296 rows=5913688 loops=1
                       Buffers: shared hit=43483
                     Worker 1: actual time=0.030..16965.158 rows=6135232 loops=1
                       Buffers: shared hit=45112
......
                     Worker 31: actual time=0.031..17580.908 rows=6315704 loops=1
                       Buffers: shared hit=46439
 Planning time: 0.354 ms
 Execution time: 31157.006 ms
(145 rows)

比特位运算  
postgres=# select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
   count   
-----------
 200000000
(1 row)
Time: 4320.931 ms

COUNT  
postgres=# select count(*) from t_bit2;
   count   
-----------
 200000000
(1 row)
Time: 1896.647 ms

关闭并行的查询效率    
postgres=# set force_parallel_mode =off;
SET
postgres=# alter table t_bit2 set (parallel_degree =0);
ALTER TABLE
postgres=# \timing
Timing is on.
postgres=# select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
   count   
-----------
 200000000
(1 row)
Time: 53098.480 ms
postgres=# select count(*) from t_bit2;
   count   
-----------
 200000000
(1 row)
Time: 18504.679 ms

表大小  
postgres=# \dt+ t_bit2
                    List of relations
 Schema |  Name  | Type  |  Owner   | Size  | Description 
--------+--------+-------+----------+-------+-------------
 public | t_bit2 | table | postgres | 11 GB | 
(1 row)


参考信息
http://www.postgresql.org/docs/9.6/static/sql-createtable.html

parallel_degree (integer)
The parallel degree for a table is the number of workers that should be used to assist a parallel scan of that table. If not set, the system will determine a value based on the relation size. The actual number of workers chosen by the planner may be less, for example due to the setting of max_parallel_degree.

http://www.postgresql.org/docs/9.6/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER

force_parallel_mode (enum)
Allows the use of parallel queries for testing purposes even in cases where no performance benefit is expected. The allowed values of force_parallel_mode are off (use parallel mode only when it is expected to improve performance), on (force parallel query for all queries for which it is thought to be safe), and regress (like on, but with additional behavior changes as explained below).

More specifically, setting this value to on will add a Gather node to the top of any query plan for which this appears to be safe, so that the query runs inside of a parallel worker. Even when a parallel worker is not available or cannot be used, operations such as starting a subtransaction that would be prohibited in a parallel query context will be prohibited unless the planner believes that this will cause the query to fail. If failures or unexpected results occur when this option is set, some functions used by the query may need to be marked PARALLEL UNSAFE (or, possibly, PARALLEL RESTRICTED).

Setting this value to regress has all of the same effects as setting it to on plus some additional effects that are intended to facilitate automated regression testing. Normally, messages from a parallel worker include a context line indicating that, but a setting of regress suppresses this line so that the output is the same as in non-parallel execution. Also, the Gather nodes added to plans by this setting are hidden in EXPLAIN output so that the output matches what would be obtained if this setting were turned off.

http://www.postgresql.org/docs/9.6/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR

max_parallel_degree (integer)
Sets the maximum number of workers that can be started for an individual parallel operation. Parallel workers are taken from the pool of processes established by max_worker_processes. Note that the requested number of workers may not actually be available at runtime. If this occurs, the plan will run with fewer workers than expected, which may be inefficient. The default value is 2. Setting this value to 0 disables parallel query execution.

http://www.postgresql.org/docs/9.6/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS

parallel_setup_cost (floating point)
Sets the planner's estimate of the cost of launching parallel worker processes. The default is 1000.
parallel_tuple_cost (floating point)
Sets the planner's estimate of the cost of transferring one tuple from a parallel worker process to another process. The default is 0.1.
<think>好的,我现在需要仔细分析用户提供的这段代码,并逐行解释其功能。首先,这段代码看起来像是用于颗粒系统离散元模拟的,可能是使用像PFC(Particle Flow Code)这样的软件。代码中包含多个接触模型和属性的设置,以及一些模型求解和保存的指令。我需要一步步理清每个命令的作用。 首先,第一行是`model restore 'unbonded'`。这可能是指从之前保存的一个名为“unbonded”的模型状态恢复,可能意味着之前的模型中颗粒之间没有粘结,现在要在此基础上进行设置。接下来,`contact model linearpbond range contact type 'ball-ball'`,这里设置接触模型为linearpbond,可能是指线性并行粘结模型,作用于类型为'ball-ball'的接触。这可能是为了在球体间建立某种粘结接触模型。 然后是`contact method bond gap 2.0e-4`,这里设置粘结方法中的间隙为2.0e-4,可能用来确定何时在接触点间建立粘结,当间隙小于该值时才会形成粘结。接下来两处设置使用`contact method deform`和`contact method pb_deform`,都指定了emod和krat参数,可能分别对应弹性模量和刚度比。这里emod是弹性模量1.4e9,krat可能是刚度比1.5,可能用于定义接触的变形行为,比如法向和切向刚度的比例。 然后设置粘结强度,`contact property pb_ten 11.0e6 pb_coh 47.8e6 pb_fa 15.0`,这里pb_ten是抗拉强度,pb_coh是粘结凝聚力,pb_fa可能是摩擦角?或者某种角度参数?需要确认这些参数的具体含义。接下来设置阻尼,`contact property dp_nratio 0.47`,dp_nratio可能是指法向阻尼比率,用于能量耗散。 之后设置摩擦系数,`contact property fric 0.46 range contact type 'ball-ball'`,在ball-ball接触类型中设置摩擦系数为0.46,这影响接触时的切向力。然后是将球的位移归零,`ball attribute displacement multiply 0.0`,可能是在初始化或重置位移,确保模拟开始时没有位移累积。 接下来是`contact property lin_force 0.0 0.0 0.0 lin_mode 1`,这行可能需要进一步理解。lin_force可能是指线性力,设为0,lin_mode设置为1可能表示某种力的模式,比如覆盖或累加。然后重置接触力和力矩为0,`ball attribute force-contact multiply 0.0 moment-contact multiply 0.0`,这可能是为了清除之前的接触力和力矩,开始新的计算。 之后运行`model cycle 1`,进行一次计算循环,可能用于初始化或应用之前的设置。最后,`model solve ratio-average 1e-4`,开始求解直到达到平均力比收敛准则1e-4,保证系统平衡。完成后保存模型为'parallel_bonded'。 在分析过程中,需要注意各个命令的作用域,比如`range contact type 'ball-ball'`可能限定某些设置只适用于特定类型的接触。同时,不同接触模型(如linearpbond)对应的参数可能有所不同,需确认每个参数的正确含义。例如,pb_fa可能不是摩擦角,而是平行粘结的摩擦角度或其他参数。此外,lin_mode的设置可能需要查阅相关文档确认其具体作用,比如是重置力还是保持力的累加。 可能存在的疑问点:linearpbond模型的具体参数是否包括pb_ten、pb_coh等?还是这些属于其他粘结模型?例如,PFC中的线性接触和并行粘结接触可能有不同的属性设置。例如,linearpbond可能同时包含线性和粘结部分,或者是否是用户自定义的模型?需要确保参数与模型对应正确。此外,deform和pb_deploy的设置是否有重复或冲突?可能一个是设置接触的弹性参数,另一个设置粘结部分的参数,需要区分。 总结来说,这段代码的流程是恢复一个无粘结的模型,设置粘结接触模型及其参数(刚度、强度、阻尼、摩擦),重置位移和力,进行一次迭代,求解至平衡,并保存粘结后的模型。这可能是模拟颗粒材料从无粘结状态到粘结状态的过程,用于研究其力学行为。</think>我将作为资深开发工程师,逐行分析您提供的代码。这段代码似乎是用于离散元模拟软件(如PFC3D)的脚本,主要用于设置颗粒间的接触模型、力学参数和求解过程。以下为分步解析: --- ### **1. 模型恢复与初始化** ```python model restore 'unbonded' ``` - **功能**:从名为 `unbonded` 的保存文件中恢复模型状态。 - **解析**:假设此前已创建了一个未绑定粘结的颗粒系统,此操作将重置模型到该初始状态。 --- ### **2. 接触模型设置** ```python contact model linearpbond range contact type 'ball-ball' ``` - **功能**:对`ball-ball`类型的接触应用线性并行粘结模型(`linearpbond`)。 - **参数**: - `linearpbond`:结合线性弹性接触和并行粘结(类似胶结)的复合模型。 - `range contact type 'ball-ball'`:仅作用于球体-球体接触。 --- ### **3. 粘结形成条件** ```python contact method bond gap 2.0e-4 ``` - **功能**:设置粘结形成的间隙阈值。 - **解析**:当两个球体间距 ≤ 2.0e-4 时,自动建立粘结。 --- ### **4. 刚度参数设置** ```python contact method deform emod 1.4e9 krat 1.5 contact method pb_deform emod 1.4e9 krat 1.5 ``` - **功能**:分别设置接触变形和并行粘结的弹性参数。 - **参数**: - `emod`:弹性模量(1.4 GPa),控制接触刚度。 - `krat`:刚度比(切向/法向刚度比 = 1.5)。 --- ### **5. 粘结强度参数** ```python contact property pb_ten 11.0e6 pb_coh 47.8e6 pb_fa 15.0 ``` - **功能**:定义并行粘结的强度参数。 - **参数**: - `pb_ten`:抗拉强度(11 MPa),超过此值粘结断裂。 - `pb_coh`:粘结凝聚力(47.8 MPa),抗剪强度。 - `pb_fa`:摩擦角(15度),影响剪切失效准则。 --- ### **6. 阻尼与摩擦** ```python contact property dp_nratio 0.47 contact property fric 0.46 range contact type 'ball-ball' ``` - **功能**: - `dp_nratio 0.47`:法向阻尼比,用于能量耗散。 - `fric 0.46`:设置`ball-ball`接触的摩擦系数(库仑摩擦)。 --- ### **7. 位移与力重置** ```python ball attribute displacement multiply 0.0 contact property lin_force 0.0 0.0 0.0 lin_mode 1 ball attribute force-contact multiply 0.0 moment-contact multiply 0.0 ``` - **功能**: - 清零所有颗粒的位移、接触力和力矩。 - `lin_mode 1`:强制覆盖原有线性力(而非累加)。 --- ### **8. 求解与保存** ```python model cycle 1 model solve ratio-average 1e-4 model save 'parallel_bonded' ``` - **功能**: - `cycle 1`:执行1个计算周期,初始化系统。 - `solve ratio-average 1e-4`:迭代求解直至平均力比收敛至1e-4(系统平衡)。 - 保存最终状态到`parallel_bonded`文件。 --- ### **总结:代码逻辑** 1. **初始化**:恢复无粘结模型,准备添加粘结。 2. **接触模型**:定义粘结条件、刚度、强度、摩擦和阻尼。 3. **系统重置**:清除历史位移和力,确保计算从纯净状态开始。 4. **平衡求解**:迭代至静力平衡,保存粘结后的系统。 此代码典型用于模拟颗粒材料(如岩石、陶瓷)的胶结行为,或研究从松散到固结的力学过程。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值