2023-03-29 DuckDB physical plan execution: Pipeline and TaskScheduler

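DuckDB executes physical plans on top of Pipeline and TaskScheduler, which together provide multi-threaded parallel processing. The Pipeline is the core unit of operator execution and is coordinated through classes such as PipelineBuildState and PipelineExecutor. The TaskScheduler handles task scheduling and manages concurrent execution through methods such as ScheduleTask, so that queries run efficiently.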

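Summary:

DuckDB's physical plan execution is very well designed: every physical operator in the plan can be processed by multiple threads in parallel.

At the core are Pipeline and TaskScheduler. They can be understood as the producer-consumer pattern from multi-threaded programming; once you see how they combine with physical plan execution, the simplicity and elegance of the design become clear.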

References:

https://www.youtube.com/watch?v=MA0OsvYFGrc

https://dsdsd.da.cwi.nl/slides/dsdsd-duckdb-push-based-execution.pdf

[DuckDB] Push-Based Execution Model - Zhihu

Operator execution in DuckDB

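In DuckDB, operators are the components that carry out the actual query plan. When a query is compiled into a plan, the plan is converted into a directed acyclic graph (DAG) of operators. Each operator represents one specific operation, such as scanning a table, applying a filter, or sorting, and the operators are connected to each other by data flow.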

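Operators are executed through the TaskScheduler. The TaskScheduler is DuckDB's concurrent execution engine: it manages a set of tasks (Task) that can run in parallel and uses the machine's CPU cores to run them. When a query is submitted, the Executor first splits the physical plan into pipelines (the stack trace below shows this step: Executor::Initialize calls MetaPipeline::Build, which calls each operator's BuildPipelines). Tasks are then scheduled for the pipelines whose dependencies are satisfied, and the worker threads of the TaskScheduler pick up those tasks and run them until the query is finished.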
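The producer-consumer idea behind this kind of scheduling can be sketched with a plain task queue and a few worker threads. This is a minimal illustration only: `MiniScheduler`, `Schedule`, `WorkerLoop` and `RunCounterTasks` are hypothetical names invented for the example, and the code does not reproduce DuckDB's actual TaskScheduler implementation.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical miniature scheduler (NOT DuckDB's TaskScheduler):
// producers push tasks into a queue, worker threads consume them.
class MiniScheduler {
public:
    explicit MiniScheduler(std::size_t n_threads) {
        for (std::size_t i = 0; i < n_threads; i++) {
            workers.emplace_back([this] { WorkerLoop(); });
        }
    }

    // Drains the remaining tasks, then joins all workers.
    ~MiniScheduler() {
        {
            std::lock_guard<std::mutex> guard(lock);
            stopping = true;
        }
        cv.notify_all();
        for (auto &worker : workers) {
            worker.join();
        }
    }

    // Producer side: enqueue a task and wake up one worker.
    void Schedule(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> guard(lock);
            queue.push_back(std::move(task));
        }
        cv.notify_one();
    }

private:
    // Consumer side: each worker repeatedly takes one task and runs it.
    void WorkerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> guard(lock);
                cv.wait(guard, [this] { return stopping || !queue.empty(); });
                if (queue.empty()) {
                    return; // only reached when stopping and fully drained
                }
                task = std::move(queue.front());
                queue.pop_front();
            }
            task(); // run outside the lock so other workers can proceed
        }
    }

    std::mutex lock;
    std::condition_variable cv;
    std::deque<std::function<void()>> queue;
    std::vector<std::thread> workers;
    bool stopping = false;
};

// Schedules n_tasks increments on n_threads workers and returns the final count.
int RunCounterTasks(std::size_t n_threads, int n_tasks) {
    std::atomic<int> counter{0};
    {
        MiniScheduler scheduler(n_threads);
        for (int i = 0; i < n_tasks; i++) {
            scheduler.Schedule([&counter] { counter.fetch_add(1); });
        }
    } // the destructor waits for all tasks to finish
    return counter.load();
}
```

In this sketch the code that submits work never touches a thread directly; it only hands tasks to the queue, which is what keeps the submitting side simple.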

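In the code, operators and the TaskScheduler interact mainly through virtual functions. PhysicalOperator is the base class; it declares virtual functions such as Execute, which the concrete operator subclasses override to perform their work and manage their state. On the scheduling side, classes such as Task and TaskScheduler manage task scheduling and concurrent execution. At run time a task is created for a pipeline, a worker thread of the TaskScheduler runs it, and the task (through PipelineExecutor) calls the Execute function of each operator in the pipeline, handing control to the operator's own logic. When operators need to share state across threads, access to that state has to be synchronized, for example with locks.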
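The virtual-function interaction described above can be illustrated with a small sketch. All names here (`SketchOperator`, `KeepEven`, `AddOne`, `SketchTask`, `SketchPipelineTask`, `RunSketchPipeline`) are hypothetical and heavily simplified; DuckDB's real PhysicalOperator and Task interfaces take execution contexts, data chunks and operator state that are left out here.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

using Chunk = std::vector<int>; // stand-in for a batch of rows

// Hypothetical operator interface: subclasses override Execute.
class SketchOperator {
public:
    virtual ~SketchOperator() = default;
    virtual Chunk Execute(const Chunk &input) const = 0;
};

// Filter-like operator: keeps only even values.
class KeepEven : public SketchOperator {
public:
    Chunk Execute(const Chunk &input) const override {
        Chunk out;
        for (int v : input) {
            if (v % 2 == 0) {
                out.push_back(v);
            }
        }
        return out;
    }
};

// Projection-like operator: adds one to every value.
class AddOne : public SketchOperator {
public:
    Chunk Execute(const Chunk &input) const override {
        Chunk out;
        for (int v : input) {
            out.push_back(v + 1);
        }
        return out;
    }
};

// Hypothetical task interface: the scheduler only knows Execute().
class SketchTask {
public:
    virtual ~SketchTask() = default;
    virtual void Execute() = 0;
};

// A task that pushes every source chunk through the operators into a sink.
class SketchPipelineTask : public SketchTask {
public:
    SketchPipelineTask(std::vector<Chunk> source_p,
                       std::vector<std::unique_ptr<SketchOperator>> operators_p)
        : source(std::move(source_p)), operators(std::move(operators_p)) {
    }

    void Execute() override {
        for (const Chunk &chunk : source) {
            Chunk current = chunk;
            for (const auto &op : operators) {
                current = op->Execute(current); // virtual dispatch per operator
            }
            sink.insert(sink.end(), current.begin(), current.end());
        }
    }

    const Chunk &Result() const {
        return sink;
    }

private:
    std::vector<Chunk> source;
    std::vector<std::unique_ptr<SketchOperator>> operators;
    Chunk sink;
};

// Builds a two-operator pipeline and runs it through the generic task interface.
Chunk RunSketchPipeline() {
    std::vector<Chunk> source = {{1, 2, 3, 4}, {5, 6}};
    std::vector<std::unique_ptr<SketchOperator>> ops;
    ops.push_back(std::unique_ptr<SketchOperator>(new KeepEven()));
    ops.push_back(std::unique_ptr<SketchOperator>(new AddOne()));

    SketchPipelineTask task(std::move(source), std::move(ops));
    SketchTask &as_task = task; // the caller sees only the base class
    as_task.Execute();
    return task.Result();
}
```

The point of the sketch is that the code running the task never needs to know which operators are in the pipeline; it calls one virtual Execute and the dispatch does the rest.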

Pipeline

Core processing

PipelineBuildState::AddPipelineOperator

#0  duckdb::PipelineBuildState::AddPipelineOperator (this=0x7ffc42d9a060, pipeline=..., op=0x61800001d880) at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/parallel/pipeline.cpp:251
#1  0x0000000005dd78c5 in duckdb::PhysicalJoin::BuildJoinPipelines (current=..., meta_pipeline=..., op=...)
    at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/execution/operator/join/physical_join.cpp:35
#2  0x0000000005dd88f2 in duckdb::PhysicalJoin::BuildPipelines (this=0x61800001d880, current=..., meta_pipeline=...)
    at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/execution/operator/join/physical_join.cpp:84
#3  0x00000000037ba2dd in duckdb::MetaPipeline::Build (this=0x61100017a100, op=0x61800001d880) at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/parallel/meta_pipeline.cpp:74
#4  0x0000000005d54984 in duckdb::PhysicalResultCollector::BuildPipelines (this=0x611000179e80, current=..., meta_pipeline=...)
    at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/execution/operator/helper/physical_result_collector.cpp:56
#5  0x00000000037ba2dd in duckdb::MetaPipeline::Build (this=0x611000179fd0, op=0x611000179e80) at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/parallel/meta_pipeline.cpp:74
#6  0x00000000037c7852 in duckdb::Executor::InitializeInternal (this=0x612000099940, plan=0x611000179e80) at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/parallel/executor.cpp:306
#7  0x00000000037c6f7a in duckdb::Executor::Initialize (this=0x612000099940, physical_plan=std::unique_ptr<duckdb::PhysicalOperator> = {...})
    at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/parallel/executor.cpp:284
#8  0x00000000033c3c48 in duckdb::ClientContext::PendingPreparedStatement (this=0x615000011d90, lock=..., 
    statement_p=std::shared_ptr<duckdb::PreparedStatementData> (use count 2, weak count 0) = {...}, parameters=...)
    at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/main/client_context.cpp:419
#9  0x00000000033ceab4 in duckdb::ClientContext::PendingStatementOrPreparedStatement (this=0x615000011d90, lock=..., 
    query="SELECT     *     FROM     c     WHERE     EXISTS (         SELECT         1         FROM         d         WHERE         c.c1 = d.d1     ) ;", 
    statement=std::unique_ptr<duckdb::SQLStatement> = {...}, prepared=std::shared_ptr<duckdb::PreparedStatementData> (use count 2, weak count 0) = {...}, parameters=...)
    at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/main/client_context.cpp:738
#10 0x00000000033cd45c in duckdb::ClientContext::PendingStatementOrPreparedStatementInternal (this=0x615000011d90, lock=..., 
    query="SELECT     *     FROM     c     WHERE     EXISTS (         SELECT         1         FROM         d         WHERE         c.c1 = d.d1     ) ;", 
    statement=std::unique_ptr<duckdb::SQLStatement> = {...}, prepared=std::shared_ptr<duckdb::PreparedStatementData> (use count 2, weak count 0) = {...}, parameters=...)
    at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/main/client_context.cpp:695
#11 0x00000000033ca403 in duckdb::ClientContext::PendingQueryPreparedInternal (this=0x615000011d90, lock=..., 
    query="SELECT     *     FROM     c     WHERE     EXISTS (         SELECT         1         FROM         d         WHERE         c.c1 = d.d1     ) ;", 
    prepared=std::shared_ptr<duckdb::PreparedStatementData> (use count 2, weak count 0) = {...}, parameters=...) at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/main/client_context.cpp:577
#12 0x00000000033ca9ae in duckdb::ClientContext::PendingQuery (this=0x615000011d90, 
    query="SELECT     *     FROM     c     WHERE     EXISTS (         SELECT         1         FROM         d         WHERE         c.c1 = d.d1     ) ;", 
    prepared=std::shared_ptr<duckdb::PreparedStatementData> (use count 2, weak count 0) = {...}, parameters=...) at /root/work/duckdb-dev/trunk/duckdb-0.7.1/src/main/client_context.cpp:584
#13 0x000000000340e3d4 in duckdb::PreparedStatement::PendingQuery (this=0x611000179c00, values=std::vector of length 0, capacity 0, allow_stream_result=false)
    at /root/work/duckdb-dev/trunk/duc
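The recursion visible in the trace above (MetaPipeline::Build calling an operator's BuildPipelines, and the join adding itself to the current pipeline) can be sketched roughly as follows. This is an illustration of the general idea only, under stated assumptions: the types `PlanNode`, `SketchPipeline` and the helper functions are hypothetical, and the plan shape (a join with table c on the probe side and table d on the build side, under a result collector) is an assumption about how the EXISTS query in the trace is planned, not something the trace confirms.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

enum class OpType { SCAN, FILTER, HASH_JOIN };

// Hypothetical plan node; for a join, children[0] is the probe side
// and children[1] is the build side (an assumption of this sketch).
struct PlanNode {
    OpType type;
    std::string name;
    std::vector<std::unique_ptr<PlanNode>> children;
};

// Hypothetical pipeline: one source, streaming operators, one sink.
struct SketchPipeline {
    std::string source;
    std::vector<std::string> operators; // collected from sink towards source
    std::string sink;

    std::string Describe() const {
        std::string out = source;
        for (auto it = operators.rbegin(); it != operators.rend(); ++it) {
            out += " -> " + *it;
        }
        return out + " -> " + sink;
    }
};

using PipelineList = std::vector<std::unique_ptr<SketchPipeline>>;

void BuildPipelines(const PlanNode &op, SketchPipeline &current, PipelineList &all);

// Starts a new pipeline that ends in `sink` and is fed by `child`.
void BuildChildPipeline(const PlanNode &sink, const PlanNode &child, PipelineList &all) {
    all.push_back(std::unique_ptr<SketchPipeline>(new SketchPipeline()));
    SketchPipeline &pipeline = *all.back();
    pipeline.sink = sink.name + "(build)";
    BuildPipelines(child, pipeline, all);
}

void BuildPipelines(const PlanNode &op, SketchPipeline &current, PipelineList &all) {
    switch (op.type) {
    case OpType::SCAN:
        current.source = op.name; // a scan terminates the pipeline as its source
        break;
    case OpType::FILTER:
        current.operators.push_back(op.name);
        BuildPipelines(*op.children[0], current, all);
        break;
    case OpType::HASH_JOIN:
        // The join is a streaming operator of the current (probe) pipeline ...
        current.operators.push_back(op.name + "(probe)");
        // ... and the sink of a separate pipeline that builds the hash table.
        BuildChildPipeline(op, *op.children[1], all);
        BuildPipelines(*op.children[0], current, all);
        break;
    }
}

std::unique_ptr<PlanNode> MakeNode(OpType type, const std::string &name) {
    std::unique_ptr<PlanNode> node(new PlanNode());
    node->type = type;
    node->name = name;
    return node;
}

// Builds the assumed plan and returns one description per pipeline.
std::vector<std::string> DescribeExamplePlan() {
    auto join = MakeNode(OpType::HASH_JOIN, "JOIN");
    join->children.push_back(MakeNode(OpType::SCAN, "SCAN c")); // probe side
    join->children.push_back(MakeNode(OpType::SCAN, "SCAN d")); // build side

    PipelineList all;
    all.push_back(std::unique_ptr<SketchPipeline>(new SketchPipeline()));
    SketchPipeline &root = *all.back();
    root.sink = "RESULT_COLLECTOR";
    BuildPipelines(*join, root, all);

    std::vector<std::string> out;
    for (const auto &pipeline : all) {
        out.push_back(pipeline->Describe());
    }
    return out;
}
```

Under these assumptions the plan splits into two pipelines: one that scans d into the join's build side, and one that scans c, probes the join and feeds the result collector. The build pipeline has to finish before the probe pipeline can run, which is the kind of dependency the scheduling has to respect.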