The SKEW algorithm for handling data skew in the parallel sort merge join algorithm -- excerpted from J. L. Wolf, D. M. Dias, and P. S. Yu. A parallel sort merge join alg

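This post presents a task-scheduling algorithm called SKEW. It creates tasks and assigns them to multiple processors so as to minimize the time needed to finish all of them, that is, to minimize the makespan (the maximum completion time over all processors). SKEW takes as input the number of processors and the sorted runs held at each processor, and it accounts for the type and cost of each task. It refines the task partition iteratively until it reaches a near-optimal schedule.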
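
As a reading aid, the objective can be written out explicitly. This is a sketch under assumed notation: the symbols $t_j$ (time of task $j$) and $S_p$ (set of tasks assigned to processor $p$) are introduced here for illustration and do not appear in the excerpt.

```latex
\text{makespan} \;=\; \max_{1 \le p \le P} \; \sum_{j \in S_p} t_j
```

A schedule is better when this maximum is smaller, because the join finishes only when the most heavily loaded processor finishes.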

Procedure: SKEW

Input: Number of processors $P$, $2P$ sets of sorted runs $\{a_{i,p,r} \mid i = 1, \ldots, \mathrm{CARD}_{p,r}\}$, one for each processor $p \in \{1, \ldots, P\}$ and each relation $r \in \{1, 2\}$, where $\mathrm{CARD}_{p,r}$ is the cardinality of the sorted run of relation $r$ at processor $p$, and $a_{i,p,r}$ is the $i$th tuple in this sorted run.

Output: The creation of tasks and a heuristic assignment of those tasks to the processors which approximately minimizes the makespan.

    Set the number of tasks N = 1.

    Set the top and bottom of the first task to be $\mathrm{TOP}_{N,p,r} = 1$ and $\mathrm{BOT}_{N,p,r} = \mathrm{CARD}_{p,r}$ for each processor $p = 1, \ldots, P$ and each relation $r = 1, 2$.

    Determine the type (1 or 2) of the first task.

    Do forever

        Determine the optimal multiplicities $\mathrm{MULT}_n$ of each type 2 task $n \in \{1, \ldots, N\}$. (Set $\mathrm{MULT}_n = 1$ for each type 1 task $n \in \{1, \ldots, N\}$.) Compute the total number of tasks to be $NN = \sum_{n=1}^{N} \mathrm{MULT}_n$. Compute the task times $\{\mathrm{TIME}_n^{\mathrm{MULT}_n} \mid n = 1, \ldots, N\}$.

        If NN >= P then apply LPT.

        If [solution is unacceptable] then begin

            Apply GM to find the median element μ(η) for the region $\{\mathrm{TOP}_{n,p,r}, \ldots, \mathrm{BOT}_{n,p,r} \mid p = 1, \ldots, P,\ r = 1, 2\}$ consisting of the largest type 1 task $n$.

            The median element corresponds to a type 2 task with region $\{\mathrm{TOP}^{2}_{n,p,r}, \ldots, \mathrm{BOT}^{2}_{n,p,r} \mid p = 1, \ldots, P,\ r = 1, 2\}$.

            Relabel this new type 2 task as task number n.

            Determine its optimal multiplicity $\mathrm{MULT}_n$ and task time $\mathrm{TIME}_n^{\mathrm{MULT}_n}$.

            There also exist (1 or) 2 tasks, most likely of type 1, corresponding to regions $\{\mathrm{TOP}^{1}_{n,p,r}, \ldots, \mathrm{BOT}^{1}_{n,p,r} \mid p = 1, \ldots, P,\ r = 1, 2\}$ and $\{\mathrm{TOP}^{3}_{n,p,r}, \ldots, \mathrm{BOT}^{3}_{n,p,r} \mid p = 1, \ldots, P,\ r = 1, 2\}$. Increment $N$ (by 1 or 2) to add these tasks and their optimal multiplicities and task times.

            Sort the tasks in order of decreasing task times, so that $n_1 \le n_2$ implies $\mathrm{TIME}_{n_1}^{\mathrm{MULT}_{n_1}} \ge \mathrm{TIME}_{n_2}^{\mathrm{MULT}_{n_2}}$.

        End

        Else halt with solution from final LPT.

    End do

End SKEW
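
The "apply LPT" step in the loop above is commonly read as the longest-processing-time-first heuristic: visit tasks from longest to shortest and give each one to the processor that currently has the least work. The following is a minimal Python sketch of that heuristic only, assuming the task times are already known as plain numbers. The function name `lpt_schedule` and the sample values are illustrative and are not taken from the paper; the sketch does not cover multiplicities, task types, or the GM median step.

```python
import heapq


def lpt_schedule(task_times, num_processors):
    """Greedy LPT: longest task first, always onto the least-loaded processor.

    Returns (makespan, assignment), where assignment[p] lists the indices
    of the tasks placed on processor p.
    """
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")

    # Task indices ordered by decreasing task time (ties keep input order).
    order = sorted(range(len(task_times)),
                   key=lambda j: task_times[j], reverse=True)

    # Min-heap of (current load, processor index), all processors idle.
    heap = [(0.0, p) for p in range(num_processors)]
    assignment = [[] for _ in range(num_processors)]

    for j in order:
        load, p = heapq.heappop(heap)          # least-loaded processor
        assignment[p].append(j)
        heapq.heappush(heap, (load + task_times[j], p))

    makespan = max(load for load, _ in heap)   # load of the busiest processor
    return makespan, assignment


# Hypothetical task times, chosen only to illustrate the call.
makespan, assignment = lpt_schedule([7, 5, 4, 3, 3, 2], 3)
print(makespan, assignment)
```

In the procedure, the makespan returned by a step like this is what gets tested for acceptability: if it is still too large, the largest type 1 task is split at its median and the loop repeats.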
