周报5_YMK

部署运行你感兴趣的模型镜像

周报5

论文:FLASHDECODING++: FASTER LARGE LANGUAGE MODEL INFERENCE ON GPUS

https://arxiv.org/pdf/2311.01282.pdf

在斯坦福大学团队的 Tri Dao 等人提出了 FlashAttention 和 FlashDecoding 后,相关的工作又被很快提出,上周来自无问芯穹(Infinigence-AI)、清华大学和上海交通大学的联合团队提出了一种新方法 FlashDecoding++,因为该工作并不是出自FlashAttention 和 FlashDecoding的团队,所以也被叫做野生的 FlashDecoding

Asynchronized Softmax with Unified Maximum Value

回顾上周讲的 FlashDecoding,在 FlashAttention 的基础上引入了新的并行维度:keys/values的序列长度,然而,在每一个块的内部,计算过程还是同步的(主要是局部最大值)。本文发现这种同步操作的开销约为20%。因此,作者希望去除同步操作,也就是独立计算出partial softmax结果。

在这里插入图片描述

Softmax的演变:

navie softmax
yi=exi∑iVexi y_i=\frac{e^{x_i}}{\sum^V_{i}{e^{x_i}}} yi=iVexiexi
在这里插入图片描述

safe softmax

由于在实际的计算中,指数计算exp存在不稳定性,比如数值容易溢出,超过一定范围计算精度会下降等问题。因此在实际使用中,往往用safe softmax更好,safe softmax的计算是在navie softmax的基础之上将数组x[1…n]每个元素减去数组的最大值max之后,再做softmax
yi=exi−maxk=1Vxk∑j=1Vexj−maxk=1Vxk y_i=\frac{e^{x_i-max_{k=1}^Vx_k}}{\sum_{j=1}^Ve^{x_j-max_{k=1}^Vx_k}} yi=j=1Vexjmaxk=1Vxkeximaxk=1Vxk
在这里插入图片描述

online softmax

是在safe softmax的基础上做的改进

在这里插入图片描述

其中dj−1d_{j-1}dj1表示数组x[1…n]的前j-1个指数和,它的指数和是基于前j-1个元素的最大值mj−1m_{j-1}mj1来算的的,注意哦mj−1m_{j-1}mj1并不是全局的最大值,同理mjm_{j}mj表示前j个元素的最大值,那么它跟mj−1m_{j-1}mj1的区别在于,它有可能等于mj−1m_{j-1}mj1,也有可能是最新进了的第j个元素xjx_{j}xj.

能看出分块softmax在求的时候依赖于上一个块的max,为了在块的内部也做到并行,作者提出的方法很简单:就是找到一个合适的公共最大值ϕ\phiϕ。然而,如果ϕ\phiϕ太大,会造成exi−ϕe^{{x_i}−ϕ}exiϕ溢出;如果ϕϕϕ太小,会造成exi−ϕe^{{x_i}−ϕ}exiϕ精度损失。于是作者进行了统计,如下图所示。例如,对于Llama2-7B, >超过99.99%的值在[-16.8, 6.5]之间。

在这里插入图片描述

但是对于OPT-6.7B来说,其范围较大,于是作者采用动态调整策略,如果在推理过程中发现设置的ϕ\phiϕ不合理,那么就终止当前操作,然后采用FlashAttention和FlashDecoding的方法计算softmax。不过怎科学拍出 ϕ\phiϕ 这个数,作者也没给出具体方法。

但是也有人对这个工作表示质疑,一个是ϕ\phiϕ 的选择是很困难的,极易造成精度的下降,另一个是该优化未必能带来实际的加速。

因为FlashDecoding中:

  • thread block层次并行度是:sequence_len/block_size

  • block内thread的并行粒度是:Tile Size,且block_size>Tile Size

在FlashDecoding++中:

  • thread block层次并行度是:sequence_len/Tile Size

  • block内thread的并行粒度仍然是:Tile Size

这样不一定对Occupancy(GPU 上同时活跃的线程数量与线程块容量的比率)提升有确定性帮助,如果FlashDecoding的thread block切分比较好把SM沾满,SM内部即使串行计算不同Tile并不一定有什么问题。

Flat GEMM Optimization with Double Buffering

Decoding阶段的过程主要由GEMV(batch size=1)或flat GEMM(batch size>1)。GEMV/GEMM运算可以用M、N、K来表示,其中两个相乘矩阵的大小分别为M × K和K × N。

一般LLM推理引擎利用Tensor Core使用cuBLAS和CUTLASS等库来加速。Tensor Core 在处理矩阵乘法(GEMM)操作时,通常对 M 和 N 的维度有一些优化。当 M 和 N 维度是 8 的倍数时,可以充分发挥其优势。

但是,在解码阶段,可能会执行矩阵向量乘法(GEMV)或扁平化矩阵乘法(Flat GEMM)等操作。在这些特定的操作中,M 的维度可能相对较小,远远小于 64。

由于 Tensor Core 的优化通常期望 M 和 N 的维度是 8 的倍数,所以在解码阶段,如果 M 维度较小,填充零以满足 Tensor Core 期望的维度可能导致计算利用率下降。这是因为填充的零可能占用了矩阵中的大部分空间,而这些零对实际计算没有贡献,从而浪费了计算资源。

若假设N维度上和K维度上的tiling size分别为BNB_NBNBKB_KBK,那么每个GEMM tile的计算量为2×M×BN×BK2×M×B_N×B_K2×M×BN×BK(这里的2表示乘加2次),总共有B=N×KBN×BKB=\frac{N×K}{B_N×B_K}B=BN×BKN×K个GEMM tiles。总内存访问量为(M×BK+BN×BK)×B+M×N(M×B_K+B_N×B_K)×B+M×N(M×BK+BN×BK)×B+M×N。因此,计算和内存比为:

在这里插入图片描述

另一方面,tiling后的并行度N/BNN/B_NN/BN

**于是作者发现了:计算和内存比与BNB_NBN正相关,而并行度与BNB_NBN负相关。**下图展示了GEMM在不同BNB_NBN和N下的性能(归一化后)。本文总结了两个关键结论:

  1. NNN较小时,flat GEMM是parallelism-bounded。NVIDIA Tesla A100中有108个Streaming Multiprocessors (SMs),于是应该将N/BNN/B_NN/BN设置为一个相关的数(128或256)。
  2. NNN 较大时,flat GEMM是memory-bounded。通过隐藏memory access latency可以提高性能。
    在这里插入图片描述

为了隐藏memory access latency,本文引入了double buffering技术。具体来说就是在共享内存中分配两个buffer,一个buffer用于执行当前tile的GEMM计算,同时另一个buffer则加载下一个tile GEMM所需的数据。这样计算和内存访问是重叠的,本文在N较大时采取这种策略。

总结:没有FlashAttention和FlashDecoding惊艳,个人觉得FlashDecoding的同步处理代价不大,而且本文中动态调整softmax方法也引入了判断、终止和分支跳转等操作。

ention和FlashDecoding惊艳,个人觉得FlashDecoding的同步处理代价不大,而且本文中动态调整softmax方法也引入了判断、终止和分支跳转等操作。

另外,目前正在看文章 CosmoFlow: Using Deep Learning to Learn the Universe at Scale

您可能感兴趣的与本文相关的镜像

Wan2.2-I2V-A14B

Wan2.2-I2V-A14B

图生视频
Wan2.2

Wan2.2是由通义万相开源高效文本到视频生成模型,是有​50亿参数的轻量级视频生成模型,专为快速内容创作优化。支持480P视频生成,具备优秀的时序连贯性和运动推理能力

现在我们有了执行计划,你分析一下哪慢:QUERY PLAN Limit (cost=9550.97..9800.70 rows=1 width=38) (actual time=9795.278..9795.400 rows=1 loops=1) -> Nested Loop (cost=9550.97..9800.70 rows=1 width=38) (actual time=9795.275..9795.396 rows=1 loops=1) Join Filter: (((pm.original_store_code)::text = (pm_1.original_store_code)::text) AND ((ym.group_number)::text = (ym_1.group_number)::text) AND ((ym.host_cycle_code)::text = (ym_1.host_cycle_code)::text) AND ((ym.store_cycle_code)::text = (ym_1.store_cycle_code)::text)) -> Group (cost=4704.81..4841.78 rows=1 width=29) (actual time=3247.129..3247.199 rows=1 loops=1) Group Key: pm.original_store_code, ym.group_number, jtm.staff_code, ym.host_cycle_code, ym.store_cycle_code -> Gather Merge (cost=4704.81..4841.77 rows=1 width=29) (actual time=3247.128..3247.196 rows=1 loops=1) Workers Planned: 1 Workers Launched: 1 -> Incremental Sort (cost=3704.80..3841.65 rows=2 width=29) (actual time=1623.114..1623.117 rows=1 loops=2) Sort Key: pm.original_store_code, ym.group_number, jtm.staff_code, ym.host_cycle_code, ym.store_cycle_code Presorted Key: pm.original_store_code Full-sort Groups: 1 Sort Method: quicksort Average Memory: 30kB Peak Memory: 30kB Pre-sorted Groups: 1 Sort Method: quicksort Average Memory: 45kB Peak Memory: 45kB Worker 0: Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB -> Nested Loop (cost=3568.02..3841.56 rows=1 width=29) (actual time=1622.343..1622.884 rows=131 loops=2) -> Merge Left Join (cost=3567.74..3628.53 rows=522 width=21) (actual time=1621.903..1622.179 rows=228 loops=2) Merge Cond: (((pm.original_store_code)::text = (ymk.original_store_code)::text) AND ((ym.host_cycle_code)::text = (ymk.host_cycle_code)::text) AND ((ym.store_cycle_code)::text = (ymk.store_cycle_code)::text) AND ((ym.information_category_code)::text = (ymk.information_category_code)::text)) Filter: ((((ymk.group_number)::text !~~ '0%'::text) AND ('2023-11-09'::date >= ymk.apply_start_date) AND ('2023-11-09'::date <= ymk.apply_end_date)) OR (ymk.information_category_code IS NULL)) -> Sort (cost=3541.22..3552.58 rows=4542 width=21) (actual time=1621.878..1622.115 rows=228 loops=2) Sort Key: pm.original_store_code, ym.host_cycle_code, ym.store_cycle_code, ym.information_category_code Sort Method: external merge Disk: 13800kB Worker 0: Sort Method: quicksort Memory: 25kB -> Merge Join (cost=3008.64..3265.32 rows=4542 width=21) (actual time=29.380..351.880 rows=226995 loops=2) Merge Cond: (((ym.pattern_type)::text = (pm.pattern_type)::text) AND ((ym.pattern_code)::text = (pm.pattern_code)::text)) -> Sort (cost=2045.25..2090.67 rows=18168 width=21) (actual time=26.306..27.871 rows=15434 loops=2) Sort Key: ym.pattern_type, ym.pattern_code Sort Method: quicksort Memory: 3181kB Worker 0: Sort Method: quicksort Memory: 25kB -> Parallel Seq Scan on m_reading_number_by_pattern_1109_036 ym (cost=0.00..759.94 rows=18168 width=21) (actual time=0.018..5.019 rows=15443 loops=2) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '1109_036'::text)) -> Sort (cost=963.39..988.39 rows=10000 width=14) (actual time=6.142..27.592 rows=455991 loops=1) Sort Key: pm.pattern_type, pm.pattern_code Sort Method: quicksort Memory: 853kB -> Seq Scan on m_pattern_10010001 pm (cost=0.00..299.00 rows=10000 width=14) (actual time=0.034..2.930 rows=10000 loops=1) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '10010001'::text)) -> Sort (cost=26.52..27.32 rows=320 width=102) (actual time=0.041..0.042 rows=1 loops=1) Sort Key: ymk.original_store_code, ymk.host_cycle_code, ymk.store_cycle_code, ymk.information_category_code Sort Method: quicksort Memory: 25kB -> Seq Scan on m_reading_number_by_store ymk (cost=0.00..13.20 rows=320 width=102) (actual time=0.025..0.025 rows=1 loops=1) -> Index Only Scan using m_staff_by_information_order_pkey on m_staff_by_information_order jtm (cost=0.29..0.40 rows=1 width=29) (actual time=0.003..0.003 rows=1 loops=455) Index Cond: ((original_store_code = (pm.original_store_code)::text) AND (host_cycle_code = (ym.host_cycle_code)::text) AND (store_cycle_code = (ym.store_cycle_code)::text) AND (information_category_code = (ym.information_category_code)::text)) Heap Fetches: 0 -> Group (cost=4846.16..4958.88 rows=1 width=29) (actual time=6548.141..6548.193 rows=1 loops=1) Group Key: pm_1.original_store_code, ym_1.group_number, ym_1.host_cycle_code, ym_1.store_cycle_code, ym_1.information_category_code, jtm_1.setting_date -> Incremental Sort (cost=4846.16..4958.85 rows=2 width=29) (actual time=6548.139..6548.191 rows=1 loops=1) Sort Key: pm_1.original_store_code, ym_1.group_number, ym_1.host_cycle_code, ym_1.store_cycle_code, ym_1.information_category_code, jtm_1.setting_date Presorted Key: pm_1.original_store_code, ym_1.group_number Full-sort Groups: 1 Sort Method: quicksort Average Memory: 30kB Peak Memory: 30kB Pre-sorted Groups: 1 Sort Method: quicksort Average Memory: 32kB Peak Memory: 32kB -> Nested Loop Left Join (cost=4733.53..4958.76 rows=1 width=29) (actual time=6546.392..6547.850 rows=97 loops=1) Filter: ((((ymk_1.group_number)::text !~~ '0%'::text) AND ('2023-11-09'::date >= ymk_1.apply_start_date) AND ('2023-11-09'::date <= ymk_1.apply_end_date)) OR (ymk_1.information_category_code IS NULL)) -> Nested Loop (cost=4733.38..4958.55 rows=1 width=29) (actual time=6546.377..6547.735 rows=97 loops=1) -> Group (cost=4732.69..4897.53 rows=1 width=23) (actual time=6546.292..6546.369 rows=24 loops=1) Group Key: pm_2.original_store_code, ym_2.group_number, jtm_1.setting_date, jtm_1.host_cycle_code, jtm_1.store_cycle_code -> Gather Merge (cost=4732.69..4897.52 rows=1 width=23) (actual time=6546.290..6546.349 rows=64 loops=1) Workers Planned: 1 Workers Launched: 1 -> Incremental Sort (cost=3732.68..3897.40 rows=2 width=23) (actual time=3271.945..3271.949 rows=32 loops=2) Sort Key: pm_2.original_store_code, ym_2.group_number, jtm_1.setting_date, jtm_1.host_cycle_code, jtm_1.store_cycle_code Presorted Key: pm_2.original_store_code Full-sort Groups: 1 Sort Method: quicksort Average Memory: 30kB Peak Memory: 30kB Pre-sorted Groups: 1 Sort Method: quicksort Average Memory: 45kB Peak Memory: 45kB Worker 0: Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB -> Nested Loop (cost=3568.02..3897.31 rows=1 width=23) (actual time=3271.191..3271.688 rows=131 loops=2) -> Merge Left Join (cost=3567.74..3628.53 rows=522 width=21) (actual time=3270.971..3271.034 rows=228 loops=2) Merge Cond: (((pm_2.original_store_code)::text = (ymk_2.original_store_code)::text) AND ((ym_2.host_cycle_code)::text = (ymk_2.host_cycle_code)::text) AND ((ym_2.store_cycle_code)::text = (ymk_2.store_cycle_code)::text) AND ((ym_2.information_category_code)::text = (ymk_2.information_category_code)::text)) Filter: ((((ymk_2.group_number)::text !~~ '0%'::text) AND ('2023-11-09'::date >= ymk_2.apply_start_date) AND ('2023-11-09'::date <= ymk_2.apply_end_date)) OR (ymk_2.information_category_code IS NULL)) -> Sort (cost=3541.22..3552.58 rows=4542 width=21) (actual time=3270.944..3270.968 rows=228 loops=2) Sort Key: pm_2.original_store_code, ym_2.host_cycle_code, ym_2.store_cycle_code, ym_2.information_category_code Sort Method: external sort Disk: 15584kB Worker 0: Sort Method: quicksort Memory: 25kB -> Merge Join (cost=3008.64..3265.32 rows=4542 width=21) (actual time=118.146..277.069 rows=226995 loops=2) Merge Cond: (((ym_2.pattern_type)::text = (pm_2.pattern_type)::text) AND ((ym_2.pattern_code)::text = (pm_2.pattern_code)::text)) -> Sort (cost=2045.25..2090.67 rows=18168 width=21) (actual time=115.483..212.185 rows=15434 loops=2) Sort Key: ym_2.pattern_type, ym_2.pattern_code Sort Method: quicksort Memory: 3181kB Worker 0: Sort Method: quicksort Memory: 25kB -> Parallel Seq Scan on m_reading_number_by_pattern_1109_036 ym_2 (cost=0.00..759.94 rows=18168 width=21) (actual time=0.004..3.696 rows=15443 loops=2) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '1109_036'::text)) -> Sort (cost=963.39..988.39 rows=10000 width=14) (actual time=5.316..32.792 rows=455991 loops=1) Sort Key: pm_2.pattern_type, pm_2.pattern_code Sort Method: quicksort Memory: 853kB -> Seq Scan on m_pattern_10010001 pm_2 (cost=0.00..299.00 rows=10000 width=14) (actual time=0.027..2.123 rows=10000 loops=1) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '10010001'::text)) -> Sort (cost=26.52..27.32 rows=320 width=102) (actual time=0.045..0.046 rows=1 loops=1) Sort Key: ymk_2.original_store_code, ymk_2.host_cycle_code, ymk_2.store_cycle_code, ymk_2.information_category_code Sort Method: quicksort Memory: 25kB -> Seq Scan on m_reading_number_by_store ymk_2 (cost=0.00..13.20 rows=320 width=102) (actual time=0.024..0.025 rows=1 loops=1) -> Index Scan using m_staff_by_information_order_pkey on m_staff_by_information_order jtm_1 (cost=0.29..0.50 rows=1 width=24) (actual time=0.002..0.002 rows=1 loops=455) Index Cond: (((original_store_code)::text = (pm_2.original_store_code)::text) AND ((host_cycle_code)::text = (ym_2.host_cycle_code)::text) AND ((store_cycle_code)::text = (ym_2.store_cycle_code)::text) AND ((information_category_code)::text = (ym_2.information_category_code)::text)) -> Nested Loop (cost=0.70..60.96 rows=5 width=21) (actual time=0.034..0.056 rows=4 loops=24) -> Index Scan using m_pattern_10010001_pkey on m_pattern_10010001 pm_1 (cost=0.29..15.41 rows=5 width=14) (actual time=0.004..0.005 rows=5 loops=24) Index Cond: (((version)::text = '10010001'::text) AND ((original_store_code)::text = (pm_2.original_store_code)::text) AND (apply_start_date <= '2023-11-09'::date)) Filter: ('2023-11-09'::date <= apply_end_date) -> Index Scan using m_reading_number_by_pattern_1109_036_pkey on m_reading_number_by_pattern_1109_036 ym_1 (cost=0.41..9.10 rows=1 width=21) (actual time=0.009..0.010 rows=1 loops=116) Index Cond: (((pattern_type)::text = (pm_1.pattern_type)::text) AND ((pattern_code)::text = (pm_1.pattern_code)::text) AND (apply_start_date <= '2023-11-09'::date) AND ((host_cycle_code)::text = (jtm_1.host_cycle_code)::text) AND ((store_cycle_code)::text = (jtm_1.store_cycle_code)::text) AND ((group_number)::text = (ym_2.group_number)::text) AND ((version)::text = '1109_036'::text)) Filter: ('2023-11-09'::date <= apply_end_date) -> Index Scan using m_reading_number_by_store_pkey on m_reading_number_by_store ymk_1 (cost=0.15..0.19 rows=1 width=102) (actual time=0.001..0.001 rows=0 loops=97) Index Cond: (((original_store_code)::text = (pm_1.original_store_code)::text) AND ((host_cycle_code)::text = (ym_1.host_cycle_code)::text) AND ((store_cycle_code)::text = (ym_1.store_cycle_code)::text) AND ((information_category_code)::text = (ym_1.information_category_code)::text)) Planning Time: 6.857 ms Execution Time: 9981.633 ms SQL:-- explain(analyze,buffers,verbose) EXPLAIN ANALYZE WITH wk1 AS ( SELECT pm.original_store_code, ym.group_number, jtm.staff_code, ym.host_cycle_code, ym.store_cycle_code FROM m_pattern AS pm INNER JOIN m_reading_number_by_pattern AS ym ON pm.pattern_type = ym.pattern_type AND pm.pattern_code = ym.pattern_code AND ym.version = '1109_036' INNER JOIN m_staff_by_information_order AS jtm ON pm.original_store_code = jtm.original_store_code AND ym.host_cycle_code = jtm.host_cycle_code AND ym.store_cycle_code = jtm.store_cycle_code AND ym.information_category_code = jtm.information_category_code LEFT JOIN m_reading_number_by_store AS ymk ON pm.original_store_code = ymk.original_store_code AND ym.host_cycle_code = ymk.host_cycle_code AND ym.store_cycle_code = ymk.store_cycle_code AND ym.information_category_code = ymk.information_category_code WHERE pm.version = '10010001' AND (( ymk.group_number NOT LIKE '0%' AND '2023-11-09 03:00:00' BETWEEN ymk.apply_start_date AND ymk.apply_end_date ) OR ymk.information_category_code IS NULL ) AND '2023-11-09 03:00:00' BETWEEN pm.apply_start_date AND pm.apply_end_date AND '2023-11-09 03:00:00' BETWEEN ym.apply_start_date AND ym.apply_end_date GROUP BY pm.original_store_code, ym.group_number, jtm.staff_code, ym.host_cycle_code, ym.store_cycle_code ), WK2 AS ( SELECT pm.original_store_code, ym.group_number, jtm.setting_date, jtm.host_cycle_code, jtm.store_cycle_code FROM m_pattern AS pm INNER JOIN m_reading_number_by_pattern AS ym ON pm.pattern_type = ym.pattern_type AND pm.pattern_code = ym.pattern_code AND ym.version = '1109_036' INNER JOIN m_staff_by_information_order AS jtm ON pm.original_store_code = jtm.original_store_code AND ym.host_cycle_code = jtm.host_cycle_code AND ym.store_cycle_code = jtm.store_cycle_code AND ym.information_category_code = jtm.information_category_code LEFT JOIN m_reading_number_by_store AS ymk ON pm.original_store_code = ymk.original_store_code AND ym.host_cycle_code = ymk.host_cycle_code AND ym.store_cycle_code = ymk.store_cycle_code AND ym.information_category_code = ymk.information_category_code WHERE pm.version = '10010001' AND (( ymk.group_number NOT LIKE '0%' AND '2023-11-09 03:00:00' BETWEEN ymk.apply_start_date AND ymk.apply_end_date ) OR ymk.information_category_code IS NULL ) AND '2023-11-09 03:00:00' BETWEEN pm.apply_start_date AND pm.apply_end_date AND '2023-11-09 03:00:00' BETWEEN ym.apply_start_date AND ym.apply_end_date GROUP BY pm.original_store_code, ym.group_number, jtm.setting_date, jtm.host_cycle_code, jtm.store_cycle_code ), wk3 AS ( SELECT pm.original_store_code, ym.group_number, ym.host_cycle_code, ym.store_cycle_code, ym.information_category_code, wk2.setting_date FROM m_pattern AS pm INNER JOIN m_reading_number_by_pattern AS ym ON pm.pattern_type = ym.pattern_type AND pm.pattern_code = ym.pattern_code AND ym.version = '1109_036' INNER JOIN wk2 ON pm.original_store_code = wk2.original_store_code AND ym.group_number = wk2.group_number AND ym.host_cycle_code = wk2.host_cycle_code AND ym.store_cycle_code = wk2.store_cycle_code LEFT JOIN m_reading_number_by_store AS ymk ON pm.original_store_code = ymk.original_store_code AND ym.host_cycle_code = ymk.host_cycle_code AND ym.store_cycle_code = ymk.store_cycle_code AND ym.information_category_code = ymk.information_category_code WHERE pm.version = '10010001' AND (( ymk.group_number NOT LIKE '0%' AND '2023-11-09 03:00:00' BETWEEN ymk.apply_start_date AND ymk.apply_end_date ) OR ymk.information_category_code IS NULL ) AND '2023-11-09 03:00:00' BETWEEN pm.apply_start_date AND pm.apply_end_date AND '2023-11-09 03:00:00' BETWEEN ym.apply_start_date AND ym.apply_end_date GROUP BY pm.original_store_code, ym.group_number, ym.host_cycle_code, ym.store_cycle_code, ym.information_category_code, wk2.setting_date ) SELECT wk1.original_store_code, wk3.host_cycle_code, wk3.store_cycle_code, wk3.information_category_code, wk1.staff_code, wk3.setting_date FROM wk1 INNER JOIN wk3 ON wk1.original_store_code = wk3.original_store_code AND wk1.group_number = wk3.group_number AND wk1.host_cycle_code = wk3.host_cycle_code AND wk1.store_cycle_code = wk3.store_cycle_code limit 1 ;
08-19
-- explain(analyze,buffers,verbose) EXPLAIN ANALYZE WITH wk1 AS ( SELECT pm.original_store_code, ym.group_number, jtm.staff_code, ym.host_cycle_code, ym.store_cycle_code FROM m_pattern AS pm INNER JOIN m_reading_number_by_pattern AS ym ON pm.pattern_type = ym.pattern_type AND pm.pattern_code = ym.pattern_code AND ym.version = '1109_036' INNER JOIN m_staff_by_information_order AS jtm ON pm.original_store_code = jtm.original_store_code AND ym.host_cycle_code = jtm.host_cycle_code AND ym.store_cycle_code = jtm.store_cycle_code AND ym.information_category_code = jtm.information_category_code LEFT JOIN m_reading_number_by_store AS ymk ON pm.original_store_code = ymk.original_store_code AND ym.host_cycle_code = ymk.host_cycle_code AND ym.store_cycle_code = ymk.store_cycle_code AND ym.information_category_code = ymk.information_category_code WHERE pm.version = '10010001' AND (( ymk.group_number NOT LIKE '0%' AND '2023-11-09 03:00:00' BETWEEN ymk.apply_start_date AND ymk.apply_end_date ) OR ymk.information_category_code IS NULL ) AND '2023-11-09 03:00:00' BETWEEN pm.apply_start_date AND pm.apply_end_date AND '2023-11-09 03:00:00' BETWEEN ym.apply_start_date AND ym.apply_end_date GROUP BY pm.original_store_code, ym.group_number, jtm.staff_code, ym.host_cycle_code, ym.store_cycle_code ), WK2 AS ( SELECT pm.original_store_code, ym.group_number, jtm.setting_date, jtm.host_cycle_code, jtm.store_cycle_code FROM m_pattern AS pm INNER JOIN m_reading_number_by_pattern AS ym ON pm.pattern_type = ym.pattern_type AND pm.pattern_code = ym.pattern_code AND ym.version = '1109_036' INNER JOIN m_staff_by_information_order AS jtm ON pm.original_store_code = jtm.original_store_code AND ym.host_cycle_code = jtm.host_cycle_code AND ym.store_cycle_code = jtm.store_cycle_code AND ym.information_category_code = jtm.information_category_code LEFT JOIN m_reading_number_by_store AS ymk ON pm.original_store_code = ymk.original_store_code AND ym.host_cycle_code = ymk.host_cycle_code AND ym.store_cycle_code = ymk.store_cycle_code AND ym.information_category_code = ymk.information_category_code WHERE pm.version = '10010001' AND (( ymk.group_number NOT LIKE '0%' AND '2023-11-09 03:00:00' BETWEEN ymk.apply_start_date AND ymk.apply_end_date ) OR ymk.information_category_code IS NULL ) AND '2023-11-09 03:00:00' BETWEEN pm.apply_start_date AND pm.apply_end_date AND '2023-11-09 03:00:00' BETWEEN ym.apply_start_date AND ym.apply_end_date GROUP BY pm.original_store_code, ym.group_number, jtm.setting_date, jtm.host_cycle_code, jtm.store_cycle_code ), wk3 AS ( SELECT pm.original_store_code, ym.group_number, ym.host_cycle_code, ym.store_cycle_code, ym.information_category_code, wk2.setting_date FROM m_pattern AS pm INNER JOIN m_reading_number_by_pattern AS ym ON pm.pattern_type = ym.pattern_type AND pm.pattern_code = ym.pattern_code AND ym.version = '1109_036' INNER JOIN wk2 ON pm.original_store_code = wk2.original_store_code AND ym.group_number = wk2.group_number AND ym.host_cycle_code = wk2.host_cycle_code AND ym.store_cycle_code = wk2.store_cycle_code LEFT JOIN m_reading_number_by_store AS ymk ON pm.original_store_code = ymk.original_store_code AND ym.host_cycle_code = ymk.host_cycle_code AND ym.store_cycle_code = ymk.store_cycle_code AND ym.information_category_code = ymk.information_category_code WHERE pm.version = '10010001' AND (( ymk.group_number NOT LIKE '0%' AND '2023-11-09 03:00:00' BETWEEN ymk.apply_start_date AND ymk.apply_end_date ) OR ymk.information_category_code IS NULL ) AND '2023-11-09 03:00:00' BETWEEN pm.apply_start_date AND pm.apply_end_date AND '2023-11-09 03:00:00' BETWEEN ym.apply_start_date AND ym.apply_end_date GROUP BY pm.original_store_code, ym.group_number, ym.host_cycle_code, ym.store_cycle_code, ym.information_category_code, wk2.setting_date ) SELECT wk1.original_store_code, wk3.host_cycle_code, wk3.store_cycle_code, wk3.information_category_code, wk1.staff_code, wk3.setting_date FROM wk1 INNER JOIN wk3 ON wk1.original_store_code = wk3.original_store_code AND wk1.group_number = wk3.group_number AND wk1.host_cycle_code = wk3.host_cycle_code AND wk1.store_cycle_code = wk3.store_cycle_code ; 一直显示连接超时
08-19
新的执行计划看看哪慢:QUERY PLAN Merge Join (cost=13166.94..13167.03 rows=1 width=38) (actual time=7872.404..9014.874 rows=32768 loops=1) Merge Cond: (((pm.original_store_code)::text = (pm_1.original_store_code)::text) AND ((ym.group_number)::text = (ym_1.group_number)::text)) Join Filter: (((ym.host_cycle_code)::text = (ym_1.host_cycle_code)::text) AND ((ym.store_cycle_code)::text = (ym_1.store_cycle_code)::text)) Rows Removed by Join Filter: 122984 -> Group (cost=5669.68..5669.69 rows=1 width=29) (actual time=3346.894..3881.617 rows=5004 loops=1) Group Key: pm.original_store_code, ym.group_number, jtm.staff_code, ym.host_cycle_code, ym.store_cycle_code -> Sort (cost=5669.68..5669.68 rows=1 width=29) (actual time=3346.891..3877.207 rows=16777 loops=1) Sort Key: pm.original_store_code, ym.group_number, jtm.staff_code, ym.host_cycle_code, ym.store_cycle_code Sort Method: quicksort Memory: 2079kB -> Hash Left Join (cost=5354.20..5669.67 rows=1 width=29) (actual time=3018.486..3855.427 rows=16777 loops=1) Hash Cond: (((pm.original_store_code)::text = (ymk.original_store_code)::text) AND ((ym.host_cycle_code)::text = (ymk.host_cycle_code)::text) AND ((ym.store_cycle_code)::text = (ymk.store_cycle_code)::text) AND ((ym.information_category_code)::text = (ymk.information_category_code)::text)) Filter: ((((ymk.group_number)::text !~~ '0%'::text) AND ('2023-11-09'::date >= ymk.apply_start_date) AND ('2023-11-09'::date <= ymk.apply_end_date)) OR (ymk.information_category_code IS NULL)) -> Gather (cost=5334.60..5649.88 rows=1 width=34) (actual time=3018.434..3850.134 rows=16777 loops=1) Workers Planned: 1 Workers Launched: 1 -> Parallel Hash Join (cost=4334.60..4649.78 rows=1 width=34) (actual time=2858.224..3060.741 rows=8389 loops=2) Hash Cond: (((pm.pattern_type)::text = (ym.pattern_type)::text) AND ((pm.pattern_code)::text = (ym.pattern_code)::text) AND ((pm.original_store_code)::text = (jtm.original_store_code)::text)) -> Parallel Seq Scan on m_pattern_10010001 pm (cost=0.00..226.94 rows=5882 width=14) (actual time=0.030..1.780 rows=5000 loops=2) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '10010001'::text)) -> Parallel Hash (cost=4333.88..4333.88 rows=41 width=41) (actual time=2791.825..2791.827 rows=1182307 loops=2) Buckets: 65536 (originally 1024) Batches: 64 (originally 1) Memory Usage: 3840kB -> Merge Join (cost=3950.80..4333.88 rows=41 width=41) (actual time=64.775..519.901 rows=1182307 loops=2) Merge Cond: (((ym.host_cycle_code)::text = (jtm.host_cycle_code)::text) AND ((ym.store_cycle_code)::text = (jtm.store_cycle_code)::text) AND ((ym.information_category_code)::text = (jtm.information_category_code)::text)) -> Sort (cost=2045.25..2090.67 rows=18168 width=21) (actual time=26.968..28.654 rows=15434 loops=2) Sort Key: ym.host_cycle_code, ym.store_cycle_code, ym.information_category_code Sort Method: quicksort Memory: 3181kB Worker 0: Sort Method: quicksort Memory: 25kB -> Parallel Seq Scan on m_reading_number_by_pattern_1109_036 ym (cost=0.00..759.94 rows=18168 width=21) (actual time=0.007..8.958 rows=15443 loops=2) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '1109_036'::text)) -> Sort (cost=1905.55..1955.80 rows=20099 width=29) (actual time=75.601..212.554 rows=2365650 loops=1) Sort Key: jtm.host_cycle_code, jtm.store_cycle_code, jtm.information_category_code Sort Method: quicksort Memory: 2339kB -> Seq Scan on m_staff_by_information_order jtm (cost=0.00..468.99 rows=20099 width=29) (actual time=0.030..6.775 rows=20099 loops=1) -> Hash (cost=13.20..13.20 rows=320 width=102) (actual time=0.027..0.029 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on m_reading_number_by_store ymk (cost=0.00..13.20 rows=320 width=102) (actual time=0.021..0.022 rows=1 loops=1) -> Materialize (cost=7497.27..7497.30 rows=1 width=29) (actual time=4525.501..5101.215 rows=155752 loops=1) -> Group (cost=7497.27..7497.29 rows=1 width=29) (actual time=4525.490..5089.772 rows=17721 loops=1) Group Key: pm_1.original_store_code, ym_1.group_number, ym_1.host_cycle_code, ym_1.store_cycle_code, ym_1.information_category_code, wk2.setting_date -> Sort (cost=7497.27..7497.27 rows=1 width=29) (actual time=4525.487..5085.106 rows=18085 loops=1) Sort Key: pm_1.original_store_code, ym_1.group_number, ym_1.host_cycle_code, ym_1.store_cycle_code, ym_1.information_category_code, wk2.setting_date Sort Method: quicksort Memory: 2181kB -> Hash Left Join (cost=6163.32..7497.26 rows=1 width=29) (actual time=3309.082..5047.098 rows=18085 loops=1) Hash Cond: (((pm_1.original_store_code)::text = (ymk_1.original_store_code)::text) AND ((ym_1.host_cycle_code)::text = (ymk_1.host_cycle_code)::text) AND ((ym_1.store_cycle_code)::text = (ymk_1.store_cycle_code)::text) AND ((ym_1.information_category_code)::text = (ymk_1.information_category_code)::text)) Filter: ((((ymk_1.group_number)::text !~~ '0%'::text) AND ('2023-11-09'::date >= ymk_1.apply_start_date) AND ('2023-11-09'::date <= ymk_1.apply_end_date)) OR (ymk_1.information_category_code IS NULL)) -> Hash Join (cost=6143.72..7477.47 rows=1 width=29) (actual time=3309.025..5040.124 rows=18085 loops=1) Hash Cond: (((ym_1.pattern_type)::text = (pm_1.pattern_type)::text) AND ((ym_1.pattern_code)::text = (pm_1.pattern_code)::text) AND ((wk2.original_store_code)::text = (pm_1.original_store_code)::text)) -> Hash Join (cost=5669.72..6999.70 rows=1 width=36) (actual time=3303.924..4575.576 rows=2496348 loops=1) Hash Cond: (((ym_1.group_number)::text = (wk2.group_number)::text) AND ((ym_1.host_cycle_code)::text = (wk2.host_cycle_code)::text) AND ((ym_1.store_cycle_code)::text = (wk2.store_cycle_code)::text)) -> Seq Scan on m_reading_number_by_pattern_1109_036 ym_1 (cost=0.00..982.50 rows=30886 width=21) (actual time=0.015..12.938 rows=30886 loops=1) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '1109_036'::text)) -> Hash (cost=5669.70..5669.70 rows=1 width=23) (actual time=3303.897..3862.474 rows=4027 loops=1) Buckets: 4096 (originally 1024) Batches: 1 (originally 1) Memory Usage: 253kB -> Subquery Scan on wk2 (cost=5669.68..5669.70 rows=1 width=23) (actual time=3296.758..3861.178 rows=4027 loops=1) -> Group (cost=5669.68..5669.69 rows=1 width=23) (actual time=3296.756..3860.702 rows=4027 loops=1) Group Key: pm_2.original_store_code, ym_2.group_number, jtm_1.setting_date, jtm_1.host_cycle_code, jtm_1.store_cycle_code -> Sort (cost=5669.68..5669.68 rows=1 width=23) (actual time=3296.753..3856.222 rows=16777 loops=1) Sort Key: pm_2.original_store_code, ym_2.group_number, jtm_1.setting_date, jtm_1.host_cycle_code, jtm_1.store_cycle_code Sort Method: quicksort Memory: 2079kB -> Hash Left Join (cost=5354.20..5669.67 rows=1 width=23) (actual time=2914.666..3831.131 rows=16777 loops=1) Hash Cond: (((pm_2.original_store_code)::text = (ymk_2.original_store_code)::text) AND ((ym_2.host_cycle_code)::text = (ymk_2.host_cycle_code)::text) AND ((ym_2.store_cycle_code)::text = (ymk_2.store_cycle_code)::text) AND ((ym_2.information_category_code)::text = (ymk_2.information_category_code)::text)) Filter: ((((ymk_2.group_number)::text !~~ '0%'::text) AND ('2023-11-09'::date >= ymk_2.apply_start_date) AND ('2023-11-09'::date <= ymk_2.apply_end_date)) OR (ymk_2.information_category_code IS NULL)) -> Gather (cost=5334.60..5649.88 rows=1 width=33) (actual time=2914.613..3824.856 rows=16777 loops=1) Workers Planned: 1 Workers Launched: 1 -> Parallel Hash Join (cost=4334.60..4649.78 rows=1 width=33) (actual time=2839.055..3096.744 rows=8389 loops=2) Hash Cond: (((pm_2.pattern_type)::text = (ym_2.pattern_type)::text) AND ((pm_2.pattern_code)::text = (ym_2.pattern_code)::text) AND ((pm_2.original_store_code)::text = (jtm_1.original_store_code)::text)) -> Parallel Seq Scan on m_pattern_10010001 pm_2 (cost=0.00..226.94 rows=5882 width=14) (actual time=0.034..1.958 rows=5000 loops=2) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '10010001'::text)) -> Parallel Hash (cost=4333.88..4333.88 rows=41 width=40) (actual time=2749.035..2749.038 rows=1182307 loops=2) Buckets: 65536 (originally 1024) Batches: 64 (originally 1) Memory Usage: 3840kB -> Merge Join (cost=3950.80..4333.88 rows=41 width=40) (actual time=25.227..451.907 rows=1182307 loops=2) Merge Cond: (((ym_2.host_cycle_code)::text = (jtm_1.host_cycle_code)::text) AND ((ym_2.store_cycle_code)::text = (jtm_1.store_cycle_code)::text) AND ((ym_2.information_category_code)::text = (jtm_1.information_category_code)::text)) -> Sort (cost=2045.25..2090.67 rows=18168 width=21) (actual time=12.502..13.951 rows=15434 loops=2) Sort Key: ym_2.host_cycle_code, ym_2.store_cycle_code, ym_2.information_category_code Sort Method: quicksort Memory: 3181kB Worker 0: Sort Method: quicksort Memory: 25kB -> Parallel Seq Scan on m_reading_number_by_pattern_1109_036 ym_2 (cost=0.00..759.94 rows=18168 width=21) (actual time=0.004..3.979 rows=15443 loops=2) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '1109_036'::text)) -> Sort (cost=1905.55..1955.80 rows=20099 width=24) (actual time=25.441..171.873 rows=2365650 loops=1) Sort Key: jtm_1.host_cycle_code, jtm_1.store_cycle_code, jtm_1.information_category_code Sort Method: quicksort Memory: 2339kB -> Seq Scan on m_staff_by_information_order jtm_1 (cost=0.00..468.99 rows=20099 width=24) (actual time=0.020..3.138 rows=20099 loops=1) -> Hash (cost=13.20..13.20 rows=320 width=102) (actual time=0.035..0.037 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on m_reading_number_by_store ymk_2 (cost=0.00..13.20 rows=320 width=102) (actual time=0.021..0.022 rows=1 loops=1) -> Hash (cost=299.00..299.00 rows=10000 width=14) (actual time=4.468..4.469 rows=10000 loops=1) Buckets: 16384 Batches: 1 Memory Usage: 578kB -> Seq Scan on m_pattern_10010001 pm_1 (cost=0.00..299.00 rows=10000 width=14) (actual time=0.026..2.351 rows=10000 loops=1) Filter: (('2023-11-09'::date >= apply_start_date) AND ('2023-11-09'::date <= apply_end_date) AND ((version)::text = '10010001'::text)) -> Hash (cost=13.20..13.20 rows=320 width=102) (actual time=0.033..0.034 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on m_reading_number_by_store ymk_1 (cost=0.00..13.20 rows=320 width=102) (actual time=0.024..0.025 rows=1 loops=1) Planning Time: 8.215 ms Execution Time: 9019.218 ms
08-19
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值