SQL SERVER中PERCENTILE_CONT和PERCENTILE_DISC
PERCENTILE_CONT和PERCENTILE_DISC
看下面一组SQL语句:
WITH test
as
(
select N'LeeWhoeeUniversity' as name,10 as score
UNION ALL
select N'LeeWhoeeUniversity',20
UNION ALL
select N'LeeWhoeeUniversity',30
UNION ALL
select N'LeeWhoeeUniversity',40
UNION ALL
select N'LeeWhoeeUniversity',50
UNION ALL
select N'DePaul',60
UNION ALL
select N'DePaul',70
UNION ALL
select N'DePaul',80
UNION ALL
select N'DePaul',90
UNION ALL
select N'DePaul',100
)
select name,score
,PERCENT_RANK() over(partition by name order by score) as per_rnk
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY score) over(partition by name) as percont0_5
,PERCENTILE_CONT(0.6) WITHIN GROUP (ORDER BY score) over(partition by name) as percont0_6
,PERCENTILE_CONT(0.7) WITHIN GROUP (ORDER BY score) over(partition by name) as percont0_7
,PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY score) over(partition by name) as percont0_75
,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY score) over(partition by name) as perdist0_5
,PERCENTILE_DISC(0.6) WITHIN GROUP (ORDER BY score) over(partition by name) as perdist0_6
,PERCENTILE_DISC(0.7) WITHIN GROUP (ORDER BY score) over(partition by name) as perdist0_7
,PERCENTILE_DISC(0.75) WITHIN GROUP (ORDER BY score) over(partition by name) as perdist0_75
from test运行结果:
name score per_rnk percont0_5 percont0_6 percont0_7 percont0_75 perdist0_5 perdist0_6 perdist0_7 perdist0_75
LeeWhoeeUniversity 10 0 30 34 38 40 30 30 40 40
LeeWhoeeUniversity 20 0.25 30 34 38 40 30 30 40 40
LeeWhoeeUniversity 30 0.5 30 34 38 40 30 30 40 40
LeeWhoeeUniversity 40 0.75 30 34 38 40 30 30 40 40
LeeWhoeeUniversity 50 1 30 34 38 40 30 30 40 40
DePaul 60 0 80 84 88 90 80 80 90 90
DePaul 70 0.25 80 84 88 90 80 80 90 90
DePaul 80 0.5 80 84 88 90 80 80 90 90
DePaul 90 0.75 80 84 88 90 80 80 90 90
DePaul 100 1 80 84 88 90 80 80 90 90
简单理解,PERCENT_RANK前面介绍过,求出score百分比后,PERCENTILE_CONT和PERCENTILE_DISC就是根据百分比求出对应的score。但不同的是,某一百分比没有对应的score时,PERCENTILE_CONT会根据百分比的偏差计算出一个新值,此值可能并不存在于score中。PERCENTILE_DISC得出的是偏向于最近百分比对应的score值,因此此值肯定存在于score中。
如PERCENTILE_CONT(0.6)对应的值为34,偏差值我猜测是这样计算出的:百分比75%和50%PERCENTILE_CONT计算出的值分别是40和30.那么等式:(40-30)/(0.75-0.5)=新偏差值/0.6-0.5。PERCENTILE_CONT(0.6)在50%上的偏差值=4。所以,PERCENTILE_CONT(0.6)对应PERCENTILE_CONT(0.5)+4=34。
再看PERCENTILE_DISC(0.6),直接将最近的0.5计算出的,PERCENTILE_DISC值取过来了。
本文通过示例讲解了SQL Server中的PERCENTILE_CONT和PERCENTILE_DISC函数的使用方法。这两种函数用于计算数据集中的连续百分位数和离散百分位数,区别在于连续百分位数会计算出可能不在原始数据集中的值,而离散百分位数则选取最接近所求百分位的实际数据值。
6595

被折叠的 条评论
为什么被折叠?



