Data analysis: reproducing pandas.skew

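I have recently been using pandas and numpy for data analysis while porting the algorithms to Go. Along the way I found that, for some statistics, pandas' default parameters differ from numpy's, so the same calculation can return different numbers in each library. These are my notes.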
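One well-known example of this kind of default mismatch is the `ddof` argument of the standard deviation. The sketch below is only an illustration with made-up values, and it may not be the exact case that prompted these notes: `np.std` divides by n by default, while `Series.std` divides by n - 1.

```python
import numpy as np
import pandas as pd

values = [1.0, 2.0, 3.0, 4.0]  # made-up illustration data

# numpy default: ddof=0, divide by n (population standard deviation)
np_std = np.std(values)

# pandas default: ddof=1, divide by n - 1 (sample standard deviation)
pd_std = pd.Series(values).std()

print(f"numpy default:  {np_std:.6f}")
print(f"pandas default: {pd_std:.6f}")

# Passing ddof explicitly makes the two libraries agree
np_std_sample = np.std(values, ddof=1)
pd_std_population = pd.Series(values).std(ddof=0)
print(np.isclose(np_std_sample, pd_std), np.isclose(pd_std_population, np_std))
```

When porting to another language, it is probably safest to write the divisor explicitly rather than rely on either library's default.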

pandas.skew implementation (rolling works the same way)

import pandas as pd

# Sample data
data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50])
skewness_pandas = data.skew()
print(f"Pandas Skewness: {skewness_pandas}")

Pandas Skewness: 3.0536609583638397
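To illustrate the remark that rolling behaves the same way, here is a small sketch that compares the last value of a rolling skew with `skew()` applied directly to the same slice. The window size of 5 is an arbitrary choice for the example and is not taken from the original post.

```python
import numpy as np
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50], dtype=float)
window = 5  # arbitrary window size, chosen only for this example

# Skewness over each trailing window of `window` points
rolling_skew = data.rolling(window).skew()

# Skewness of the last `window` points computed directly
last_window_skew = data.iloc[-window:].skew()

print(rolling_skew.iloc[-1], last_window_skew)
print(np.isclose(rolling_skew.iloc[-1], last_window_skew))
```

The first `window - 1` entries of `rolling_skew` are NaN because the window is not yet full.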

Implementing the formula directly (numpy)
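For reference, the quantity that the code in this section computes can be written as follows. The symbols m_2, m_3, g_1 and G_1 are notation introduced here for convenience and do not appear in the original post.

```latex
m_k = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k,
\qquad
g_1 = \frac{m_3}{m_2^{3/2}},
\qquad
G_1 = \frac{\sqrt{n(n-1)}}{n-2}\, g_1
```

Here m_2^{1/2} is the standard deviation with ddof=0, g_1 is the uncorrected skewness, and G_1 is the bias-corrected value that should match `data.skew()`. The factor is only defined for n >= 3.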

import numpy as np
import pandas as pd

def calculate_skew_manual(series: pd.Series) -> float:
    """
    Manually compute the skewness of a series, matching pandas' skew method.
    """
    data = series.dropna()  # drop missing values
    n = len(data)           # sample size

    mean = np.mean(data)                     # mean
    std = np.std(data, ddof=0)               # population standard deviation (ddof=0, divide by n)
    numerator = np.mean((data - mean) ** 3)  # third central moment
    denominator = std ** 3                   # second central moment raised to the power 3/2

    # Bias-correction factor (only defined for n >= 3)
    correction_factor = np.sqrt(n * (n - 1)) / (n - 2)
    skewness = correction_factor * (numerator / denominator)
    return skewness

# Sample data
data = pd.Seri
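The listing breaks off at this point in the source. As a hedged sketch of the check it was presumably leading to, the block below reruns the comparison on the sample data from the first example. The helper `skew_adjusted` is a hypothetical, self-contained restatement of the manual formula, and returning NaN for fewer than three points is an assumption made here, since the correction factor is undefined in that case.

```python
import numpy as np
import pandas as pd

def skew_adjusted(series: pd.Series) -> float:
    """Bias-corrected skewness: sqrt(n(n-1))/(n-2) * m3 / m2**1.5."""
    values = series.dropna().to_numpy(dtype=float)
    n = len(values)
    if n < 3:
        # Assumption: the (n - 2) divisor is undefined here, so return NaN
        return float("nan")
    centered = values - values.mean()
    m2 = np.mean(centered ** 2)  # second central moment (ddof=0 variance)
    m3 = np.mean(centered ** 3)  # third central moment
    return float(np.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5)

# Same sample data as the pandas example
data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50])

skewness_manual = skew_adjusted(data)
skewness_pandas = data.skew()

print(f"Manual Skewness: {skewness_manual}")
print(f"Pandas Skewness: {skewness_pandas}")
print(f"Match: {np.isclose(skewness_manual, skewness_pandas)}")
```

The two values should agree with the pandas result shown earlier (3.0536609583638397) up to floating-point rounding, which is why the comparison uses `np.isclose` rather than `==`.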