Essentials of Python's statistics Module

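This post shows how to do basic statistical calculations in Python with the standard-library statistics module, covering the mean, median, mode, variance, and standard deviation.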

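I am not fully confident about some of the statistics terminology, so wherever I was unsure I have kept the original English term. If you spot a mistranslation, corrections are welcome. Thanks!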
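1. mean()
Computes the arithmetic mean. The sessions in this post were recorded on Python 3.5 (see the paths in the tracebacks); newer versions may display some results differently, for example 5 rather than 5.0 for integer input.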
>>> import statistics
>>> statistics.mean([1, 2, 3, 4, 5, 6, 7, 8, 9])
5.0
>>> statistics.mean(range(1,10))
5.0
>>> import fractions
>>> x = [(3, 7), (1, 21), (5, 3), (1, 3)]
>>> y = [fractions.Fraction(*item) for item in x]
>>> y
[Fraction(3, 7), Fraction(1, 21), Fraction(5, 3), Fraction(1, 3)]
>>> statistics.mean(y)
Fraction(13, 21)
>>> import decimal
>>> x = ('0.5', '0.75', '0.625', '0.375')
>>> y = map(decimal.Decimal, x)
>>> y
<map object at 0x00000000033465C0>
>>> list(y)
[Decimal('0.5'), Decimal('0.75'), Decimal('0.625'), Decimal('0.375')]
>>> statistics.mean(y)
Traceback (most recent call last):
  File "<pyshell#411>", line 1, in <module>
    statistics.mean(y)
  File "C:\Python 3.5\lib\statistics.py", line 292, in mean
    raise StatisticsError('mean requires at least one data point')
statistics.StatisticsError: mean requires at least one data point
>>> list(y)
[]
>>> y = map(decimal.Decimal, x)
>>> statistics.mean(y)
Decimal('0.5625')
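2. median(), median_low(), median_high(), median_grouped()
Several kinds of median. For an even number of data points, median() averages the two middle values, while median_low() and median_high() return the lower and the higher of the two. median_grouped() treats the data as grouped into classes of width interval and interpolates within the median class.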
>>> statistics.median([1, 3, 5, 7])
4.0
>>> statistics.median_low([1, 3, 5, 7])
3
>>> statistics.median_high([1, 3, 5, 7])
5
>>> statistics.median([1, 3, 7])
3
>>> statistics.median([5, 3, 7])
5
>>> statistics.median(range(1,10))
5
>>> statistics.median_low([5, 3, 7])
5
>>> statistics.median_high([5, 3, 7])
5
>>> statistics.median_grouped([5, 3, 7])
5.0
>>> statistics.median_grouped([5, 3, 7, 1])
4.5
>>> statistics.median_grouped([52, 52, 53, 54])
52.5
>>> statistics.median_low([52, 52, 53, 54])
52
>>> statistics.median_high([52, 52, 53, 54])
53
>>> statistics.median_high([1, 3, 3, 5, 7])
3
>>> statistics.median_low([1, 3, 3, 5, 7])
3
>>> statistics.median_grouped([1, 3, 3, 5, 7])
3.25
>>> statistics.median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5])
3.7
>>> statistics.median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5], interval=2)
3.4
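The median_grouped() results above can be reproduced by hand with the usual grouped-median interpolation, L + interval * (n/2 - cf) / f, where L is the lower limit of the median class, cf is the number of observations below that class, and f is the number inside it. The following is only a sketch under that assumption, not the library's actual source; grouped_median_sketch is a hypothetical helper name, and it expects non-empty numeric data.

```python
import math
import statistics


def grouped_median_sketch(data, interval=1):
    """Hypothetical re-implementation of the grouped-median interpolation."""
    data = sorted(data)
    n = len(data)
    x = data[n // 2]                    # midpoint of the class containing the median
    lower = x - interval / 2            # lower limit L of that class
    cf = sum(1 for v in data if v < x)  # observations below the median class
    f = data.count(x)                   # observations inside the median class
    return lower + interval * (n / 2 - cf) / f


samples = [
    ([1, 3, 3, 5, 7], 1),
    ([52, 52, 53, 54], 1),
    ([1, 2, 2, 3, 4, 4, 4, 4, 4, 5], 1),
    ([1, 2, 2, 3, 4, 4, 4, 4, 4, 5], 2),
]
for values, width in samples:
    by_hand = grouped_median_sketch(values, width)
    library = statistics.median_grouped(values, interval=width)
    print(values, width, by_hand, library)
```

For [1, 3, 3, 5, 7] the median class is centered on 3, so L = 2.5, cf = 1 and f = 2, giving 2.5 + (2.5 - 1) / 2 = 3.25, which matches the session.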
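3. mode()
Returns the most common value in the data, that is, the value that occurs most often. The tracebacks below show the Python 3.5 behavior when several values tie; since Python 3.8, mode() returns the first such value encountered instead of raising StatisticsError.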
>>> statistics.mode([1, 3, 5, 7])
Traceback (most recent call last):
  File "<pyshell#435>", line 1, in <module>
    statistics.mode([1, 3, 5, 7])
  File "C:\Python 3.5\lib\statistics.py", line 434, in mode
    'no unique mode; found %d equally common values' % len(table)
statistics.StatisticsError: no unique mode; found 4 equally common values
>>> statistics.mode([1, 3, 5, 7, 3])
3
>>> statistics.mode([1, 3, 5, 7, 3, 5])
Traceback (most recent call last):
  File "<pyshell#437>", line 1, in <module>
    statistics.mode([1, 3, 5, 7, 3, 5])
  File "C:\Python 3.5\lib\statistics.py", line 434, in mode
    'no unique mode; found %d equally common values' % len(table)
statistics.StatisticsError: no unique mode; found 2 equally common values
>>> statistics.mode([1, 3, 5, 7, 3, 5, 5])
5
>>> statistics.mode(["red", "blue", "blue", "red", "green", "red", "red"])
'red'
>>> statistics.mode(list(range(5)) + [3])
3
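When ties are expected, one option is statistics.multimode(), which returns every most-common value in the order first encountered. This is a small sketch that assumes Python 3.8 or later, where multimode() was added; the variable names are illustrative.

```python
import statistics

tied = [1, 3, 5, 7, 3, 5]       # 3 and 5 both occur twice
all_distinct = [1, 3, 5, 7]     # every value occurs once

tied_modes = statistics.multimode(tied)
distinct_modes = statistics.multimode(all_distinct)
empty_modes = statistics.multimode([])  # no data gives an empty list, not an error

print(tied_modes, distinct_modes, empty_modes)
```

Because the result is always a list, the caller can decide how to treat ties (take the first, report all, or reject the data) without catching an exception.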
4. pstdev()
Returns the population standard deviation, i.e. the square root of the population variance.
>>> statistics.pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
0.986893273527251
>>> statistics.pstdev(range(20))
5.766281297335398
>>> statistics.pstdev([1, 2, 3, 4, 5, 10, 9, 8, 7, 6])
2.8722813232690143
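To make the "square root of the population variance" relationship concrete, here is a brief check using one of the data sets above. It is only an illustration; math.isclose is used because the two floating-point results need not be bit-identical.

```python
import math
import statistics

data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]

sigma = statistics.pstdev(data)
sigma_from_variance = math.sqrt(statistics.pvariance(data))

print(sigma, sigma_from_variance)
print(math.isclose(sigma, sigma_from_variance))
```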
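5. pvariance()
Returns the population variance, also known as the second moment about the mean. If the mean has already been computed, it can be passed as the optional second argument mu to avoid recalculating it.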
>>> statistics.pvariance([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
0.9739583333333334
>>> statistics.pvariance([1, 2, 3, 4, 5, 10, 9, 8, 7, 6])
8.25
>>> x = [1, 2, 3, 4, 5, 10, 9, 8, 7, 6]
>>> mu = statistics.mean(x)
>>> mu
5.5
>>> statistics.pvariance([1, 2, 3, 4, 5, 10, 9, 8, 7, 6], mu)
8.25
>>> statistics.pvariance(range(20))
33.25
>>> import random
>>> statistics.pvariance((random.randint(1,10000) for i in range(30)))
10903549.933333334
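The population variance is the mean of the squared deviations from the mean. The sketch below computes it directly from that definition and compares the result with pvariance(), reusing the precomputed mean via the mu argument as in the session above. The name by_hand is illustrative.

```python
import math
import statistics

data = [1, 2, 3, 4, 5, 10, 9, 8, 7, 6]
mu = statistics.mean(data)  # 5.5

# Definition: average of the squared deviations from the mean.
by_hand = sum((x - mu) ** 2 for x in data) / len(data)

# Passing mu skips the library's own mean calculation; it must be the true mean of data.
library = statistics.pvariance(data, mu)

print(by_hand, library)
```

Passing a value for mu that is not the actual mean of the data gives a meaningless result, so it is safest to pass only a value obtained from mean() on the same data.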
6. variance(), stdev()
Compute the sample variance and the sample standard deviation (the square root of the sample variance). Unlike pvariance() and pstdev(), these divide by n - 1 rather than n.
>>> statistics.variance((random.randint(1,10000) for i in range(30)))
10229013.655172413
>>> statistics.stdev((random.randint(1,10000) for i in range(30)))
3106.2902337180203
>>> _ * _  # Note: the two calls above use different samples, because each draws fresh random numbers
9649039.016091954
>>> statistics.variance(range(20))
35.0
>>> statistics.stdev(range(20))
5.916079783099616
>>> _ * _
35.0
>>> statistics.variance([1, 2, 3, 4, 5, 10, 9, 8, 7, 6])
9.166666666666666
>>> statistics.stdev([1, 2, 3, 4, 5, 10, 9, 8, 7, 6])
3.0276503540974917
>>> statistics.variance([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
1.16875
>>> statistics.stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
1.0810874155219827
>>> _ * _
1.1687500000000002
>>> statistics.variance([3, 3, 3, 3, 3, 3])
0.0
>>> statistics.stdev([3, 3, 3, 3, 3, 3])
0.0
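Two points from this section can be checked in a few lines: the sample variance equals the population variance scaled by n / (n - 1), and stdev() squared matches variance() only when both are computed on the same sample. This is a sketch; the seed value 0 and the names rng and sample are arbitrary choices made for reproducibility.

```python
import math
import random
import statistics

data = [1, 2, 3, 4, 5, 10, 9, 8, 7, 6]
n = len(data)

sample_var = statistics.variance(data)       # divides by n - 1
population_var = statistics.pvariance(data)  # divides by n
scaled = population_var * n / (n - 1)

# Store the random sample in a list so both calls see the same numbers.
rng = random.Random(0)
sample = [rng.randint(1, 10000) for _ in range(30)]
squared_stdev = statistics.stdev(sample) ** 2
sample_variance = statistics.variance(sample)

print(sample_var, scaled)
print(math.isclose(squared_stdev, sample_variance))
```

Passing a generator expression to each call, as the random-number lines in the session do, draws a new sample every time, which is why those two results do not agree.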


Original post: http://user.qzone.qq.com/306467355/blog/1446598412

More content is available on the author's (dongfuguo) QQ Zone.
