Let's start with the probability density function (PDF) of the great Gaussian distribution:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
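To make the formula concrete, here is a minimal sketch that evaluates it using only Python's standard library. The function name `gaussian_pdf` and the integration grid are illustrative choices, not something taken from a library.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Evaluate the Gaussian density f(x) for mean mu and standard deviation sigma."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# The curve peaks at x = mu, where its height is 1 / (sqrt(2*pi) * sigma).
peak = gaussian_pdf(0.0, mu=0.0, sigma=0.1)

# A density integrates to 1. A midpoint Riemann sum over
# [mu - 8*sigma, mu + 8*sigma] should therefore be very close to 1.
n_steps = 10_000
lo, hi = -0.8, 0.8
dx = (hi - lo) / n_steps
area = sum(gaussian_pdf(lo + (i + 0.5) * dx, 0.0, 0.1) for i in range(n_steps)) * dx

print(peak, area)
```

Note that the peak height is larger than 1 when sigma is small: a density value is not a probability, only the area under the curve is.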
In NumPy, samples from this distribution are drawn with:
```python
numpy.random.normal(loc=0.0, scale=1.0, size=None)
```
The parameters mean:
```
loc: float
    Mean of the distribution (the centre of the whole distribution).
scale: float
    Standard deviation of the distribution (its width: the larger scale is, the shorter and wider the curve; the smaller scale is, the taller and narrower).
size: int or tuple of ints
    Shape of the output. Defaults to None, in which case a single value is returned.
```
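As a quick illustration of how `size` controls the shape of the result, here is a small sketch. The variable names are only for illustration.

```python
import numpy as np

# size=None (the default): a single scalar value.
single = np.random.normal(loc=0.0, scale=1.0)

# size as an int: a 1-D array with that many samples.
vector = np.random.normal(loc=0.0, scale=1.0, size=5)

# size as a tuple: an array with exactly that shape.
matrix = np.random.normal(loc=0.0, scale=1.0, size=(2, 3))

print(np.ndim(single), vector.shape, matrix.shape)
```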
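More often we use np.random.randn(d0, d1, ...), which draws from the standard normal distribution (μ=0, σ=1). It corresponds to np.random.normal(loc=0, scale=1, size=(d0, d1, ...)). Note that randn takes the dimensions as separate arguments, whereas normal takes the shape through size as an int or a tuple.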
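The relationship between the two functions can be sketched as follows: a sample from N(μ, σ²) is a standard normal sample scaled by σ and shifted by μ. The sketch assumes the legacy `np.random.seed` interface, under which both calls consume the same underlying random stream when seeded identically; the values of `mu` and `sigma` are arbitrary.

```python
import numpy as np

mu, sigma = 0, 0.1

np.random.seed(0)
z = np.random.randn(2, 3)  # dimensions passed as separate arguments

np.random.seed(0)
x = np.random.normal(loc=mu, scale=sigma, size=(2, 3))  # shape passed as a tuple

# Scaling and shifting the standard normal draws reproduces the normal() draws.
same = np.allclose(x, mu + sigma * z)
print(z.shape, x.shape, same)
```

This is also why `sigma * np.random.randn(...) + mu` is a common idiom for sampling from a general Gaussian.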
Sampling
```python
import numpy as np

# Draw samples from a distribution identified by its mean and standard deviation
mu, sigma = 0, .1
s = np.random.normal(loc=mu, scale=sigma, size=1000)
```
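You can also use the corresponding API in SciPy, whose classes and functions map more closely onto the intuition of mathematical statistics: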
```python
import scipy.stats as st

mu, sigma = 0, .1
s = st.norm(mu, sigma).rvs(1000)
```
Check the sample mean and standard deviation against mu and sigma:
```python
>>> abs(mu - np.mean(s)) < .01
True
>>> abs(sigma - np.std(s, ddof=1)) < .01
True
# ddof stands for "delta degrees of freedom"; the divisor used is N - ddof.
# ddof=1 is the usual choice: it gives the unbiased estimate of the variance.
```
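To see what `ddof` changes, here is a small worked example on a hand-made array (the data values are arbitrary). The only difference between the two results is the divisor, N versus N - 1.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])
n = data.size

# Sum of squared deviations from the mean (2.5): 2.25 + 0.25 + 0.25 + 2.25 = 5.0
sq_dev = np.sum((data - data.mean()) ** 2)

std_population = np.std(data)      # ddof=0 (default): sqrt(sq_dev / n)
std_sample = np.std(data, ddof=1)  # ddof=1: sqrt(sq_dev / (n - 1))

print(std_population, std_sample)
```

For large samples such as the 1000 draws used here, the two values are nearly identical; the distinction matters mostly for small N.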
Fitting
Let's see how the convenient yet powerful syntax of matplotlib.pyplot lets us overlay the Gaussian density on a histogram of the samples:
```python
import matplotlib.pyplot as plt

# density=True (called normed in older Matplotlib) is the key to the comparison:
# it rescales the histogram so that its total area is 1, like a PDF.
# count then holds the density of each bin rather than the raw number of samples.
count, bins, _ = plt.hist(s, 30, density=True)
plt.plot(bins,
         1. / (np.sqrt(2 * np.pi) * sigma) * np.exp(-(bins - mu)**2 / (2 * sigma**2)),
         lw=2, c='r')
plt.show()
```
Alternatively:
```python
s_fit = np.linspace(s.min(), s.max())
plt.plot(s_fit, st.norm(mu, sigma).pdf(s_fit), lw=2, c='r')
```
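The curves above are drawn with the known `mu` and `sigma`. When the parameters are unknown, they have to be estimated from the samples first. The following is a minimal sketch using NumPy only, assuming the maximum-likelihood estimates for a Gaussian (the sample mean and the standard deviation with divisor N). The names `samples`, `mu_hat`, `sigma_hat`, and `fitted_pdf` are illustrative.

```python
import numpy as np

# Seeded generator so the run is repeatable; the "true" parameters are 0 and 0.1.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=0.1, size=1000)

# Maximum-likelihood estimates for a Gaussian.
mu_hat = samples.mean()
sigma_hat = samples.std(ddof=0)  # the MLE divides by N, not N - 1

# Evaluate the fitted density on a grid spanning the data.
x = np.linspace(samples.min(), samples.max(), 200)
fitted_pdf = (1.0 / (np.sqrt(2 * np.pi) * sigma_hat)
              * np.exp(-(x - mu_hat) ** 2 / (2 * sigma_hat ** 2)))

print(mu_hat, sigma_hat)
```

The arrays `x` and `fitted_pdf` can be passed to `plt.plot` in the same way as the curves above. `scipy.stats.norm.fit(samples)` should return the same pair of estimates as `(loc, scale)`.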