Understanding the Fisher Exact Test

License: CC 4.0 BY-SA

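This article introduces the basic idea of Fisher's exact test and uses worked examples to explain where the hypergeometric distribution applies and how it is computed, so that readers can follow the probability argument behind the test.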

http://en.wikipedia.org/wiki/Fisher's_exact_test

Hypergeometric distribution:

	drawn	not drawn	total
white	k	m-k	m
black	n-k	N+k-m-n	N-m
totals	n	N-n	N

Among N items, m are white. Let event A = {draw n items from the N without replacement and get exactly k white}. The probability of A is:

$P(X=k) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}$,

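Denominator: the number of ways to draw n items from N.

Numerator: the number of ways for the n drawn items to consist of k white and n-k black.

P(X=k1) + P(X=k2) + ... = 1, where k1, k2, ... run over all possible values of k.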
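
The formula and its sum-to-one property can be checked numerically. The following is a minimal Python sketch; the function name `hypergeom_pmf` and the values N=10, m=4, n=3 are illustrative assumptions, not part of the original text.

```python
from math import comb

def hypergeom_pmf(k: int, m: int, n: int, N: int) -> float:
    """P(X = k): exactly k white among n draws without replacement
    from N items of which m are white."""
    # Outside this range the event is impossible; guarding here also
    # keeps negative arguments away from math.comb.
    if k < max(0, n - (N - m)) or k > min(n, m):
        return 0.0
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

# Illustrative values (assumed): 10 items, 4 white, draw 3.
N, m, n = 10, 4, 3
probs = [hypergeom_pmf(k, m, n, N) for k in range(n + 1)]
total = sum(probs)  # equals 1 up to floating-point rounding
```

Using exact integer binomials from `math.comb` and dividing only once keeps rounding error small for tables of this size.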

Fisher Exact Test:

Among n people, a+b are dieting. Let event A = {draw a+c people from the n, exactly a of whom are dieting}. The probability of A is:

	men	women	total
dieting	a	b	a + b
not dieting	c	d	c + d
totals	a + c	b + d	n

$p = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}}$

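My own understanding: treat n as the population size and a+c as the sample size. Drawing that many people from the population can be done in choose(n,a+c) ways.

The sample contains a dieting people, which can happen in choose(a+b,a) ways, and c non-dieting people, which can happen in choose(c+d,c) ways.

Denominator: the number of ways to draw a+c people from n.

Numerator: the number of ways for the a+c drawn people to consist of a dieting and (a+c)-a = c not dieting.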
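
The counting argument above can be sketched in Python as follows. The function name `table_probability` and the counts a=1, b=9, c=11, d=3 are hypothetical, chosen only for illustration. The sketch also recomputes the same value with the hypergeometric notation (N, m, sample size, k) to show that the two formulas agree.

```python
from math import comb

def table_probability(a: int, b: int, c: int, d: int) -> float:
    """Probability of one 2x2 table when all row and column totals are fixed."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

# Hypothetical counts (assumed for illustration only).
a, b, c, d = 1, 9, 11, 3
p = table_probability(a, b, c, d)

# The same number in the hypergeometric notation:
# N = n, m = a + b (dieting), sample size = a + c (men), k = a.
N, m, size, k = a + b + c + d, a + b, a + c, a
p_hyper = comb(m, k) * comb(N - m, size - k) / comb(N, size)
```

Note that this is the probability of one particular table. The p-value reported by the test adds up such probabilities over all tables with the same margins that are at least as extreme as the observed one.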

Exact Tests

The Hypergeometric Distribution

To understand the Fisher exact test, it helps to first review the
hypergeometric distribution. Here is a typical example. A box of
chocolates contains 20 pieces (N=20). Eight of them are known to be
caramels (M=8), and the remaining 12 pieces are nuts (N-M=12). If a person
selects 4 pieces at random (sample size n=4), what is the distribution of
the number of caramels in the sample?

n=4  - sample size
M=8  - total number of caramels
N=20 - total number of chocolates

The number of caramels in the sample of 4 can range from 0 to 4, and the
probabilities of drawing 0, 1, 2, 3, or 4 caramels are:

No. in
Sample  Probability
   0    0.1022  = choose(8,0)*choose(12,4)/choose(20,4)
   1    0.3633  = choose(8,1)*choose(12,3)/choose(20,4)
   2    0.3814  = choose(8,2)*choose(12,2)/choose(20,4)
   3    0.1387  = choose(8,3)*choose(12,1)/choose(20,4)
   4    0.0144  = choose(8,4)*choose(12,0)/choose(20,4)
 Total  1.0000

The probabilities must sum to 1 as the total shows.
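
The table can be reproduced with a few lines of Python. This is a minimal sketch that uses `math.comb` in place of the `choose(...)` notation; the variable names mirror the example (N, M, n), while `rows` is a name introduced here for illustration.

```python
from math import comb

N, M, n = 20, 8, 4  # chocolates, caramels, sample size (from the example)

rows = []
for k in range(n + 1):
    # choose(M, k) * choose(N - M, n - k) / choose(N, n)
    p = comb(M, k) * comb(N - M, n - k) / comb(N, n)
    rows.append((k, round(p, 4)))

# Print in the same layout as the table in the text.
for k, p in rows:
    print(f"{k:>4}    {p:.4f}")
print(f" Total  {sum(p for _, p in rows):.4f}")
```

The five numerators are 495, 1760, 1848, 672, and 70, which add up to choose(20,4) = 4845, so the probabilities sum to exactly 1 before rounding.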