A TopCoder Weekly Match Problem: Probability Calculation (Solution Report)

License: CC 4.0 BY-SA


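TopCoder's weekly match started at midnight tonight. There were three problems; I solved two of them, and my rating rose from a little over 700 to 955. Heh, onward and upward. I found the second problem quite interesting: it took very little time, and it passed the tests on the first try ^_^
The problem statement follows. It is in English and should not be hard to read. Key points are marked in red, subheadings are in bold, and the copyright notice at the end is in italics.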

》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》

Problem Statement

A redundant storage system can survive a certain number of hard drive failures without losing any data. You are doing some analysis to determine the risk of various numbers of drives failing during one week. Your task is complicated by the fact that the drives in this system have different failure rates. You will be given a vector <double> containing n elements that describe the probability of each drive failing during a week. Return a vector <double> containing n + 1 elements, where element i is the probability that exactly i drives will fail during a week. Assume that drive failures are independent events.

Definition

Class: DriveFailures
Method: failureChances
Parameters: vector <double>
Returns: vector <double>
Method signature: vector <double> failureChances(vector <double> failureProb)
(be sure your method is public)

Notes

The returned value must be accurate to within a relative or absolute value of 1E-9.
If events with probabilities p1 and p2 are independent, then the probability of both occurring is p1 * p2.

Constraints

failureProb will contain between 1 and 15 elements, inclusive.
Each element of failureProb will be between 0.0 and 1.0, inclusive.

Examples
0)

{1.0, 0.25, 0.0}
Returns: {0.0, 0.75, 0.25, 0.0 }
The first drive is guaranteed to fail, the second has a 25% chance of failing, and the third is guaranteed not to fail. So there is a 25% chance of two failures and a 75% chance of only one failure.
1)

{0.4, 0.7}
Returns: {0.18000000000000002, 0.54, 0.27999999999999997 }
There is a probability of 0.4 x 0.7 = 0.28 that both drives will fail. The chance that only the first will fail is 0.12 and that only the second will fail is 0.42, for a total probability of 0.54 that exactly one drive will fail. This leaves a probability of 0.18 that no drives will fail.
2)

{0.2, 0.3, 0.0, 1.0, 0.8, 0.9}
Returns:
{0.0, 0.011199999999999993, 0.15319999999999995, 0.5031999999999999, 0.2892,
0.0432, 0.0 }

This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2003, TopCoder, Inc. All rights reserved.

》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》

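Problem summary:
Suppose there are n drives, and each drive $D_i$ ($0 \le i \le n - 1$) fails with probability $P_i$ ($0 \le P_i \le 1$). Failures are independent, so the probability that $D_i$ and $D_j$ both fail is $P_i P_j$. What is the probability that exactly k drives fail? Compute it for all n + 1 cases k = 0, 1, ..., n.
Analysis:
For each k, there are $C(n, k)$ ways to choose which k drives fail. A natural approach is therefore to generate, for each k, every set of k drives. For example, with n = 3 and k = 2 the sets are $\{D_0, D_1\}$, $\{D_0, D_2\}$, $\{D_1, D_2\}$, and there are $C(n, k) = C(3, 2) = 3$ of them. The probability that exactly two drives fail is then $P(2) = (1 - P_2) P_1 P_0 + P_2 (1 - P_1) P_0 + P_2 P_1 (1 - P_0)$: a drive that does not appear in the set contributes the complement of its probability, $1 - P_i$. Doing this for every k from 0 up to n solves the problem.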
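But this approach requires generating combinations, that is, finding every set of k elements, which is fiddly. After the match I looked at two other contestants' code; both implemented it with recursion, which struck me as inefficient. That said, the match does not reward algorithmic efficiency, it rewards solving the problem in the shortest time. In other words: black cat or white cat, the one that catches the mouse is a good cat. I rejected this approach very quickly while solving. For one thing, although I had read about combination-generation algorithms, I had never found the time to implement one, and cramming during a match is a bad idea. For another, something like intuition told me it could not possibly need to be this complicated.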
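Looking at those sets, it is easy to see that each drive can be marked 1 if it fails and 0 if it does not, so every outcome corresponds to a unique n-bit binary number, and the number of 1 bits is the number of failed drives. For example, $(1 - P_2) P_1 P_0$ corresponds to 011, which is 3 in decimal, with 2 failed drives. How many outcomes are there (that is, how many such binary numbers are needed)? Each drive has two choices, so by the multiplication principle n drives give $2^n$ outcomes. We can therefore loop from 0 to $2^n - 1$. For each number, simple bit operations decide whether drive i contributes $P_i$ or $1 - P_i$ to the running product, and the count k of 1 bits decides which $P(k)$ the product is added to. That leads to the code below; the problem requires the function to be written as a class method.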

#include <vector>

using namespace std;

class DriveFailures
{
public:
    vector <double> failureChances (vector <double> failureProb);
};

vector <double> DriveFailures::failureChances (vector <double> failureProb)
{
    int i, j, k, m, n, len = failureProb.size (); // len is the number of drives
    double p;
    vector <double> successProb, ret;

    for (i = 0; i < len; i ++)
    {
        successProb.push_back (1.0 - failureProb [i]); // probability of not failing, stored up front to avoid recomputing it
        ret.push_back (0); // initialize each P (k) to 0 so it can be accumulated
    }
    ret.push_back (0); // note: the returned vector has len + 1 elements

    n = 1; // total number of outcomes, n = 2 ^ len
    for (i = 0; i < len; i ++)
    {
        n *= 2;
    }

    for (i = 0; i < n; i ++)
    {
        j = i; // i gets shifted right below; work on a copy j so the outer loop stays correct
        k = 0; // k is the number of failed drives
        p = 1.0; // p is the probability of this outcome
        for (m = 0; m < len; m ++)
        {
            if (j & 1) // low bit is 1: drive m fails; otherwise it does not
            {
                k ++; // count the failed drive
                p *= failureProb [m];
            } else {
                p *= successProb [m];
            }
            j >>= 1; // shift the next drive's bit into the low position for the next iteration
        }
        ret [k] += p; // add p to P (k) according to the number of failed drives
    }

    return ret; // return the result
}

It's that simple: there is no need to compute each k from 0 to n separately, heh.
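To try a solution against the examples locally, a comparison with a tolerance is needed, since outputs such as 0.18000000000000002 will not compare equal to 0.18. The sketch below is one plausible reading of the note "accurate to within a relative or absolute value of 1E-9"; TopCoder's actual checker is not shown in the statement, and the helper names `withinTolerance` and `allWithinTolerance` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when `actual` matches `expected` to within eps,
// measured either as absolute error or as error relative to `expected`.
bool withinTolerance(double expected, double actual, double eps = 1e-9) {
    const double diff = std::fabs(expected - actual);
    if (diff <= eps) return true;               // absolute error is small enough
    return diff <= eps * std::fabs(expected);   // otherwise fall back to relative error
}

// Hypothetical helper: element-wise comparison of two result vectors.
bool allWithinTolerance(const std::vector<double>& expected,
                        const std::vector<double>& actual,
                        double eps = 1e-9) {
    if (expected.size() != actual.size()) return false;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        if (!withinTolerance(expected[i], actual[i], eps)) return false;
    }
    return true;
}
```

A typical use would be to pass the expected vector from an example and the vector returned by `failureChances` to `allWithinTolerance`.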
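This report may have turned out overly detailed, but as long as it is easy to follow, all the typing was worth it (and if you can't follow it, all the happier for me). One addition: the problem can also be solved with a generating function, by expanding $((1 - P_0) + P_0 x) ((1 - P_1) + P_1 x) \cdots ((1 - P_{n-1}) + P_{n-1} x)$; the coefficient of $x^k$ is $P(k)$. This is a topic from combinatorics; for a related article, see "Integer Partition (Making Change)".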

》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》
Appended 2007-07-24
I came across someone else's code that is very concise, so I am posting it here.

// __builtin_popcount is GCC's built-in function for counting the 1 bits in an integer

vector <double> failureChances(vector <double> p)
{
    int n = p.size();
    vector<double> ret(n + 1, 0.0);
    for(int i = 0; i < (1 << n); i++)
    {
        double pp = 1.0;
        for(int j = 0; j < n; j++)
            if(i & (1 << j))
                pp *= p[j];
            else