Pollard‘s rho大数分解算法

原创已于 2023-10-24 19:37:31 修改 · 4.2k 阅读

13 ·

CC 4.0 BY-SA版权

文章标签：

#1024程序员节

于 2021-04-26 14:40:10 首次发布

本文解析了Pollard's rho算法如何通过构造递推数列和模n的周期性来寻找素因子。通过将问题转化为生日问题并利用随机性和期望值，算法巧妙地检测合数因子。实践案例和代码展示了如何在HDU-1488 Flavius Josephus Reloaded问题中应用该算法。

HDU - 1488 Flavius Josephus Reloaded

一，问题

给定一个很大的整数n，求出n的一个素因子

PS：如果求不出非平凡因子，也能说明n是素数

二，Pollard's rho算法思路

1，构造递推数列

构造一个一阶至少二次的递推式，如 $a_{k+1}=a_k^2+c$

2，生成mod n的递推数列

随便取初始值，根据 $a_{k+1}\equiv a_k^2+c(mod\,n)$ 得到一个数列，那么数列的每一项都在[0,n-1]的范围内

显然，数列一定会陷入循环。

如n=30，数列{0,1,2,5,26,17,20,11,2...}

这个数列mod 5是{0,1,2,0...}

这个数列模3是{0,1,2,2...}

3，近似生日问题

生成数列是比较随机的，所以我们可以把它看做生日问题

根据生日问题_csuzhucong的博客-优快云博客中的计算，第一个重复出现的数的下标的期望，不超过 $(1+e^{-\frac{1}{2}})\sqrt k+2$ ，其中k是可以取的数的数目

下文用2sqrt(k)表示（实际上应该是2.7sqrt(k)），不用太关注常数

4，Pollard's rho算法思路

如果n有非平凡因子p，那么大概率数列mod p出现周期会比mod n出现周期早，因为前者是2sqrt(p)，后者是2sqrt(n)

按照判断链表是否有环的方法，我们依次检测 $a_i, a_{2i}$ 是否相等，即是否出现周期

因为不知道p的值，所以我们用 $gcd(a_i-a_{2i},\, n)>1$ 来判断进入了mod p的循环，这样，我们就找到了因子p

如果对于找到的第一个满足 $gcd(a_i-a_{2i},\, n)>1$ 的i，有 $a_i=a_{2i}$ ，则n有可能是素数。

如果确定n是合数，可以多尝试一些c的值，总能找到p的，

如果不确定n是不是素数，多尝试一些c的值之后还是找不到因子，那n可能就是素数。

5，时间复杂度

平均情况下的时间复杂度是sqrt(p)*tc*tn，

其中tc表示尝试不同的c的次数，tc的期望应该非常小，感觉小于2，

其中tn表示单次调用gcd函数的运行时间，基本上可以理解为log n

所以，Pollard's rho算法的平均时间复杂度是O( $n^{\frac{1}{4}} * log n$ )

6，实现

static long long gcd(long long a, long long b)
{
	if (b == 0)return a;
	return gcd(b, a % b);
}
int findDiv(long long n) //n*2不超过long long max
{
	int n4 = pow(n, 1.0 / 4) * 10;
	vector<int>vc{ 1,13,59,97,311,499,997,1201,2003,3533,7177,12799,19997,25111,47777,49999,1000000007 };
	vector<long long>v(n4);
	for (auto c : vc) {
		v[0] = rand()%n;
		for (int i = 1; i < v.size(); i++) {
			v[i] = (Multi::multiAdd(v[i - 1], 2, n) + c) % n;
			long long dif = abs(v[i] - v[i / 2]);
			if (dif && gcd(dif, n) > 1)
				return dif;
		}
		continue;
	}
	return 1;
}

理论上n4 = pow(n, 1.0 / 4) * 2.7就够了，但是这只是按照均值算，n4越大准确率越高。

7，测试

int main()
{
	for (long long pi = 10;; pi*=10) {
		for (long long n = pi;; n++) {
			if (findDiv(n) == 1) {
				cout << n << endl;
				break;
			}
		}
		if (pi == 1000000000000000000)break;
	}
	return 0;
}

从结果来看，准确率还不是特别高，素数肯定不会被当做合数，但是合数有可能被当成素数。

我觉得难点在于哈希部分，或许我这里用的哈希函数太简单了，导致有某种类似于“模素数的数据群集”现象。

三，OJ实战

HDU - 1488 Flavius Josephus Reloaded

Flavius Josephus once was trapped in a cave together with his comrade soldiers surrounded by Romans. All of Josephus' fellow soldiers preferred not to surrender but to commit suicide. So they all formed a circle and agreed on a number k. Every k-th person in the circle would then commit suicide. However, Josephus had different priorities and didn't want to die just yet. According to the legend he managed to find the safe spot in the circle where he would be the last one to commit suicide. He surrendered to the Romans and became a citizen of Rome a few years later.

It is a lesser known fact that the souls of Josephus and his comrades were all born again in modern times. Obviously Josephus and his reborn fellow soldiers wanted to avoid a similar fiasco in the future. Thus they asked a consulting company to work out a better decision scheme. The company came up with the following scheme:

1.For the sake of tradition all soldiers should stand in a circle. This way a number between 0 and N-1 is assigned to each soldier, where N is the number of soldiers.
2.As changing numbers in the old scheme turned out to be horribly inefficient, the number assigned to a soldier will not change throughout the game.
3.The consulting company will provide two numbers a and b which will be used to calculate the number of the next soldier as follows: Let x be the number of the current soldier, then the number of the next soldier is the remainder of a*x^2 + b mod N.
4.We start with the soldier with number 0 and each soldier calculates the number of the next soldier according to the formula above.
5.As everyone deserves a second chance a soldier will commit suicide once his number is calculated for the second time.
6.In the event that the number of a soldier is calculated for the third time the game will end and all remaining soldiers will surrender.

You are to write a program that given the number of soldiers N and the constants a and b determines the number of survivors.

Input

The input file consists of several test cases. Each test case consists of a single line containing the three integers N (2 ≤ N ≤ 10^9), a and b (0 ≤ a,b < N) separated by white space. You may safely assume that the first soldier dies after no more than one million (10^6) steps. The input is terminated by a single number 0 which should not be processed.

Output

For each test case output a single line containing the number of soldiers that survive.

Sample Input

2 1 1
5 1 1
10 3 7
101 9 2
698253463 1 181945480
1000000000 999999999 999999999
0

Sample Output

题意：

输入n,a,b，生成mod n的数列，初始项为0，后面每项为前一项的平方乘以a再加b

如果有个数出现了3次，那么就停止，对于出现了2次的数，对应的士兵杀掉，问还剩多少士兵？

分析：

这个其实并不是Pollard‘s rho算法，但是计算过程是差不多的，只是在这个数列中搜寻的目标不同而已。

#include<iostream>
#include<unordered_map>
using namespace std;


int main()
{
    ios::sync_with_stdio(false);
    long long n,a,b;
    unordered_map<long long,int>m;
    while(cin>>n){
        if(n==0)break;
        cin>>a>>b;
        m.clear();
        long long x=0,ans=n;
        while(true) {
            if (m[x] == 1)ans--;
            m[x]++;
            if(m[x]==3)break;
            x = (x * x % n * a + b) % n;
        }
        cout<<ans<<endl;
    }
    return 0;
}