序列“乱序数”的求解（更优的算法求解）

最新推荐文章于 2021-11-02 10:12:51 发布

最新推荐文章于 2021-11-02 10:12:51 发布 · 1k 阅读

文章标签：

#算法 #J#

小程序/算法专栏收录该内容

8 篇文章

订阅专栏

本文探讨了两个由相同字母组成的序列的乱序数计算问题，即两序列中元素次序不同的二元组数量。提出了四种算法，包括简单枚举法、改进的集合差集法、位置标志法，并对比了它们的时间复杂度。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

[size=large][color=brown][b]题目论述：[/b][/color][/size]
[quote]两个序列由相同的字母组成，只是次序不同
两个序列的“乱序”是指
任在一个序列中选择两个元素，看在另一个序列中是否顺序不同
问题是如何数出总共有多少“乱序”

比如有两个序列
序列1为：
1，2，4，3
序列2为
2，4，1，3

序列1中（1，2）在2中是（2，1），这就是一个乱序；（1，4）在2中式（4，1）还是乱序
；逐个检查过去，数一数，就是两个序列的“乱序数”[/quote]

[b]以下是我的一些看法，可能有些疏忽或差错，还望知道。[/b]
[quote]
个人看法：
本题和最大字段和有点相像
法一：每一对进行判断(还有先比对其是不是相同二元组)
s1(i,j);
s2(i,j);
for(;i<N;)
for(j=i+1;j<N;)
for(;m<N;)
for(n=m+1;n<N;) test(s1(i,j),s2(m,n));
★时间复杂度O（n^4）

上述算法，简单的修改一下，就可稍微减少复杂度。
for(;i<N;)
for(j=i+1;j<N;)
for(;m<N;) s2[m]=s1[i];m;
for(;n<N;) s2[n]=s1[j];n;
test(i,j,m,n);
★时间复杂度O(n^3),Simple()

法二、要达到O（nlogn）,一般情况下采用分治算法，将规模每次缩小至原来的一半。

分治法的相关定义：
即T(1) = O(n)
T(n) = 2T(n/2) + O(n)
when n=2^i
then T(n) = O(nlogn)

So,here is more thinking
把是s（0，n）分成s（0，n/2)和s（n/2,n)
s1和 s2 same divided into two halves,and compare each half by itself
for the middle one,we use this:
question:s1(l,u) and s2(l,u) and middle = (l+u)/2
此部分的二元组个数为：n^2 （n指当前分半后的规模），即（n/2）2

由此看来，运算的规模还是没有有效地降低下来，我们应该想想其他办法。
。。。。。
。。。。。
想想，应该是二元组比对的效率在影响，所以应该从这方面入手
其二，大概有n^2个二元组，所以应该至少n^2次比较，而并非上述的（nlogn)

法三、以尾字符为标志
for(i=1;N;)
setA:[0...i-1]
for(j;N;)
s2[j] == s1[i] break;
setB:[0...j-1]
setA-setB;
上述主要时间复杂度在setA-setB上，即是求出两集合（差）运算。可以先排序，
使用一般的插入排序，耗时nlogn，然后以n时间集合运算。综上所述，
★时间复杂度O(n^2logn),OutofOrder()

法四、位置标志法
此法是对法一的改进
算法描述：
tmp1[];辅助数组
tmp2[];辅助数组
for(;N;)
for(;N;) test(tmp1[i],tmp1[j],tmp2[i],tmp2[j]);
★时间复杂度O(n^2),Best()[/quote]

其中，带★的后三种我都实现了，下面逐一论述：
原始序列的产生不再赘述（为简化，简单的为0到n间的数，就这一点后面还有论述，关于排序部分）；
[b]算法一：[/b]

int Simple(int *s1,int *s2,int n)
{
	//improved simple method
	int i,j,p,q,sum;
	sum = 0;
	for (i = 0;i < n-1;++ i)
	{
		for (j = i+1;j < n;++ j)
		{
			//find the first number in s2[]
			for (p = 0;p < n;++ p)
				if (s2[p] == s1[i]) break;
			//find the second number in s2[]
			for (q = 0;q < n;++ q)
				if (s2[q] == s1[j]) break;
            if (!test(i,j,p,q)) sum ++;
		}
	}
	return sum;
}

其中test（）函数是判断四个元素是否同一顺序,即升序，降序，以此决定两元组是否顺序相同。上述时间复杂度位O(n^3).

[b]算法二：[/b]

int OutofOrder(int *setA,int *setB,int n)
{
	//sum up the out order number
	int i,j,k,sum,*tmp1,*tmp2;
	tmp1 = new int [n];
	tmp2 = new int [n];
	sum = 0;
	for (i = 1;i < n;++ i)
	{
		for (j = 0;j < n;++ j)
			if (setB[j] == setA[i]) break;
		//copy the temp array for summation
		for (k = 0;k < i;++ k) tmp1[k] = setA[k];
		for (k = 0;k < j;++ k) tmp2[k] = setB[k];
		InsertSort(tmp1,i);
		InsertSort(tmp2,j);
		sum += Margin(tmp1,0,i,tmp2,0,j);//i,j not included
	}
	delete [] tmp1;
	delete [] tmp2;
	return sum;
}

上述Margin（）函数就是下面的集合差集的运算：

int Margin(int *setA,int a1,int a2,int *setB,int b1,int b2)
{
	//compute the balance number
	if (a1 == a2) return 0;
	if (b1 == b2) return a2 - a1;
    if (setA[a1] == setB[b1]) return Margin(setA,a1+1,a2,setB,b1+1,b2);
	else if (setA[a1] < setB[b1]) return 1+Margin(setA,a1+1,a2,setB,b1,b2);
	else return Margin(setA,a1,a2,setB,b1+1,b2); 
}

此算法的时间复杂度最好保持在O（n^2logn），nlogn就是上面的集合做差集的时间复杂度。

[b]算法三：[/b]

int Best(int *s1,int *s2,int n)
{
	//improved method using T(n^2)
    int i,j,*tmp1,*tmp2,sum;
	sum = 0;
	//suppose that all these number are in sequence from 1...n
	tmp1 = new int [n];
	tmp2 = new int [n];
	SignPlace_2(s1,tmp1,n);
	SignPlace_2(s2,tmp2,n);
	for (i = 0;i < n;++ i)
		for (j = i+1;j < n;++ j)
			if (!test(tmp1[i],tmp1[j],tmp2[i],tmp2[j])) sum ++;
	return sum;
}

上面的SignPlace_2（）是对无序数组求解其顺序排序后在原序列位置的函数，比如：
原序列： 1 4 5 2
位置序列：0 3 1 2
关于这个函数，我写了三个版本，一个版本是建立在原序列是不间断的一连串数字组成，所以实现起来很简单，另一版本是一个通用版本，可以忽略上述的限制条件，使用类似插入排序的算法实现。最后一个版本采用空间换取时间的方法，是对版本一的扩展。一下是相应代码:

void SignPlace_2(int *s,int *tmp,int n)
{
	//numbers are not in sequence
	//O(nlogn)
	int i,j,temp,*p;
	p = new int [n];
	//copy s[] to p[]
	for (i = 0;i < n;++ i) p[i] = s[i];
	tmp[0] = 0;
	for (i = 1;i < n;++ i)
	{
		//the current palce i
		temp = p[i];
		for (j = i;j > 0&&temp < p[j-1];-- j)
		{
			p[j] = p[j-1];
			tmp[j] = tmp[j-1];
		}
		p[j] = temp;
		tmp[j] = i;
	}
}

void SignPlace_3(int *s,int *tmp,int n)
{
	//use more place for less time
    //O(SIZE)
	int i,j,*p;
	p = new int [SIZE];
	j = 0;
    for (i = 0;i < SIZE;++ i) p[i] = -1;
	for (i = 0;i < n;++ i) p[s[i]] = i;
	for (i = 0;i < SIZE;++ i) 
		if (p[i] != -1) tmp[j++] = p[i];
	delete [] p;
}

此算法时间复杂度达到n^2（如果SIZE<n*n的话）。

综述，实验的测试结果(n=1000)：
Simple： 2433
OutofOrder: 671
Best: 15
单位ms.

或许还有更优的算法来实现这一问题，我也在继续思考，不过，个人感觉既然是比对二元组，而二元组共有n^2级别个，时间复杂度总要在n^2以上吧？但是其他一些算法，比如最大字段和、字符串匹配等，不都越过了数量这一坎了嘛。