DNA Sorting
Time Limit: 2000/1000 MS (Java/Others)
Memory Limit: 65536/32768 K (Java/Others)
Total Submission(s): 2379
Accepted Submission(s): 1168
Problem Description
One measure of ``unsortedness'' in a sequence is the number of pairs of entries that are out of order with respect to each other. For instance, in the letter sequence ``DAABEC'', this measure is 5, since D is greater than four letters to its right and E is greater than one letter to its right. This measure is called the number of inversions in the sequence. The sequence ``AACEDGG'' has only one inversion (E and D)--it is nearly sorted--while the sequence ``ZWQM'' has 6 inversions (it is as unsorted as can be--exactly the reverse of sorted).
You are responsible for cataloguing a sequence of DNA strings (sequences containing only the four letters A, C, G, and T). However, you want to catalog them, not in alphabetical order, but rather in order of ``sortedness'', from ``most sorted'' to ``least sorted''. All the strings are of the same length.
This problem contains multiple test cases!
The first line of a multiple input is an integer N, then a blank line followed by N input blocks. Each input block is in the format indicated in the problem description. There is a blank line between input blocks.
The output format consists of N output blocks. There is a blank line between output blocks.
Input
The first line contains two integers: a positive integer n (0 < n <= 50) giving the length of the strings; and a positive integer m (1 < m <= 100) giving the number of strings. These are followed by m lines, each containing a string of length n.
Output
Output the list of input strings, arranged from ``most sorted'' to ``least sorted''. If two or more strings are equally sorted, list them in the same order they are in the input file.
Sample Input
1

10 6
AACATGAAGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT
Sample Output
CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA
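Problem summary:
At first I read the statement for a long time without understanding it. All I knew was that the measure for "DAABEC" is 5. How is that 5 computed? 'D' comes first, and four of the letters after it are smaller ('A', 'A', 'B', 'C'). Then 'E' has one smaller letter after it ('C'). So the measure is 4 + 1 = 5. The other examples in the statement work the same way.
So I took the sample case from the problem and worked through it by hand to see what was going on; the per-string counts are in the comment at the top of the code.
That made it clear: the task is to sort the strings by their total number of inversions (the "measure" above).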
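The counting rule described above can be sketched as a small function. This is only an illustration: the name countInversions is made up for this sketch, and it checks every pair of positions, which is fine for strings of length at most 50.

```cpp
#include <cstddef>
#include <string>

// Count the pairs (i, j) with i < j and s[i] > s[j] by checking every pair.
// countInversions is a name chosen for this sketch, not part of the problem.
int countInversions(const std::string& s) {
    int inversions = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        for (std::size_t j = i + 1; j < s.size(); ++j) {
            if (s[i] > s[j]) ++inversions;  // a later letter is smaller
        }
    }
    return inversions;
}
```

For the examples in the statement, countInversions("DAABEC") gives 5, "AACEDGG" gives 1 and "ZWQM" gives 6.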
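Analysis:
Once we know the goal is sorting, the rest is simple: sort and print. The one thing to watch is what happens when two strings have the same inversion count. The problem says they must stay in input order, so the sort has to be stable. Bubble sort (swapping only when the left element is strictly greater) or insertion sort is stable and easily fast enough for m <= 100. The usual swap-based selection sort is not stable, and std::sort does not guarantee stability either.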
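One way to get the required tie handling without writing a sort by hand is std::stable_sort. The sketch below assumes the inversion counts have already been computed; the Entry struct and the function name sortBySortedness are hypothetical names for this example only.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for this sketch: a string and its inversion count.
struct Entry {
    std::string str;
    int inversions;
};

// Sort by inversion count. std::stable_sort keeps entries with equal
// counts in their original (input) order, which the problem requires.
void sortBySortedness(std::vector<Entry>& entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const Entry& a, const Entry& b) {
                         return a.inversions < b.inversions;
                     });
}
```

With the entries AAGATAACGG (9), TTTTGGCCAA (36), CCCGGGGGGA (9), GATCAGATTT (11) in that input order, the two strings with 9 inversions stay in the order AAGATAACGG, CCCGGGGGGA.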
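One more thing to note: since the problem is about sorting DNA strings by inversion count, the only letters that appear are the four letters 'A', 'C', 'G', 'T'.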
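Because only four letters occur, the inversion count can also be computed in a single pass by remembering how many of each letter have been seen so far. This is an optional alternative to the pairwise check, not something the limits require. The name countInversionsDna is made up for this sketch, and it assumes the input contains only A, C, G and T.

```cpp
#include <string>

// Single-pass inversion count for strings over the alphabet A < C < G < T.
// For each character, add the number of earlier characters that are larger.
// Assumption: the input contains only 'A', 'C', 'G', 'T'.
int countInversionsDna(const std::string& s) {
    int seen[4] = {0, 0, 0, 0};  // how many A, C, G, T have been seen so far
    int inversions = 0;
    for (char ch : s) {
        int rank;
        switch (ch) {
            case 'A': rank = 0; break;
            case 'C': rank = 1; break;
            case 'G': rank = 2; break;
            default:  rank = 3; break;  // 'T' under the assumption above
        }
        // Every earlier letter with a higher rank forms one inversion with ch.
        for (int r = rank + 1; r < 4; ++r) inversions += seen[r];
        ++seen[rank];
    }
    return inversions;
}
```

On the sample strings this gives the same totals as the pairwise count, for example 10 for AACATGAAGG and 9 for CCCGGGGGGA.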
The solution code:
/*
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A C G T
String       Inversions per letter   Total
AACATGAAGG   3+5+2                   10
TTTTGGCCAA   6+6+6+6+4+4+2+2         36
TTTGGCCAAA   7+7+7+5+5+3+3           37
GATCAGATTT   4+4+2+1                 11
CCCGGGGGGA   1+1+1+1+1+1+1+1+1       9
ATCGATGCAT   6+2+3+3+2+1             17
Sorting by the total gives the same answer as the expected output in the problem:
CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA
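The test case at the end of this comment is a good one for seeing the difference between sort() and stable_sort(): AAGATAACGG and CCCGGGGGGA both have 9 inversions, so AAGATAACGG must be printed first.
My submission using sort() was accepted even with a cmp that did return x.count <= y.count; to deal with equal items. But that comparator is not a strict weak ordering, so sort() has undefined behavior with it, and it can get the test case below wrong. That it was accepted only shows that the judge's data is not strict enough.
To avoid the problem with return x.count <= y.count; the cmp function can be written like this: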
bool cmp(DNA x,DNA y)
{
if (x.count != y.count)
return x.count < y.count;
return false;
}
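No problem: this is accepted as well, and the test case below also passes. Note, though, that this cmp is equivalent to return x.count < y.count; and sort() still does not guarantee the relative order of equal elements.
Submitting with stable_sort() is naturally fine. Bubble sort does not have this instability either: it is slower, but it is stable.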
1
10 6
AAGATAACGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT
*/
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;

struct DNA
{
    string str;
    int count = 0; // number of inversions in str
} va[105];

// Order by inversion count only. Equal counts compare as equivalent,
// so stable_sort keeps them in input order.
bool cmp(const DNA &x, const DNA &y)
{
    return x.count < y.count;
}

int main()
{
    int t;
    int n, m; // n: string length, m: number of strings
    cin >> t;
    while (t--)
    {
        cin >> n >> m;
        for (int i = 0; i < m; i++)
        {
            cin >> va[i].str;
            va[i].count = 0; // clear the count left over from the previous test case
        }
        // Count the inversions of each string by checking every pair (i, j) with i < j.
        for (int k = 0; k < m; k++)
        {
            int len = va[k].str.length();
            for (int i = 0; i < len - 1; i++)
            {
                for (int j = i + 1; j < len; j++)
                {
                    if (va[k].str[i] > va[k].str[j]) va[k].count++;
                }
            }
        }
        // stable_sort keeps strings with equal counts in input order; sort() does not guarantee that.
        stable_sort(va, va + m, cmp);
        for (int i = 0; i < m; i++)
            cout << va[i].str << endl;
        if (t) cout << endl; // blank line between output blocks
    }
    return 0;
}