PAT TOP 1005 Programming Pattern (35)

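This post discusses pattern analysis of programming habits, in particular its role in identifying programmers and detecting plagiarism. Using PAT TOP 1005 as the example, it shows how to find the most common programming pattern of a given length. The input and output specifications and the sample input and output are given in full, with emphasis on how a suffix array solves this kind of problem and on the counting sort and radix sort techniques used inside the algorithm.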

Problem description:

1005 Programming Pattern (35 points)

Programmers often have a preference among program constructs. For example, some may prefer if(0==a), while others may prefer if(!a). Analyzing such patterns can help to narrow down a programmer's identity, which is useful for detecting plagiarism.

Now given some text sampled from someone's program, can you find the person's most commonly used pattern of a specific length?

Input Specification:

Each input file contains one test case. For each case, there is one line consisting of the pattern length N (1≤N≤1048576), followed by one line no less than N and no more than 1048576 characters in length, terminated by a carriage return \n. The entire input is case sensitive.

Output Specification:

For each test case, print in one line the length-N substring that occurs most frequently in the input, followed by a space and the number of times it has occurred in the input. If there are multiple such substrings, print the lexicographically smallest one.

Whitespace characters in the input should be printed as they are. Also note that there may be multiple occurrences of the same substring overlapping each other.

Sample Input 1:

4
//A can can can a can.

Sample Output 1:

 can 4

Sample Input 2:

3
int a=~~~~~~~~~~~~~~~~~~~~~0;

Sample Output 2:

~~~ 19
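
Before turning to the suffix array, it can help to pin down exactly what is being counted. The following is only a brute-force sketch for small inputs, not the accepted solution: `mostFrequentBrute` is a hypothetical helper name, and the ordered map makes it far too slow for the full 1048576-character limit. It can serve as a reference when checking a faster solution by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical reference helper: counts every length-n window of text.
// Roughly O(m * n * log m) time, so it is only suitable for small inputs.
std::pair<std::string, int> mostFrequentBrute(const std::string& text, std::size_t n)
{
    assert(n >= 1 && n <= text.size());
    std::map<std::string, int> freq;
    for (std::size_t i = 0; i + n <= text.size(); ++i)
        ++freq[text.substr(i, n)];    // overlapping windows are counted separately
    std::pair<std::string, int> best{"", 0};
    for (const auto& kv : freq)       // std::map visits keys in ascending order
        if (kv.second > best.second)  // strict '>' keeps the smallest key on ties
            best = {kv.first, kv.second};
    return best;
}
```

Calling `mostFrequentBrute("//A can can can a can.", 4)` returns the pair `(" can", 4)`, which matches Sample Output 1.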

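Using a template feels great once, and using one every day feels great every day... This is a suffix array problem, and plugging in a standard template is basically enough to pass. I never expected that the template-driven way of getting AC, which I have always argued against, would finally be needed on the very last problem.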

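sa[i] stores the starting index of the suffix ranked i-th (in ascending order, counting from 0) when suffixes are compared by their first k characters; ranks[i] is the rank of the suffix that starts at index i under the same comparison; height[i] is the length of the longest common prefix of the two suffixes at adjacent ranks i-1 and i, capped at N in the code. A run of consecutive height values equal to N therefore marks suffixes that share the same length-N prefix, and the longest such run gives the most frequent pattern. The algorithm uses counting sort and radix sort repeatedly to keep the running time down.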
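
To make the three arrays concrete, here is a deliberately naive sketch that builds them by sorting whole suffixes with `std::sort`. It illustrates the definitions only; it is not the doubling algorithm used in the accepted code. The names `SuffixInfo` and `buildNaive` are hypothetical, and the height values here are not capped at N.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

struct SuffixInfo {                 // hypothetical container for the three arrays
    std::vector<int> sa, rank, height;
};

// Naive construction: sorts full suffixes directly, so it is quadratic or
// worse and only meant for tiny strings.
SuffixInfo buildNaive(const std::string& s)
{
    assert(!s.empty());
    const int m = static_cast<int>(s.size());
    SuffixInfo r;
    r.sa.resize(m);
    std::iota(r.sa.begin(), r.sa.end(), 0);          // 0, 1, ..., m-1
    std::sort(r.sa.begin(), r.sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    r.rank.assign(m, 0);
    for (int i = 0; i < m; ++i) r.rank[r.sa[i]] = i; // rank is the inverse of sa
    r.height.assign(m, 0);                           // height[0] stays 0
    for (int i = 1; i < m; ++i) {
        int a = r.sa[i - 1], b = r.sa[i], k = 0;
        while (a + k < m && b + k < m && s[a + k] == s[b + k]) ++k;
        r.height[i] = k;                             // LCP of ranks i-1 and i
    }
    return r;
}
```

For the string "banana" this gives sa = {5, 3, 1, 0, 4, 2}, rank = {3, 2, 5, 1, 4, 0} and height = {0, 1, 3, 0, 0, 2}. With N = 3, the single entry height[2] = 3 says that the suffixes at ranks 1 and 2 ("ana" and "anana") share a length-3 prefix, so "ana" occurs twice.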

Accepted (AC) code:

#include<bits/stdc++.h>
using namespace std;
int n,m;
vector<int> sa,ranks,height,t1,t2;
string s;
// Build the suffix array of s by prefix doubling; every round is a radix
// sort made of counting-sort passes.
void da()
{
	int p,t=128;	// t: number of distinct keys (assumes 7-bit ASCII input)
	m=s.size();
	vector<int> c(max(m,t),0);
	t1=t2=height=ranks=sa=c;
	vector<int>& x=t1;	// x[i]: rank of suffix i by its first j characters
	vector<int>& y=t2;	// y: suffix indices ordered by their second key
	for(int i=0;i<m;i++)	c[x[i]=s[i]]++;
	for(int i=1;i<t;i++)	c[i]+=c[i-1];
	for(int i=m-1;i>=0;i--)	sa[--c[x[i]]]=i;
	for(int j=1;j<m;t=p,j<<=1)
	{
		p=0;
		for(int i=m-j;i<m;i++)	y[p++]=i;	// no second half: smallest second key
		for(int i=0;i<m;i++)	if(sa[i]>=j)	y[p++]=sa[i]-j;
		for(int i=0;i<t;i++)	c[i]=0;
		for(int i=0;i<m;i++)	c[x[y[i]]]++;
		for(int i=1;i<t;i++)	c[i]+=c[i-1];
		for(int i=m-1;i>=0;i--)	sa[--c[x[y[i]]]]=y[i];
		swap(x,y);	// y now holds the old ranks, x receives the new ones
		p=1;
		x[sa[0]]=0;
		for(int i=1;i<m;i++)
		{
			int a=sa[i-1],b=sa[i];
			// A missing second half gets rank -1, so y is never read past the end.
			int ra=a+j<m?y[a+j]:-1,rb=b+j<m?y[b+j]:-1;
			x[b]=(y[a]==y[b]&&ra==rb)?p-1:p++;
		}
		if(p>=m)	break;	// all ranks are distinct: fully sorted
	}
	for(int i=0;i<m;i++)	ranks[sa[i]]=i;
}
// height[r]: LCP of the suffixes ranked r-1 and r, capped at n.
void getheight()
{
	int k=0;
	for(int i=0;i<m;i++)
	{
		if(k)	k--;
		if(ranks[i]==0)	k=0;
		else
		{
			int j=sa[ranks[i]-1];
			while(k<n&&i+k<m&&j+k<m&&s[i+k]==s[j+k])	k++;
		}
		height[ranks[i]]=k;
	}
}
int main()
{
	ios::sync_with_stdio(false);
	cin>>n;
	cin.get();	// skip the newline after n
	getline(cin,s);
	da();
	getheight();
	int pm=-1,count=0,mcount=0;
	for(int i=0;i<m;i++)
	{
		if(sa[i]+n>m)	continue;	// suffix shorter than n: not a candidate
		count=(i>0&&height[i]>=n)?count+1:1;
		if(count>mcount)	// strict '>' keeps the lexicographically smallest on ties
		{
			mcount=count;
			pm=sa[i];
		}
	}
	cout<<s.substr(pm,n)<<" "<<mcount;
	return 0;
}

 
