[PTA-A] 1071 Speech Patterns (25 points) (map)

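This post describes how to find a person's most commonly used word by counting how often each word appears in a sample of their text, a technique that is useful for identity validation and online behavior analysis. It walks through the implementation steps: processing the input string, counting word frequencies, and printing the result.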


People often have a preference among synonyms of the same word. For example, some may prefer "the police", while others may prefer "the cops". Analyzing such patterns can help to narrow down a speaker's identity, which is useful when validating, for example, whether it's still the same person behind an online avatar.

Now given a paragraph of text sampled from someone's speech, can you find the person's most commonly used word?

Input Specification:

Each input file contains one test case. For each case, there is one line of text no more than 1048576 characters in length, terminated by a carriage return \n. The input contains at least one alphanumerical character, i.e., one character from the set [0-9 A-Z a-z].

Output Specification:

For each test case, print in one line the most commonly occurring word in the input text, followed by a space and the number of times it has occurred in the input. If there are more than one such words, print the lexicographically smallest one. The word should be printed in all lower case. Here a "word" is defined as a continuous sequence of alphanumerical characters separated by non-alphanumerical characters or the line beginning/end.

Note that words are case insensitive.

Sample Input:

Can1: "Can a can can a can?  It can!"

Sample Output:

can 5

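Approach:

1. After reading the string, convert it entirely to lowercase, and use a std::map to store how many times each word occurs.

2. Traverse the string with a buffer string that starts out empty. While the current character is a letter or digit, append it to the buffer. On reaching any other character, increment the map entry for the buffered word (if the buffer is not empty), update the maximum count, and reset the buffer to empty.

3. Traverse the map and output the first word whose count equals the maximum, together with that count. Because std::map keeps its keys in ascending order, this is the lexicographically smallest such word.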

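Notes:

1. The input string contains spaces, and the problem states that the input is a single line, so read it with getline; cin >> would stop at the first whitespace.

2. Record the maximum count while counting.

3. For the last word: if the line does not end with a non-alphanumeric character, the final word must still be counted after the loop (that is, when the buffer string is not empty).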

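#include <iostream>
#include <map>
#include <string>
using namespace std;

// Convert every uppercase ASCII letter in s to lowercase.
string tolow(string s) {
	for (size_t i = 0; i < s.length(); i++) {
		if (s[i] >= 'A' && s[i] <= 'Z') {
			s[i] = s[i] - 'A' + 'a';
		}
	}
	return s;
}
int main() {
	map<string, int> m;  // word -> number of occurrences
	int ans = 0;         // highest count seen so far
	string s;
	getline(cin, s);     // the input is one line that may contain spaces
	s = tolow(s);
	string str = "";     // word currently being built
	for (size_t i = 0; i < s.length(); i++) {
		if ((s[i] >= 'a' && s[i] <= 'z') || (s[i] >= '0' && s[i] <= '9')) {
			str += s[i];    // letter or digit: extend the current word
		}
		else if (str != "") {
			// A separator ends the current word, so count it. Checking for a
			// non-empty buffer first keeps m[str] from inserting "" as a key.
			m[str]++;
			if (ans < m[str]) ans = m[str];  // track the maximum count
			str = "";                        // reset the buffer
		}
	}
	if (str != "") {    // the line ended in the middle of a word
		m[str]++;
		if (ans < m[str]) ans = m[str];
	}
	// map keys are sorted, so the first match is the lexicographically smallest
	map<string, int>::iterator it;
	for (it = m.begin(); it != m.end(); it++) {
		if (it->second == ans) {
			cout << it->first << " " << ans;
			break;
		}
	}
	return 0;
}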

 
