String counting and sorted map output: HOJ1515 Etaoin Shrdlu.

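This post presents a simple character-pair (digram) frequency-counting problem and a solution. The program reads the input text, counts how often each pair of adjacent characters occurs, and prints the five most frequent pairs together with their absolute and relative frequencies.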

An easy problem.

A simple application of map: count the pairs, then output them in sorted order.

Etaoin Shrdlu


Time limit: 1 sec. Submitted: 84
Memory limit: 32M. Accepted: 54
Source: University of Ulm Internal Contest 2001

The relative frequency of characters in natural language texts is very important for cryptography. However, the statistics vary for different languages. Here are the top 9 characters sorted by their relative frequencies for several common languages:

English: ETAOINSHR
German:  ENIRSATUD
French:  EAISTNRUL
Spanish: EAOSNRILD
Italian: EAIONLRTS
Finnish: AITNESLOK

Just as important as the relative frequencies of single characters are those of pairs of characters, so called digrams. Given several text samples, calculate the digrams with the top relative frequencies.


Input

The input contains several test cases. Each starts with a number n on a separate line, denoting the number of lines of the test case. The input is terminated by n = 0. Otherwise, 1 <= n <= 64, and there follow n lines, each with a maximal length of 80 characters. The concatenation of these n lines, where the end-of-line characters are omitted, gives the text sample you have to examine. The text sample will contain printable ASCII characters only.


Output

For each test case generate 5 lines containing the top 5 digrams together with their absolute and relative frequencies. Output the latter rounded to a precision of 6 decimal places. If two digrams should have the same frequency, sort them in (ASCII) lexicographical order. Output a blank line after each test case.

Sample Input

2
Take a look at this!!
!!siht ta kool a ekaT
5
P=NP
 Authors: A. Cookie, N. D. Fortune, L. Shalom
 Abstract: We give a PTAS algorithm for MaxSAT and apply the PCP-Theorem [3]
 Let F be a set of clauses. The following PTAS algorithm gives an optimal
 assignment for F:
0

Sample Output

 a 3 0.073171
!! 3 0.073171
a  3 0.073171
 t 2 0.048780
oo 2 0.048780

 a 8 0.037209
or 7 0.032558
.  5 0.023256
e  5 0.023256
al 4 0.018605
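
The relative frequencies in the sample can be checked by hand: the first text sample has 42 characters and therefore 41 overlapping digrams, so a digram that occurs 3 times has relative frequency 3/41, which rounds to 0.073171. The sketch below formats one output line under that assumption (relative frequency = count divided by text length minus one). The helper name `formatLine` is hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds "<digram> <count> <relative frequency>".
// `textLength` is the length of the concatenated sample and must be >= 2.
std::string formatLine(const std::string& digram, int count,
                       std::size_t textLength) {
    // A text of length L contains L - 1 overlapping digrams.
    double relative =
        static_cast<double>(count) / static_cast<double>(textLength - 1);
    char buf[64];
    // %.6f rounds to six decimal places, as the problem requires.
    std::snprintf(buf, sizeof buf, "%s %d %.6f", digram.c_str(), count,
                  relative);
    return std::string(buf);
}
```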

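Notes:

Given a text, count how often each pair of adjacent characters occurs and output the five most frequent pairs in descending order of frequency. Ties are broken by ASCII order.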

The code is as follows:


  
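#include <cstdio>
#include <iostream>
#include <map>
#include <string>
using namespace std;

int main() {
    int l;
    while (cin >> l && l) {
        getchar();  // consume the newline left after the line count
        string str;
        while (l--) {
            string a;
            getline(cin, a);
            str += a;
        }
        // Count every pair of adjacent characters. Writing the bound as
        // i + 1 < length avoids unsigned underflow on an empty text.
        map<string, int> m;
        for (size_t i = 0; i + 1 < str.length(); i++) {
            m[str.substr(i, 2)]++;
        }
        // Select the maximum five times. The map iterates in ASCII order
        // and the comparison is strict, so ties go to the smaller digram.
        for (int i = 0; i < 5 && !m.empty(); i++) {
            map<string, int>::iterator maxi = m.begin();
            for (map<string, int>::iterator it = m.begin(); it != m.end(); it++) {
                if (it->second > maxi->second) maxi = it;
            }
            // m is non-empty here, so str.length() - 1 is at least 1.
            printf("%s %d %.6f\n", maxi->first.c_str(), maxi->second,
                   maxi->second * 1.0 / (str.length() - 1));
            m.erase(maxi);
        }
        cout << endl;
    }
    return 0;
}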
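
The solution above selects the maximum five times. One possible variant, closer to the "sort, then output" idea, copies the map entries into a vector and sorts them with a comparator. This is only a sketch: the names `DigramCount`, `byFrequency` and `topDigrams` are illustrative and do not come from the original post.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> DigramCount;

// Higher count first; equal counts fall back to ascending string order,
// which coincides with ASCII order for printable ASCII text.
bool byFrequency(const DigramCount& a, const DigramCount& b) {
    if (a.second != b.second) return a.second > b.second;
    return a.first < b.first;
}

// Returns at most k digrams of `text`, most frequent first.
std::vector<DigramCount> topDigrams(const std::string& text, std::size_t k) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        ++counts[text.substr(i, 2)];
    }
    std::vector<DigramCount> ranked(counts.begin(), counts.end());
    std::sort(ranked.begin(), ranked.end(), byFrequency);
    if (ranked.size() > k) ranked.resize(k);
    return ranked;
}
```

A sample has at most 64 lines of 80 characters, so either approach is cheap; sorting mainly makes the tie-breaking rule explicit in one place.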

Reposted from: https://www.cnblogs.com/yangchenhao8/archive/2011/06/09/2076776.html
