A Simple Job (2016 Beijing Online Contest)

License: CC 4.0 BY-SA

Tags: #2016 Beijing Online Contest #simulation #hihocoder

This post covers a simple programming task: find the most frequently used phrase in a given text. A C++ solution is included.

A Simple Job
Time limit: 1000ms
Per-test time limit: 1000ms
Memory limit: 256MB

Description

The Institute of Computational Linguistics (ICL), Peking University, is an interdisciplinary institute of science and liberal arts. It focuses primarily on fundamental research and applications of language information processing. The research of ICL covers a wide range of areas, including Chinese syntax, language parsing, computational lexicography, semantic dictionaries, computational semantics and application systems.

Professor X is working for ICL. His little daughter Jane is 9 years old and has learned something about programming. She is always very interested in her daddy's research. During this summer vacation, she took a free programming and algorithm course for kids provided by the School of EECS, Peking University. When the course was finished, she said to Professor X: "Daddy, I just learned a lot of fancy algorithms. Now I can help you! Please give me something to research on!" Professor X laughed and said: "OK, let's start from a simple job. I will give you a lot of text, you should tell me which phrase is most frequently used in the text."

Please help Jane to write a program to do the job.
Input

There are no more than 20 test cases.

In each case, there are one or more lines of text, ended by a line of "####". The text includes words, spaces, ','s and '.'s. A word consists of only lowercase letters. Two adjacent words make a "phrase". Two words are considered adjacent if there is nothing but one or more spaces between them. No word is split across two lines, and two words that belong to different lines can't form a phrase. Two phrases that differ only in the number of spaces are considered the same.

Please note that the maximum length of a line is 500 characters, and there are at most 50 lines in a test case. It's guaranteed that there is at least 1 phrase in each test case.
Output

For each test case, print the most frequently used phrase and the number of times it appears, separated by a ':'. If there is more than one choice, print the one with the smallest dictionary order. Please note that if there is more than one space between the two words of a phrase, just keep one space.
Sample Input

    above,all ,above all good at good at good
    at good at above all me this is
    ####
    world hello ok
    ####

Sample Output

    at good:3
    hello ok:1

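This is again a simulation problem, built around vector< vector< string > > vec.
vec[i] holds all the words of the i-th segment.
A word starts at a letter and ends at the first non-letter; each word is push_back'ed onto vec.back().
On a character that is neither a letter nor whitespace (a comma or period), or at the start of a new line, a fresh vector< string >() is push_back'ed onto vec.
Finally, enumerate every pair of adjacent words within each segment and count the phrases in a map.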
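
The segmentation step described above can be sketched in isolation. This is a minimal illustration, not the solution's own function: splitSegments is a hypothetical helper name, and it works on a single line using indices rather than iterators.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original solution): split one line
// into segments of consecutive words. A new segment starts at every
// character that is neither a letter nor whitespace, e.g. ',' or '.'.
std::vector<std::vector<std::string>> splitSegments(const std::string& line) {
    std::vector<std::vector<std::string>> segments(1);  // one segment per line to begin with
    std::size_t i = 0;
    while (i < line.size()) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isalpha(c)) {
            // Extend j to the first non-letter, never reading past the end.
            std::size_t j = i;
            while (j < line.size() &&
                   std::isalpha(static_cast<unsigned char>(line[j]))) {
                ++j;
            }
            segments.back().push_back(line.substr(i, j - i));
            i = j;
        } else {
            // Punctuation breaks adjacency; whitespace does not.
            if (!std::isspace(c)) segments.emplace_back();
            ++i;
        }
    }
    return segments;
}
```

For the first sample line prefix "above,all ,above all good", this yields three segments: one holding "above", one holding "all", and one holding "above", "all", "good". Only the last segment contributes phrases.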

#include<iostream>
#include<stdlib.h>
#include<stdio.h>
#include<string>
#include<vector>
#include<deque>
#include<queue>
#include<algorithm>
#include<set>
#include<map>
#include<stack>
#include<time.h>
#include<math.h>
#include<list>
#include<cstring>
#include<cctype>
#include<fstream>
#include<bitset>
//#include<memory.h>
using namespace std;
#define ll long long
#define ull unsigned long long
#define pii pair<int,int>
#define INF 1000000007

vector<vector<string> >vec;
map<string,int>mp;
vector<string>tmpvecstr;//empty vector, pushed whenever a new segment starts (new line, comma, etc.)

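void f(string&str){
    vec.push_back(tmpvecstr);//every line starts a new segment
    for(string::iterator it=str.begin();it!=str.end();)
        if(isalpha((unsigned char)*it)){//extract one word
            string::iterator it2=it;
            //stop at the end of the line so end() is never dereferenced
            for(;it2!=str.end()&&isalpha((unsigned char)*it2);++it2);
            vec.back().push_back(string(it,it2));
            it=it2;
        }
        else//comma, period, etc.: start a new segment
            if(!isspace((unsigned char)*it)){
                vec.push_back(tmpvecstr);
                ++it;
            }
            else//whitespace
                ++it;
}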

void cau(){
//count every phrase (pair of adjacent words) in mp
    for(int i=0;i<vec.size();++i){
        vector<string>&tvec=vec[i];
        for(int j=0;j<(int)tvec.size()-1;++j)
            ++mp[tvec[j]+' '+tvec[j+1]];
    }
    string ansstr=mp.begin()->first;//best phrase so far
    int ansnum=mp.begin()->second;//number of times it appears
    for(map<string,int>::iterator it=mp.begin();it!=mp.end();++it)
        if(it->second>ansnum){
            ansstr=it->first;
            ansnum=it->second;
        }
        else
            if((it->second==ansnum)&&(it->first<ansstr)){
                ansstr=it->first;
            }
    cout<<ansstr<<":"<<ansnum<<endl;
}

int main()
{
    //freopen("/home/lu/文档/r.txt","r",stdin);
    //freopen("/home/lu/文档/w.txt","w",stdout);
    string str;
    while(getline(cin,str)){
        if(str=="####"){
            cau();
            vec.clear();
            mp.clear();
        }
        else
            f(str);
    }
    return 0;
}
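
The counting and tie-breaking step can also be written more compactly. The following is a sketch of a variant, not the code above: mostFrequentPhrase is a hypothetical name, and it assumes the segments have already been built. Because std::map iterates its keys in ascending order, accepting only strictly larger counts keeps the lexicographically smallest phrase among ties, so no explicit string comparison is needed.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: count every pair of adjacent words within each
// segment and return the most frequent phrase with its count.
// Returns ("", 0) if no segment contains two words.
std::pair<std::string, int> mostFrequentPhrase(
        const std::vector<std::vector<std::string>>& segments) {
    std::map<std::string, int> counts;
    for (const auto& seg : segments) {
        // j + 1 < size() avoids unsigned underflow on empty segments.
        for (std::size_t j = 0; j + 1 < seg.size(); ++j) {
            ++counts[seg[j] + ' ' + seg[j + 1]];
        }
    }

    std::pair<std::string, int> best("", 0);
    // Keys are visited in ascending order, so the first phrase to reach
    // the maximum count is also the smallest one in dictionary order.
    for (const auto& kv : counts) {
        if (kv.second > best.second) {
            best = std::make_pair(kv.first, kv.second);
        }
    }
    return best;
}
```

On the first sample case, "at good" and "good at" both appear 3 times, and the ordered traversal selects "at good".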