UVa 10815 Andy's First Dictionary

Article link: https://blog.youkuaiyun.com/startoveroo/article/details/115017182

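This post covers a simple programming task: helping a boy named Andy realize his dream of making his own dictionary. It explains how to extract the distinct words from a story book, remove duplicates, and sort them.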


Andy, 8, has a dream - he wants to produce his very own dictionary. This is not an easy task for him, as the number of words that he knows is, well, not quite enough. Instead of thinking up all the words himself, he has a brilliant idea. From his bookshelf he would pick one of his favourite story books, from which he would copy out all the distinct words. By arranging the words in alphabetical order, he is done! Of course, it is a really time-consuming job, and this is where a computer program is helpful.
You are asked to write a program that lists all the different words in the input text. In this problem, a word is defined as a consecutive sequence of alphabets, in upper and/or lower case.
Words with only one letter are also to be considered. Furthermore, your program must be CaSe InSeNsItIvE. For example, words like “Apple”, “apple” or “APPLE” must be considered the same.

Input
The input file is a text with no more than 5000 lines. An input line has at most 200 characters. Input is terminated by EOF.
Output
Your output should give a list of different words that appear in the input text, one in a line. The words should all be in lower case, sorted in alphabetical order. You can be sure that the number of distinct words in the text does not exceed 5000.
Sample Input
Adventures in Disneyland
Two blondes were going to Disneyland when they came to a fork in the road. The sign read: “Disneyland Left.”
So they went home.
Sample Output
a
adventures
blondes
came
disneyland
fork
going
home
in
left
read
road
sign
so
the
they
to
two
went
were
when

My own accepted (AC) code:

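#include <algorithm>
#include <cctype>   // tolower
#include <cstdio>   // scanf, EOF
#include <iostream>
#include <set>
#include <string>
#include <vector>
using namespace std;

int main(void){
	char ch;
	string s;             // letters of the word currently being read
	vector<string> dic;   // distinct words, sorted before printing
	set<string> words;    // used only to detect duplicates
	// Read one character at a time until EOF.
	while(scanf("%c",&ch) != EOF){
		if((ch>='A' && ch<='Z') || (ch>='a' && ch<='z')) s += ch;
		else{
			// A non-letter ends the current word, if there is one.
			if(!s.empty()){
				for(size_t i=0; i<s.length(); i++) s[i] = tolower(s[i]);
				if(!words.count(s)){
					words.insert(s);
					dic.push_back(s);
				}
				s.clear();
			}
		}
	}
	// Flush the last word if the input ends right after a letter.
	if(!s.empty()){
		for(size_t i=0; i<s.length(); i++) s[i] = tolower(s[i]);
		if(!words.count(s)) dic.push_back(s);
	}
	sort(dic.begin(),dic.end());
	for(size_t i=0; i<dic.size(); i++){
		cout << dic[i] << "\n";
	}
	return 0;
}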

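After writing it myself, I compared it with the code in Liu Rujia's book and found some better techniques and a few things I had not known:
1. I used an extra vector, because I assumed a set has no order and cannot be sorted with sort. It turns out that although std::set is a deduplicating collection, iterating from set.begin() to set.end() visits the elements in ascending order (alphabetical order for lowercase strings, increasing order for numbers). Every insert puts the element in its sorted position, so no sort is needed at all. This problem therefore needs no vector, only a set: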

for(set<string>::iterator it = dict.begin(); it != dict.end(); ++it)
	cout << *it << "\n";
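
To make the ordering guarantee concrete, here is a minimal sketch. The helper name sorted_unique is made up for illustration and is not from the book; it only copies the words into a std::set and reads them back in iteration order.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: put the words into a std::set and return them
// in the order the set iterates over them.
std::vector<std::string> sorted_unique(const std::vector<std::string>& words) {
    // std::set keeps one copy of each key, ordered by operator<.
    std::set<std::string> dict(words.begin(), words.end());
    return std::vector<std::string>(dict.begin(), dict.end());
}
```

Passing the words to, a, in, to, a should give back a, in, to: duplicates removed and order ascending, with no call to sort.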

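2. A few functions:
int isalpha(int c); // nonzero if c is a letter
int tolower(int c); // convert to lowercase
int toupper(int c); // convert to uppercase
int islower(int c); // nonzero if c is a lowercase letter
int isupper(int c); // nonzero if c is an uppercase letter

These are declared in <cctype> (<ctype.h> in C). Including <iostream> often pulls them in indirectly, but that is not guaranteed, so it is safer to include <cctype> explicitly. Note that they take and return int, not char or bool.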
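
As a small illustration of these functions, the sketch below flips the case of every letter in a string. The name swap_case is hypothetical, and the cast to unsigned char is a defensive habit I am adding (the functions are only defined for values representable as unsigned char, or EOF), not something the problem requires.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: flip the case of every letter, leave the rest alone.
std::string swap_case(std::string text) {
    for (char& c : text) {
        // Cast first: passing a negative char to these functions is undefined.
        unsigned char uc = static_cast<unsigned char>(c);
        if (std::isupper(uc))      c = static_cast<char>(std::tolower(uc));
        else if (std::islower(uc)) c = static_cast<char>(std::toupper(uc));
    }
    return text;
}
```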

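3. Reading input:
1> scanf with "%s" and cin >> both stop at whitespace (space, tab, newline and so on), and that whitespace stays in the input buffer. One detail: if the next read is again scanf "%s" or cin >>, leading whitespace does not end it immediately; it is skipped, and reading starts at the first non-whitespace character.

2> gets reads a whole line into a character array, spaces included, and stops at the newline. The newline is consumed, not left in the buffer, and is not stored; the string is terminated with '\0'. Note that gets does no bounds checking and was removed in C11 and C++14; fgets is the usual replacement (it keeps the newline in the buffer it fills).

3> scanf with "%c" reads any single character, including a space or a newline.

4> getline(cin, str, c) reads from cin into a std::string and can read spaces. The third argument c is an optional delimiter; if omitted, the delimiter is the newline. In both cases the delimiter is extracted from the stream and discarded: it is neither stored in the string nor left in the buffer.

5> cin.getline(char* s, int num, char c) differs from getline in that it fills a character array, and num limits the read to at most num-1 characters (the last slot holds '\0'). The final c is the same optional delimiter. As with getline, the delimiter is extracted and discarded rather than left in the buffer, and the array is terminated with '\0'.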
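
The delimiter behaviour of getline is easy to check with a string stream instead of cin. The following is a minimal sketch; the function name split is my own. If std::getline left a custom delimiter in the stream, this loop would never get past the first comma.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split text on a delimiter using std::getline.
std::vector<std::string> split(const std::string& text, char delim) {
    std::istringstream in(text);
    std::vector<std::string> parts;
    std::string piece;
    // Each call reads up to the delimiter, then extracts and discards it,
    // so the next call starts right after it. Spaces are kept in the piece.
    while (std::getline(in, piece, delim)) parts.push_back(piece);
    return parts;
}
```

Splitting "a,b c,d" on ',' should yield the three pieces a, b c and d.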

Because the input for this problem ends at EOF, and I was not yet comfortable with these C++ functions, I read one character at a time with

while(scanf("%c",&ch) != EOF)

Then I looked at Liu Rujia's method:

string s, buf;
while(cin >> s) {
	for(int i = 0; i < s.length(); i++)
		if(isalpha(s[i])) s[i] = tolower(s[i]); else s[i] = ' ';
	stringstream ss(s);
	while(ss >> buf) dict.insert(buf);
}

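He reads the input chunk by chunk with cin, which stops at whitespace, so each read is roughly one word. But the input contains more than spaces and newlines: there are colons, quotation marks and other symbols. So he checks each character with isalpha: letters are converted to lowercase, and everything else is replaced with a space. The result is then fed through a stringstream into buf. In other words, every non-letter character in the input becomes a space, and stringstream then splits out the individual words.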
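
For reference, here is a self-contained sketch of that approach wrapped in a function so it can be tried on any stream. The name collect_words and the unsigned char cast are my additions, not part of the book's code.

```cpp
#include <cctype>
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: gather the distinct lowercase words from a stream.
std::set<std::string> collect_words(std::istream& in) {
    std::set<std::string> dict;
    std::string token, word;
    while (in >> token) {  // one whitespace-delimited chunk at a time
        for (char& c : token) {
            unsigned char uc = static_cast<unsigned char>(c);
            // Letters become lowercase; everything else becomes a separator.
            c = std::isalpha(uc) ? static_cast<char>(std::tolower(uc)) : ' ';
        }
        std::istringstream pieces(token);
        while (pieces >> word) dict.insert(word);  // the set drops duplicates
    }
    return dict;
}
```

Calling collect_words(std::cin) and printing each element followed by a newline would produce the output format the problem asks for, since the set already iterates in alphabetical order.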

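My own approach still feels clumsy. I cannot use this technique fluently yet, but I will keep it in mind. After all, the last time I worked on a problem called Database, I got stuck on the input handling.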

If you spot any mistakes, corrections and discussion are welcome.