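【Problem Description】
Open an English article (stored in an existing file, in.txt) and generate a vocabulary list for it (written to another file, out.txt). The words in the vocabulary list must be stored in ascending dictionary order. Each word consists only of consecutive letters, is entirely lowercase, and appears only once.
Assumptions:
1. The article may not have been typeset: its formatting may be messy, and the text may be incomplete.
2. The article contains no more than 1000 words, and each word is no longer than 50 letters.
【Input Format】
The file in.txt containing the English article is located in the current directory.
【Output Format】
Write the generated vocabulary list, in ascending dictionary order, to the file out.txt in the current directory, with each word on its own line.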
【Sample Input 1】
Suppose the file in.txt contains:
There are two versions of the international standards for C.
The first version was ratified in 1989 by the American National
Standards Institue (ANSI) C standard committee.It is often
referred as ANSI C or C89. The secand C standard was completed
in 1999. This standard is commonly referred to as C99. C99 is a
milestone in C’s evolution into a viable programming language
for numerical and scientific computing.
【Sample Output 1】
The vocabulary list in out.txt should be:
a
american
and
ansi
are
as
by
c
committee
commonly
completed
computing
evolution
first
for
in
institue
international
into
is
it
language
milestone
national
numerical
of
often
or
programming
ratified
referred
s
scientific
secand
standard
standards
the
there
this
to
two
version
versions
viable
was
【Sample Input 2】
Suppose the file in.txt contains:
There are two versions of the international standards for
【Sample Output 2】
The vocabulary list in out.txt should be:
are
for
international
of
standards
the
there
two
versions
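【Sample Explanation】
Every word in in.txt that consists of letters (a single letter on its own also counts as a word) is converted to lowercase and written to out.txt in dictionary order, one word per line, with repeated words written only once.
Note: in Sample 2 the content of in.txt is incomplete, and there is no character at all after the word "for".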
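The extraction rule described above can be sketched on an in-memory string. This is only an illustration under stated assumptions: the helper name `extract_words` and the constant `MAX_LEN` are hypothetical and do not come from the original post, and letters are taken to be whatever `isalpha` accepts in the default "C" locale. Any non-letter, including the end of the text, closes the current word.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_LEN 50 /* longest word allowed by the problem statement */

/*
 * Hypothetical helper: split `text` into lowercase words made of letters only.
 * Stores at most `max_words` words in `out` and returns how many were stored.
 * Letters beyond MAX_LEN in a single word are dropped.
 */
int extract_words(const char *text, char out[][MAX_LEN + 1], int max_words) {
    int count = 0; /* words stored so far */
    int len = 0;   /* length of the word being built */
    size_t i = 0;

    for (;;) {
        unsigned char ch = (unsigned char)text[i];
        if (ch != '\0' && isalpha(ch)) {
            if (len < MAX_LEN && count < max_words) {
                out[count][len++] = (char)tolower(ch);
            }
        } else if (len > 0) {
            /* A non-letter (or the end of the text) finishes the word. */
            out[count][len] = '\0';
            count++;
            len = 0;
        }
        if (ch == '\0') {
            break;
        }
        i++;
    }
    return count;
}
```

With this rule, the fragment `committee.It` from Sample 1 yields the two words `committee` and `it`, and the final `for` in Sample 2 is still captured even though nothing follows it.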
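#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX_WORDS 1000 /* at most 1000 words in the article */
#define MAX_LEN 50     /* at most 50 letters per word */

/* Comparison callback for qsort: each element is a char[MAX_LEN + 1]. */
static int cmp_words(const void *a, const void *b) {
    return strcmp((const char *)a, (const char *)b);
}

int main(void) {
    static char words[MAX_WORDS][MAX_LEN + 1]; /* +1 for the terminating '\0' */
    char word[MAX_LEN + 1];
    int count = 0, len = 0, ch, i;
    FILE *in, *out;

    in = fopen("in.txt", "r");
    if (in == NULL) {
        return 1;
    }
    out = fopen("out.txt", "w");
    if (out == NULL) {
        fclose(in);
        return 1;
    }

    /* Read one character at a time; any non-letter, including EOF,
       ends the current word. This handles "committee.It", "C's" and
       a file whose last word is not followed by any character. */
    do {
        ch = fgetc(in);
        if (ch != EOF && isalpha(ch)) {
            if (len < MAX_LEN) {
                word[len++] = (char)tolower(ch);
            }
        } else if (len > 0) {
            word[len] = '\0';
            if (count < MAX_WORDS) {
                strcpy(words[count++], word);
            }
            len = 0;
        }
    } while (ch != EOF);

    /* Sort into dictionary order. */
    qsort(words, (size_t)count, sizeof words[0], cmp_words);

    /* After sorting, duplicates are adjacent, so print a word only
       when it differs from the previous one. */
    for (i = 0; i < count; i++) {
        if (i == 0 || strcmp(words[i], words[i - 1]) != 0) {
            fprintf(out, "%s\n", words[i]);
        }
    }

    fclose(in);
    fclose(out);
    return 0;
}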
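The last two steps of the program, ordering the words and dropping repeats, can also be isolated into a small helper. The following is a minimal sketch, not the author's method: the names `sort_unique`, `cmp_words` and `MAX_LEN` are assumptions made for illustration, and the words are assumed to sit in rows of a fixed-size array. It sorts with the standard `qsort` and then compacts the array so that the distinct words occupy the first slots.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_LEN 50 /* longest word allowed by the problem statement */

/* Comparison callback for qsort: each element is a char[MAX_LEN + 1]. */
static int cmp_words(const void *a, const void *b) {
    return strcmp((const char *)a, (const char *)b);
}

/*
 * Hypothetical helper: sort `n` words in place, then move the distinct
 * words to the front of the array. Returns the number of distinct words.
 */
int sort_unique(char words[][MAX_LEN + 1], int n) {
    int i, kept = 0; /* index of the last distinct word kept */

    if (n <= 0) {
        return 0;
    }
    qsort(words, (size_t)n, sizeof words[0], cmp_words);
    for (i = 1; i < n; i++) {
        /* After sorting, equal words are adjacent. */
        if (strcmp(words[kept], words[i]) != 0) {
            kept++;
            if (kept != i) {
                strcpy(words[kept], words[i]);
            }
        }
    }
    return kept + 1;
}
```

Comparing each word only with the last one kept never reads past the end of the array, which a comparison with the following element would need to guard against separately.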
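This post shows how to process an English article and generate a vocabulary list in dictionary order. It covers extracting words from an article that may not be typeset, sorting them, and allowing for an incomplete article and the limit on the number of words. Two samples illustrate the input and output formats.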