import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CountWords {
    public static void main(String[] args) {
        // Read the whole input file; try-with-resources closes the reader even on failure.
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader("english.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                // readLine() strips the line terminator, so add a separator;
                // otherwise the last word of one line fuses with the first word of the next.
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            // Also covers FileNotFoundException; stop here instead of hitting a null reader.
            e.printStackTrace();
            return;
        }

        // A word is a run of ASCII letters and apostrophes (for example "don't").
        Pattern pattern = Pattern.compile("[a-zA-Z']+");
        Matcher matcher = pattern.matcher(sb);
        Map<String, Integer> map = new HashMap<String, Integer>();
        int total = 0;
        while (matcher.find()) {
            String word = matcher.group();
            total++;
            // Counting is case-sensitive: "The" and "the" are separate keys.
            Integer num = map.get(word);
            map.put(word, num == null ? 1 : num + 1);
        }

        // Write one "word : count" line per distinct word, then the two summary lines.
        try (PrintWriter pw = new PrintWriter(new FileWriter("result.txt"))) {
            for (Map.Entry<String, Integer> entry : map.entrySet()) {
                pw.println(entry.getKey() + " : " + entry.getValue());
            }
            pw.println("total words : " + total);
            pw.println("different words : " + map.size());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Counting English Words in a Document with Java
Count word frequencies in an English text and write out the results
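This post shows how to use Java to read an English text file, count how often each word occurs, and save the results to another file. A regular expression picks out runs of letters and apostrophes as words, and the program walks through the matches to count each word's occurrences. Finally, it writes the results to a file: each word with its frequency, followed by the total word count and the number of distinct words.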
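To make the counting step easy to try without any files, here is a minimal sketch that runs the same regular expression over an in-memory string. The class name WordFrequencySketch and the helper methods countWords and totalWords are hypothetical names chosen for illustration, and the sketch swaps in Map.merge and a LinkedHashMap (to keep first-seen order) in place of the containsKey/get/put sequence and HashMap used in the post.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the original program.
class WordFrequencySketch {
    // Same word definition as the post: a run of ASCII letters and apostrophes.
    private static final Pattern WORD = Pattern.compile("[a-zA-Z']+");

    /** Counts each word in text; keys keep first-seen order. Case-sensitive. */
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            // merge() stores 1 for a new word, otherwise adds 1 to the existing count.
            counts.merge(matcher.group(), 1, Integer::sum);
        }
        return counts;
    }

    /** Total number of word occurrences, i.e. the sum of all counts. */
    static int totalWords(Map<String, Integer> counts) {
        int total = 0;
        for (int n : counts.values()) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("The cat saw the dog; the dog didn't move.");
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
        System.out.println("total words : " + totalWords(counts));
        System.out.println("different words : " + counts.size());
    }
}
```

For the sample sentence, main prints seven word lines followed by "total words : 9" and "different words : 7". "The" and "the" are counted separately because matching is case-sensitive; lowercasing each match before counting would be one way to fold them together, if that is the behavior you want.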