1. Count the lines, words, and characters in Lear.txt, a passage from King Lear
Poor naked wretches, wheresoe'er you are,
That bide the pelting of this pitiless storm,
How shall your houseless heads and unfed sides,
Your loop'd and window'd raggedness, defend you
From seasons such as these? O, I have ta'en
Too little care of this!
The expected answer is:
Chars: 254
Words: 43
Lines: 6
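Before reading the file-based solution, it may help to pin down the counting rules on an in-memory string. The sketch below assumes that a word is a maximal run of non-whitespace characters and that every line contributes one extra character for its line terminator. The class name CountSketch and the method count are hypothetical names chosen for illustration.

```java
import java.util.Scanner;

final class CountSketch {
    /** Returns {chars, words, lines} for the given text. */
    static int[] count(String text) {
        int chars = 0;
        int words = 0;
        int lines = 0;
        try (Scanner lineScanner = new Scanner(text)) {
            while (lineScanner.hasNextLine()) {
                String line = lineScanner.nextLine();
                lines++;
                chars += line.length() + 1; // assumption: one terminator per line
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    // Whitespace-separated tokens; blank lines add no words
                    words += trimmed.split("\\s+").length;
                }
            }
        }
        return new int[] {chars, words, lines};
    }

    public static void main(String[] args) {
        int[] c = count("O, I have ta'en\nToo little care of this!\n");
        System.out.printf("Chars: %d%nWords: %d%nLines: %d%n", c[0], c[1], c[2]);
    }
}
```

Under this rule punctuation stays attached to its word, so "O," and "ta'en" each count as a single word.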
My solution:
import java.io.*;

public class CountLear {
    public static void run() {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader rd = new BufferedReader(new FileReader("Lear.txt"))) {
            int countC = 0;
            int countW = 0;
            int countL = 0;
            String line;
            while ((line = rd.readLine()) != null) {
                countL++;
                // readLine() strips the line feed, so add it back
                // (assumes every line, including the last, ends with one)
                countC += line.length() + 1;
                // A word starts at a non-whitespace character that opens the
                // line or follows whitespace, so blank lines add no words
                for (int i = 0; i < line.length(); i++) {
                    if (!Character.isWhitespace(line.charAt(i))
                            && (i == 0 || Character.isWhitespace(line.charAt(i - 1)))) {
                        countW++;
                    }
                }
            }
            System.out.printf("Chars:%5d%n", countC);
            System.out.printf("Words:%5d%n", countW);
            System.out.printf("Lines:%5d%n", countL);
        } catch (IOException ex) {
            // Keep the original exception as the cause
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        run();
    }
}

2. Remove the specified letters (regardless of case) from a text file and save the result to a new text file. Several letters can be removed at once.
For example, given the original file TheWonderfulO.txt:
Somewhere a ponderous tower clock slowly
dropped a dozen strokes into the gloom
Storm clouds rode low along the horizon,
and no moon shone. Only a melancholy
chorus of frogs broke the soundlessness.
After removing the letter O, the result TheWnderful.txt is:
Smewhere a pnderus twer clck slwly
drpped a dzen strkes int the glm
Strm cluds rde lw alng the hrizn,
and n mn shne. nly a melanchly
chrus f frgs brke the sundlessness.
My solution:
import java.io.*;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class Wonderful {
    public static void run() {
        Scanner input = new Scanner(System.in);
        System.out.print("Input file: ");
        String rdname = input.next();
        System.out.print("Output file: ");
        String wtname = input.next();
        System.out.print("Letters to banish: ");
        String bs = input.next();
        // A list grows as needed, so the file may have any number of lines
        List<String> buffer = new ArrayList<>();
        try {
            // Read everything first, so input and output may be the same file
            try (BufferedReader rd = new BufferedReader(new FileReader(rdname))) {
                String line;
                while ((line = rd.readLine()) != null) {
                    buffer.add(line);
                }
            }
            try (PrintWriter wt = new PrintWriter(new BufferedWriter(new FileWriter(wtname)))) {
                for (String line : buffer) {
                    for (int j = 0; j < bs.length(); j++) {
                        char c = bs.charAt(j);
                        // replace() returns a new string, so assign the result back
                        line = line.replace(String.valueOf(Character.toUpperCase(c)), "");
                        line = line.replace(String.valueOf(Character.toLowerCase(c)), "");
                    }
                    wt.println(line);
                }
            }
        } catch (IOException ex) {
            // Keep the original exception as the cause
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        run();
    }
}

To solve this problem I used the replace method of the String object. Note that replace does not modify the string it is called on; it returns a new string, so the result has to be assigned back to a variable.
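The point about replace can be seen in isolation with a small sketch. It pulls the per-line removal into a helper and shows that calling replace without using its return value leaves the original string unchanged. The class name BanishSketch and the method banish are hypothetical names, not part of the solution above.

```java
final class BanishSketch {
    /** Returns a copy of line with every letter in letters removed, ignoring case. */
    static String banish(String line, String letters) {
        String result = line;
        for (int i = 0; i < letters.length(); i++) {
            char c = letters.charAt(i);
            // Strings are immutable: replace builds a new String each time
            result = result.replace(String.valueOf(Character.toUpperCase(c)), "");
            result = result.replace(String.valueOf(Character.toLowerCase(c)), "");
        }
        return result;
    }

    public static void main(String[] args) {
        String original = "Only a melancholy";
        original.replace("o", "");                 // return value discarded: no effect
        System.out.println(original);              // prints "Only a melancholy"
        System.out.println(banish(original, "o")); // prints "nly a melanchly"
    }
}
```

Keeping the removal in a method that takes and returns a String also makes it easy to check against the sample output without touching any files.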
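This post works through two practical text-processing exercises: counting the characters, words, and lines in a passage from Shakespeare's King Lear, and removing specified letters from a text file and saving the result. Together they show how to do basic text analysis in Java.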
