Java I/O Streams Explained

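Streams

In Java, an object from which a sequence of bytes can be read is called an input stream, and an object to which a sequence of bytes can be written is called an output stream. The source or destination of the bytes can be a file, a network connection, or a block of memory. The abstract classes InputStream and OutputStream are the basis of the byte-stream classes.

Classes derived from the abstract classes Reader and Writer handle Unicode text. They work in units of char, a two-byte UTF-16 code unit; characters outside the Basic Multilingual Plane take two chars.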

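InputStream and OutputStream

InputStream methods:

int read(): reads one byte and returns it as a value from 0 to 255, or returns -1 at the end of the input source.

int read(byte[] b): reads up to b.length bytes into the array and returns the number of bytes actually read, or -1 at the end of the stream.

int read(byte[] b, int off, int len): reads up to len bytes into b starting at index off, with the same return value.

long skip(long n): skips n bytes in the input stream and returns the number of bytes actually skipped (which may be less than n if the end of the stream is reached).

int available(): returns an estimate of the number of bytes that can be read without blocking.

The read methods and the write methods of OutputStream block until the bytes have actually been read or written. If the stream cannot be accessed immediately (for example, because of the network), the current thread blocks. While it waits for the stream to become available, other threads get the chance to do useful work.

The available method reports how many bytes can currently be read, so a fragment like the following is unlikely to block: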

InputStream is = new FileInputStream("a.txt");
		
int available = is.available();
if(available > 0) {
	byte[] data = new byte[available];
	is.read(data);
}

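When you have finished reading from or writing to a stream, call close to release it; otherwise operating system resources may run out. Closing an output stream also flushes its buffer. If the stream is never closed, the last packet of bytes may never be delivered. You can also call flush to push out buffered output manually.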
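A minimal sketch of the read-and-close pattern, assuming in-memory streams stand in for a file or socket. The class name StreamCopy and the helper copy are hypothetical names chosen for illustration. The loop copes with read returning fewer bytes than the array holds, and try-with-resources (Java 7 and later) calls close automatically.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopy {
    // Copies everything from in to out and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        // read may return fewer bytes than requested; -1 signals end of stream
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3, 4, 5};
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources closes the input stream even if copy throws
        try (InputStream in = new ByteArrayInputStream(source)) {
            long copied = copy(in, sink);
            System.out.println(copied + " bytes copied");
        }
    }
}
```

With a real file, the same try block would wrap new FileInputStream(...) instead of the byte array stream.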

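OutputStream methods:

void write(int n): writes one byte of data.

void write(byte[] b): writes all the bytes in the array b.

void write(byte[] b, int off, int len): writes len bytes from b, starting at index off.

void flush(): flushes the output stream, that is, sends all buffered data to its destination.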
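The effect of flush is easiest to see with a buffered stream. The sketch below is an illustration under stated assumptions: FlushDemo and sizesAroundFlush are hypothetical names, and a ByteArrayOutputStream plays the role of the destination so that the number of delivered bytes can be inspected.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FlushDemo {
    // Returns {bytes at the destination before flush, bytes after flush}.
    static int[] sizesAroundFlush() throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(target, 1024);
        out.write(new byte[] {10, 20, 30}); // small write stays in the 1024-byte buffer
        int before = target.size();         // the destination has not seen the bytes yet
        out.flush();                        // push the buffered bytes through
        int after = target.size();
        out.close();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```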

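Combining Stream Filters

FileInputStream and FileOutputStream give you input and output streams attached to a file, but they only read and write bytes; they cannot read or write numeric types directly.

DataInputStream and DataOutputStream have methods for reading and writing numeric types, for example: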

DataInputStream dis = new DataInputStream(is);
double d = dis.readDouble();

To read numbers from a file, combine the two kinds of stream:

InputStream is = new FileInputStream("a.txt");
DataInputStream dis = new DataInputStream(is);
double d = dis.readDouble();

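Nesting adds further capabilities. By default, streams are not buffered, so every call to read asks the operating system to hand over one more byte. Requesting a block of data and holding it in a buffer is more efficient. To add buffering, build the stream like this: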

DataInputStream dis = new DataInputStream(
new BufferedInputStream(new FileInputStream("a.txt")));

PushbackInputStream

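When several streams are chained together, you need to keep track of the intermediate streams. For example, when reading input you often need to peek at the next byte to see whether it is the value you expect. PushbackInputStream supports this.

PushbackInputStream pbin = new PushbackInputStream(
                           new BufferedInputStream(
                            new FileInputStream("a.txt")));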

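Now you can read the next byte ahead of time and push it back if it is not the one you want.

PushbackInputStream pbin = new PushbackInputStream(
     new BufferedInputStream(new FileInputStream("a.txt")));
int b = pbin.read();   // keep the int: -1 means end of stream
if (b != -1 && b != '>') {
    pbin.unread(b);    // not the byte we wanted, so put it back
}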

If you need pushback for look-ahead and also want to read numbers, wrap one more layer of DataInputStream around it:

DataInputStream dis = new DataInputStream(pbin);
dis.readDouble();
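A self-contained sketch of the look-ahead idea, assuming the input may or may not start with a '>' marker byte. PeekDemo and readBody are hypothetical names, and a byte array replaces the file so the example runs anywhere.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

class PeekDemo {
    // Drops one leading '>' if present and returns the remaining bytes as ASCII text.
    static String readBody(byte[] data) throws IOException {
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(data))) {
            int first = in.read();              // peek at the first byte
            if (first != -1 && first != '>') {
                in.unread(first);               // not the marker: push it back
            }
            StringBuilder body = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {     // pushed-back byte is read again here
                body.append((char) c);
            }
            return body.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readBody(">abc".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(readBody("abc".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Both calls in main yield the same text, because the marker is consumed in the first case and the ordinary byte is pushed back in the second.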

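Text Input and Output

The InputStreamReader class turns an input stream that contains bytes (characters in some character encoding) into a reader that produces Unicode characters.

The OutputStreamWriter class turns a stream of Unicode characters into a stream of bytes, using a chosen character encoding.

To write data in binary format, use DataOutputStream.

To write data in text format, use PrintWriter.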
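The following sketch shows the reader and writer bridges with an explicit encoding. It assumes UTF-8 and uses in-memory streams; TextRoundTrip, encode, and decode are hypothetical names. Naming the charset avoids depending on the platform default.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class TextRoundTrip {
    // Writes the text through PrintWriter -> OutputStreamWriter as UTF-8 bytes.
    static byte[] encode(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (PrintWriter out = new PrintWriter(
                 new OutputStreamWriter(bytes, StandardCharsets.UTF_8))) {
            out.print(text);
        } // closing the writer flushes the encoded bytes into the byte array
        return bytes.toByteArray();
    }

    // Reads the first line back through InputStreamReader -> BufferedReader.
    static String decode(byte[] data) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                 new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode("h\u00e9llo");   // 5 characters, one of them non-ASCII
        System.out.println(data.length + " bytes, decoded: " + decode(data));
    }
}
```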

Before Java 1.5, the only way to process text input was the BufferedReader class. Typical usage:

BufferedReader br = new BufferedReader(new FileReader("f.txt"));
String line = "";
while((line = br.readLine()) != null) {
     System.out.println(line);
}

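However, BufferedReader has no methods for reading numbers, so Scanner is the recommended way to read text input. Scanner is a text scanner that uses regular expressions to parse primitive types and strings; it can scan a string, a file, a stream, and so on.

Typical usage:

Scanner scanner = new Scanner(new FileReader("f.txt"));
while(scanner.hasNextDouble()) {   // stop at the first token that is not a number
    double d = scanner.nextDouble();
    System.out.println(d);
}

Note: to read data line by line, use nextLine(). After a call such as nextInt(), the rest of the current line (including its line terminator) is still unread, so call nextLine() once to move to the next line before reading that line as text.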
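A short sketch of mixing nextInt() and nextLine(). The input layout (a count on the first line, then one name per line) is an assumption made for this example, and ScannerLines and readNames are hypothetical names.

```java
import java.util.Scanner;

class ScannerLines {
    // Assumed input: a count on the first line, followed by that many lines of text.
    static String[] readNames(String input) {
        try (Scanner sc = new Scanner(input)) {
            int count = sc.nextInt();   // consumes the digits but not the line terminator
            sc.nextLine();              // discard the rest of the first line
            String[] names = new String[count];
            for (int i = 0; i < count; i++) {
                names[i] = sc.nextLine();
            }
            return names;
        }
    }

    public static void main(String[] args) {
        for (String name : readNames("2\nAda Lovelace\nAlan Turing\n")) {
            System.out.println(name);
        }
    }
}
```

Without the extra nextLine() call, the first name read would be the empty remainder of the count line.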

Encoding and Decoding

The Charset class uses the character set names standardized by the IANA Character Set Registry. Call the static method forName to obtain a Charset.

Charset cset = Charset.forName("utf-8");
String str = "这是一个Unicode编码的字符串";
ByteBuffer buffer = cset.encode(str);  // encode the string as a UTF-8 byte sequence
byte[] bytes = new byte[buffer.remaining()];
buffer.get(bytes);  // copy only the encoded bytes; array() may include unused capacity

// decode
ByteBuffer buffer2 = ByteBuffer.wrap(bytes);
CharBuffer cBuf = cset.decode(buffer2);
String s2 = cBuf.toString();

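Reading and Writing Binary Data

The DataOutput interface defines the following methods for writing numbers, characters, boolean values, and strings in binary format:

writeChars writeByte writeUTF writeChar writeDouble…

For example, writeInt always writes an integer as a 4-byte binary quantity regardless of how many digits it has, and writeDouble always writes a double as an 8-byte binary quantity. The resulting output is not human-readable, but every value of a given type takes the same amount of space, and reading it back is faster than parsing text.

DataOutputStream implements the DataOutput interface.

The DataInput interface is used to read the data back, with the corresponding methods:

readInt readShort readLong readUTF

A commonly used implementation of DataInput is DataInputStream.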
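The fixed sizes can be checked with a small round trip. This is a sketch using in-memory streams; BinaryRoundTrip and write are hypothetical names. Values must be read back in the same order and with the matching methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class BinaryRoundTrip {
    // Writes an int (4 bytes), a double (8 bytes), and a string via writeUTF
    // (2-byte length prefix followed by the encoded characters).
    static byte[] write(int i, double d, String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(i);
            out.writeDouble(d);
            out.writeUTF(s);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = write(1, 2.5, "hi");
        System.out.println(data.length + " bytes written");
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            // read in the same order the values were written
            System.out.println(in.readInt() + " " + in.readDouble() + " " + in.readUTF());
        }
    }
}
```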

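Random-Access Files: RandomAccessFile

The RandomAccessFile class can read or write data at any position in a file. Disk files are random-access, but a stream of data arriving from the network is not. Typical usage:

RandomAccessFile raf = new RandomAccessFile("a.dat", "r");
raf.seek(3);  // move the file pointer to byte offset 3
long location = raf.getFilePointer(); // get the current file pointer position

RandomAccessFile also implements the DataInput and DataOutput interfaces, so methods such as readInt and writeDouble are available.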
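Random access pays off when records have a fixed size, because the offset of record n can be computed directly. The sketch below assumes a made-up record layout of one int id followed by one double salary; RecordFile and its methods are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class RecordFile {
    // Assumed layout: int id (4 bytes) followed by double salary (8 bytes).
    static final int RECORD_SIZE = Integer.BYTES + Double.BYTES;

    static void writeRecords(File file, int[] ids, double[] salaries) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);                       // start from an empty file
            for (int i = 0; i < ids.length; i++) {
                raf.writeInt(ids[i]);
                raf.writeDouble(salaries[i]);
            }
        }
    }

    // Jumps straight to the salary field of the record at the given index.
    static double readSalary(File file, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek((long) index * RECORD_SIZE + Integer.BYTES);
            return raf.readDouble();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("records", ".dat");
        file.deleteOnExit();
        writeRecords(file, new int[] {1, 2, 3}, new double[] {10.0, 20.5, 30.25});
        System.out.println("salary of record 1: " + readSalary(file, 1));
    }
}
```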

ZIP Archives

ZipInputStream and ZipOutputStream are the input and output streams for ZIP archives. Each file in the archive is represented by a ZipEntry. Typical usage:

ZipInputStream zis = new ZipInputStream(
             new FileInputStream("apache-mina-2.0.2.zip"));
ZipEntry entry = null; 
while((entry = zis.getNextEntry()) != null) {
    if(!entry.isDirectory())  { // print the name only if the entry is not a directory
        System.out.println(entry.getName());
    }
    zis.closeEntry();
}
zis.close();

ZipOutputStream zos = new ZipOutputStream(
              new FileOutputStream("a.zip"));
File dir = new File("dir");
for(File f : dir.listFiles()) {
    ZipEntry en = new ZipEntry(f.getName());
    zos.putNextEntry(en);
    // write the bytes of f to zos here; without this the entry is empty
    zos.closeEntry();
}
zos.close();

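Also related to ZIP archives is ZipFile, which represents a ZIP file. Commonly used methods:

void close()	Closes the ZIP file.
Enumeration<? extends ZipEntry> entries()	Returns an enumeration of the ZIP file entries.
protected void finalize()	Ensures that close is called when the ZIP file is no longer referenced.
ZipEntry getEntry(String name)	Returns the ZIP file entry with the given name, or null if not found.
InputStream getInputStream(ZipEntry entry)	Returns an input stream for reading the contents of the given ZIP file entry.
String getName()	Returns the path name of the ZIP file.
int size()	Returns the number of entries in the ZIP file.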
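A sketch that puts ZipOutputStream and ZipFile together: it writes one entry with real content and then reads it back by name. ZipRoundTrip, writeZip, and readEntry are hypothetical names, and the archive is a temporary file.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class ZipRoundTrip {
    // Creates an archive containing a single entry with the given content.
    static void writeZip(File zip, String entryName, byte[] content) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(zip))) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(content);          // the entry's data goes between put and close
            zos.closeEntry();
        }
    }

    // Returns the entry's bytes, or null if the archive has no such entry.
    static byte[] readEntry(File zip, String entryName) throws IOException {
        try (ZipFile zf = new ZipFile(zip)) {
            ZipEntry entry = zf.getEntry(entryName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = zf.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File zip = File.createTempFile("demo", ".zip");
        zip.deleteOnExit();
        writeZip(zip, "greeting.txt", "hello zip".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(readEntry(zip, "greeting.txt"), StandardCharsets.UTF_8));
    }
}
```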

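Object Serialization

Uses:

1. Saving an object's byte sequence to disk, usually in a file

2. Sending an object's byte sequence over the network

Object streams: ObjectInputStream and ObjectOutputStream

Call readObject() and writeObject(Object). The object's class must implement the Serializable interface or the Externalizable interface. The latter extends the former: a class that implements Externalizable controls its serialization entirely by itself, while a class that implements only Serializable is serialized in the JDK's default way.

If a class that implements Serializable also defines private writeObject(ObjectOutputStream) and readObject(ObjectInputStream) methods, ObjectOutputStream calls the class's writeObject method to serialize it, and ObjectInputStream calls its readObject method to deserialize it.

Note: the writeObject and readObject methods are not declared in the Serializable interface.

A class that implements Externalizable must implement readExternal(ObjectInput in) and writeExternal(ObjectOutput out). ObjectOutputStream calls the class's writeExternal to serialize it; ObjectInputStream first creates an object through the class's no-argument constructor and then calls its readExternal method to deserialize it. Note: the no-argument constructor must be public; otherwise an InvalidClassException is thrown.

Note: if the same object is written more than once to a single object stream, the references read back are equal under ==, because the stream restores them as one object.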
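A small sketch of identity being preserved within one stream: an array holds the same object twice, and after a round trip the two slots still refer to a single (new) object. SharedReference and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class SharedReference {
    // Serializes the array to memory and reads it back from the same bytes.
    static Object[] roundTrip(Object[] pair) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(pair);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Object[]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] shared = {1, 2, 3};
        Object[] copy = roundTrip(new Object[] {shared, shared});
        System.out.println("slots share one object: " + (copy[0] == copy[1]));
        System.out.println("same as the original:   " + (copy[0] == shared));
    }
}
```

The copy is a different object from the original, but the sharing between the two slots survives the round trip.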

If an object has fields that should not be serialized, mark them with the transient modifier. For example:

import java.io.*;

public class ObjectStreamTest {
    public static void main(String[] args) throws Exception {
        Employee e1 = new Employee("mushui", 10, 20.4);
        Manager m1 = new Manager("jingjing", 20, 50.4);
        m1.setSecrety(e1);
        Manager m2 = new Manager("j2", 30 ,  555.0);
        m2.setSecrety(e1);
        Employee[] emps = new Employee[3];
        emps[0] = e1;
        emps[1] = m1;
        emps[2] = m2;

        ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("obj.dat"));
        oos.writeObject(emps);
        oos.close();  // flush and release the file before reading it back

        ObjectInputStream ois = new ObjectInputStream(new FileInputStream("obj.dat"));
        Employee[] empsData = (Employee[])ois.readObject();
        ois.close();

        for(Employee e : empsData) {
            System.out.println(e); // transient fields are not restored: age prints as 0
        }
    }
}


class Employee implements Serializable{
    String name;
    transient int age;  // this field is not serialized
    double salary;

    public Employee(String name, int age, double salary) {
        this.name = name;
        this.age = age;
        this.salary = salary;
    }

    public String getName() {   // used by Manager.toString
        return name;
    }

    public String toString() {
        return name + ":" + age + ":" + salary;
    }
}

class Manager extends Employee {
    Employee secrety;

    public Manager(String string, int i, double d) {
        super(string, i, d);
    }

    public void setSecrety(Employee secrety) {   // called from main
        this.secrety = secrety;
    }

    public String toString() {
        return super.toString() + ":" +secrety.getName();
    }

}

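If you want to serialize an object but are concerned about security, for example a password field, you can transform the value before writing it. First mark the password field as transient as well.

// "encrypt" a byte array by inverting every bit (simple obfuscation, not real cryptography)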

private byte[] change(byte[] buff) {
    for(int i=0; i<buff.length; i++) {
        int b = 0;
        for(int j=0; j<8; j++) {
            int bit = (buff[i]>>j & 1) == 0 ? 1: 0;
            b += (1<<j)*bit;
        }
        buff[i] = (byte)b;
    }
    return buff;
}

private void writeObject(ObjectOutputStream oos) throws IOException {
    oos.defaultWriteObject();  // serialize the non-transient fields the default way
    oos.writeObject(change(password.getBytes()));  // then write the transformed password
}

private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
    ois.defaultReadObject();
    byte[] buff = (byte[])ois.readObject();
    password = new String(change(buff));
}

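Note: default serialization writes the whole object graph. For example, the Manager class has a field of type Employee, so serializing a Manager also serializes that Employee; if Employee in turn refers to other objects that implement Serializable, they are serialized as well. The object graph is traversed recursively. If the graph is complex, this takes a lot of time and space and can even exhaust memory.

For a complex object graph, therefore, mark fields transient and define your own writeObject and readObject methods.

Serializing a singleton defeats the purpose of having only one instance, as the following shows: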

public class Singleton implements Serializable{
    private static final Singleton instance = new Singleton();
    private Singleton(){
    }

    public static Singleton getInstance() {
        return instance;
    }

    public static void main(String[] args) throws Exception {
        Singleton s = Singleton.getInstance();
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        ObjectOutputStream oos  = new ObjectOutputStream(buf);
        oos.writeObject(s);
        ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()));

        Singleton s2 = (Singleton)ois.readObject();
        System.out.println(s ==s2);
    }
}

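The program prints false, which shows that there are now two instances of the Singleton class.

To fix this, add a readResolve() method to the Singleton class:

private Object readResolve() {
    return instance;
}

The readResolve method replaces the object produced by deserialization. Its counterpart, writeReplace, designates the object to serialize: it returns an Object, and that object is the one actually written to the stream.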

Example of implementing the Externalizable interface:

class Emp implements Externalizable {
    private String name;
    private int age;

    public Emp() {
        // Externalizable requires a public no-argument constructor
    }

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeObject(name);
        out.writeInt(age);
    }

    public void readExternal(ObjectInput in) throws IOException,
        ClassNotFoundException {
        name = (String)in.readObject();
        age = in.readInt();
    }

}

Object streams are also commonly used to make a deep copy of an object:

ByteArrayOutputStream os = new ByteArrayOutputStream();
ObjectOutputStream oss = new ObjectOutputStream(os);
oss.writeObject(emps);

ByteArrayInputStream bis = new ByteArrayInputStream(os.toByteArray());
ObjectInputStream oiis = new ObjectInputStream(bis);
oiis.readObject();
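The same steps can be wrapped in a reusable helper. This is a sketch, not a library method: DeepCopy and copy are hypothetical names, and the approach only works when every object in the graph is serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

class DeepCopy {
    // Serializes the object to memory and deserializes it into an independent copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<int[]> list = new ArrayList<>();
        list.add(new int[] {1, 2});
        ArrayList<int[]> clone = copy(list);
        clone.get(0)[0] = 99;                    // changing the copy...
        System.out.println(list.get(0)[0]);      // ...leaves the original element untouched
    }
}
```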

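The File Class

mkdir(): creates the directory named by this File object and returns true on success.

mkdirs(): unlike mkdir, this method also creates any missing parent directories.

The FilenameFilter interface filters files by name. For example: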

class FileNameFilterImpl implements FilenameFilter {
public boolean accept(File dir, String name) {
    return name.endsWith(".txt");
    }
}
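A filter is used by passing it to File.list or File.listFiles. The sketch below writes the same ".txt" filter as a lambda, since FilenameFilter has a single abstract method (Java 8 and later); ListTextFiles and textFiles are hypothetical names.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

class ListTextFiles {
    // Returns the sorted names of the .txt files directly inside dir.
    static String[] textFiles(File dir) {
        FilenameFilter onlyTxt = (d, name) -> name.endsWith(".txt");
        String[] names = dir.list(onlyTxt);   // null if dir is not a readable directory
        if (names == null) {
            return new String[0];
        }
        Arrays.sort(names);                   // list() makes no ordering guarantee
        return names;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("filter-demo").toFile();
        new File(dir, "a.txt").createNewFile();
        new File(dir, "b.log").createNewFile();
        new File(dir, "c.txt").createNewFile();
        System.out.println(Arrays.toString(textFiles(dir)));
    }
}
```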

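Memory-Mapped Files

Most operating systems can use virtual memory to "map" a file, or part of a file, into memory. The file can then be accessed as if it were an in-memory array, which is much faster. Timings for one fairly large file:

Method	Time
Random-access file	162s
Plain input stream	110s
Buffered input stream	9.9s
Memory-mapped file	7.2s

The general steps are:

1. Obtain a channel from the file. A channel is an abstraction for a disk file that gives access to operating system features such as memory mapping, file locking, and fast data transfer between files. Call the getChannel method of FileInputStream, FileOutputStream, or RandomAccessFile to get one.

2. Call the map method of FileChannel to get a MappedByteBuffer from the channel. You specify the region of the file to map and a mapping mode. Three modes are supported:

a) FileChannel.MapMode.READ_ONLY: the buffer is read-only.

b) FileChannel.MapMode.READ_WRITE: the buffer is writable, and changes are written back to the file at some point.

c) FileChannel.MapMode.PRIVATE: the buffer is writable, but changes are private to this buffer and are not propagated to the file.

3. Once you have the buffer, read and write data with the methods of ByteBuffer and its superclass Buffer. Buffers support both sequential and random access; relative get and put operations advance the position.

Typical code: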

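FileInputStream fis = new FileInputStream("a.txt");
FileChannel fc = fis.getChannel();

int position = 0;
long size = Math.min(1024, fc.size());  // a read-only mapping cannot extend past the end of the file

// a channel obtained from a FileInputStream is read-only, so map it READ_ONLY
MappedByteBuffer  mbb = fc.map(FileChannel.MapMode.READ_ONLY, position, size);
// sequential access
while(mbb.hasRemaining()) {
    byte b = mbb.get();
    // process b
    System.out.println(b);
}

// random access (indices are relative to the start of the mapped region)
for(int i=0; i<mbb.limit(); i++) {
    byte b = mbb.get(i);
    // process b
    System.out.println(b);
}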

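The java.util.zip package contains the CRC32 class, which computes a 32-bit cyclic redundancy checksum. The value is often used to check whether a file has been corrupted, because corruption is very likely to change the checksum. The method is as follows: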

InputStream is = new FileInputStream(filename);
CRC32 crc = new CRC32();

int c;
while((c = is.read()) != -1) {
    crc.update(c);
}

return crc.getValue();

Checksum with a BufferedInputStream:

InputStream is = new BufferedInputStream(new FileInputStream(filename));

Checksum with a RandomAccessFile:

RandomAccessFile fis = new RandomAccessFile(filename, "r");
long length = fis.length();

CRC32 crc = new CRC32();

int c;
for(long l = 0; l<length; l++) {
    fis.seek(l);
    c = fis.readByte();
    crc.update(c);
}

return crc.getValue();

With a FileChannel:

FileInputStream is = new FileInputStream(filename);
FileChannel fc = is.getChannel();

int size = (int) fc.size();
MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, 0, size);

CRC32 crc = new CRC32();

for( int i=0; i<size; i++) {
    byte b = mbb.get();
    crc.update(b);
}
return crc.getValue();
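Another variant, not part of the timing comparison in this article, reads the stream in chunks and feeds each chunk to CRC32.update(byte[], int, int). This is a sketch with hypothetical names (Checksum, crc32); the 8192-byte chunk size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class Checksum {
    // Computes the CRC-32 of everything remaining in the stream, chunk by chunk.
    static long crc32(InputStream in) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            crc.update(buf, 0, n);   // only the n bytes actually read
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        // "123456789" is the usual CRC-32 check input; its checksum is cbf43926
        System.out.println(Long.toHexString(crc32(new ByteArrayInputStream(data))));
    }
}
```

For a file, pass new FileInputStream(filename) inside a try-with-resources block.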

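Buffer Data Structure

Buffer is an abstract class. Its subclasses include ByteBuffer, CharBuffer, DoubleBuffer, FloatBuffer, IntBuffer, LongBuffer, and ShortBuffer.

Note: StringBuffer has nothing to do with these buffers.

Every buffer has:

1. A capacity, which never changes

2. A read/write position, at which the next value is read or written

3. A limit, beyond which reading or writing is meaningless

4. An optional mark, used to repeat a read or write operation

These values satisfy: 0 ≤ mark ≤ position ≤ limit ≤ capacity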
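How position and limit move is easiest to follow on a small buffer. In this sketch BufferState and state are hypothetical helper names; the buffer is an ordinary heap ByteBuffer of capacity 8.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.util.Arrays;

class BufferState {
    // Returns {position, limit, capacity} of the buffer.
    static int[] state(Buffer b) {
        return new int[] {b.position(), b.limit(), b.capacity()};
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        System.out.println(Arrays.toString(state(buf)));  // [0, 8, 8] freshly allocated

        buf.put((byte) 1).put((byte) 2).put((byte) 3);    // each relative put advances position
        System.out.println(Arrays.toString(state(buf)));  // [3, 8, 8]

        buf.flip();                                       // limit = old position, position = 0
        System.out.println(Arrays.toString(state(buf)));  // [0, 3, 8]

        buf.get();                                        // a relative get advances position too
        System.out.println(Arrays.toString(state(buf)));  // [1, 3, 8]
    }
}
```

The flip call is what switches a buffer from being filled to being drained: it sets the limit to the amount written so reads stop there.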

Reposted from: https://my.oschina.net/TQNWvb/blog/159101