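1. The Hadoop data types are shown in the figure below:
As the Writable hierarchy diagram above shows, the vast majority of the data types implement the Writable and WritableComparable interfaces, so we look at these two interfaces first, working from the top down.
The Writable interface is defined as follows: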
- package org.apache.hadoop.io;
- import java.io.DataOutput;
- import java.io.DataInput;
- import java.io.IOException;
- public interface Writable {
-     /*
-      * Serializes this object's fields and writes the resulting bytes to the output stream out.
-      * Parameters:
-      * out - the output stream that receives the serialized bytes.
-      */
-     void write(DataOutput out) throws IOException;
-     /*
-      * Deserializes the bytes read from the input stream in and stores the result in this object's fields.
-      * Parameters:
-      * in - the input stream the bytes are read from.
-      */
-     void readFields(DataInput in) throws IOException;
- }
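As a rough illustration of the write/readFields contract, the sketch below sends an object through an in-memory byte array and reads it back. So that it runs without Hadoop on the classpath, it declares a local stand-in interface, `SimpleWritable`, that mirrors the two methods above. `SimpleWritable`, `Counter`, and `RoundTripDemo` are hypothetical names and are not part of Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in that mirrors org.apache.hadoop.io.Writable (assumption: used only to avoid the Hadoop dependency).
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical type with a single int field.
class Counter implements SimpleWritable {
    int count;

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(count); // 4 bytes, high byte first
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        count = in.readInt();
    }
}

class RoundTripDemo {
    // Serializes the object into a byte array.
    static byte[] toBytes(SimpleWritable w) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Overwrites the fields of an existing object with the values stored in bytes.
    static void fromBytes(SimpleWritable w, byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            w.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a Counter holding value, reads it into a fresh Counter, and returns the recovered value.
    static int roundTrip(int value) {
        Counter original = new Counter();
        original.count = value;
        Counter copy = new Counter();
        fromBytes(copy, toBytes(original));
        return copy.count;
    }
}
```

Note that readFields fills in an existing object rather than returning a new one, which is why the sketch creates an empty `Counter` before reading.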
The WritableComparable interface is defined as follows:
- package org.apache.hadoop.io;
- public interface WritableComparable<T> extends Writable, Comparable<T> {
- }
WritableComparable declares no methods of its own. From Writable it inherits:
- void write(DataOutput out) throws IOException;
- void readFields(DataInput in) throws IOException;
It also inherits the method of Comparable. Comparable is an interface in the java.lang package, and it has only one method.
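- int compareTo(T other);
- /*
-  * Compares this object with the specified object other for order. Returns a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.
-  * Parameters: other - the object to compare with.
-  * Returns: a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.
-  */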
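To make the compareTo contract concrete, here is a minimal sketch in which a key type defines its order through compareTo and `Arrays.sort` relies on it. Only the Comparable half of WritableComparable is shown; `IntKey` and `SortDemo` are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;

// Hypothetical key type; its natural order is the numeric order of value.
class IntKey implements Comparable<IntKey> {
    final int value;

    IntKey(int value) {
        this.value = value;
    }

    @Override
    public int compareTo(IntKey other) {
        // Negative, zero, or positive as this key is less than, equal to, or greater than other.
        return Integer.compare(value, other.value);
    }
}

class SortDemo {
    // Wraps the values in IntKey objects, sorts them by compareTo, and unwraps them again.
    static int[] sortedValues(int... values) {
        IntKey[] keys = new IntKey[values.length];
        for (int i = 0; i < values.length; i++) {
            keys[i] = new IntKey(values[i]);
        }
        Arrays.sort(keys); // uses IntKey.compareTo
        int[] result = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            result[i] = keys[i].value;
        }
        return result;
    }
}
```

`Integer.compare` is used instead of subtraction because subtracting two ints can overflow and report the wrong sign.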
2. The IntWritable class is defined as follows:
- package org.apache.hadoop.io;
- import java.io.*;
- /** A WritableComparable for ints. */
- public class IntWritable implements WritableComparable {
-     private int value;
-     public IntWritable() {}
-     public IntWritable(int value) { set(value); }
-     /** Set the value of this IntWritable. */
-     public void set(int value) { this.value = value; }
-     /** Return the value of this IntWritable. */
-     public int get() { return value; }
-     public void readFields(DataInput in) throws IOException {
-         value = in.readInt();
-     }
-     public void write(DataOutput out) throws IOException {
-         out.writeInt(value);
-     }
-     /** Returns true iff o is an IntWritable with the same value. */
-     public boolean equals(Object o) {
-         if (!(o instanceof IntWritable))
-             return false;
-         IntWritable other = (IntWritable)o;
-         return this.value == other.value;
-     }
-     public int hashCode() {
-         return value;
-     }
-     /** Compares two IntWritables. */
-     public int compareTo(Object o) {
-         int thisValue = this.value;
-         int thatValue = ((IntWritable)o).value;
-         return (thisValue<thatValue ? -1 : (thisValue==thatValue ? 0 : 1));
-     }
-     public String toString() {
-         return Integer.toString(value);
-     }
-     /** A Comparator optimized for IntWritable. */
-     public static class Comparator extends WritableComparator {
-         public Comparator() {
-             super(IntWritable.class);
-         }
-         public int compare(byte[] b1, int s1, int l1,
-                            byte[] b2, int s2, int l2) {
-             int thisValue = readInt(b1, s1);
-             int thatValue = readInt(b2, s2);
-             return (thisValue<thatValue ? -1 : (thisValue==thatValue ? 0 : 1));
-         }
-     }
-     static { // register this comparator
-         WritableComparator.define(IntWritable.class, new Comparator());
-     }
- }
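The nested Comparator above compares two keys directly in their serialized form, without first building IntWritable objects. The sketch below imitates that idea with plain JDK classes. `RawIntCompareDemo` and its methods are hypothetical names; the `readInt` helper here is a local re-implementation that assumes the big-endian 4-byte layout written by `DataOutput.writeInt`, and is not Hadoop's own code.

```java
import java.nio.ByteBuffer;

class RawIntCompareDemo {
    // Decodes a big-endian 4-byte int that starts at offset start.
    static int readInt(byte[] bytes, int start) {
        return ((bytes[start] & 0xff) << 24)
             | ((bytes[start + 1] & 0xff) << 16)
             | ((bytes[start + 2] & 0xff) << 8)
             | (bytes[start + 3] & 0xff);
    }

    // Compares two serialized ints in place; s1 and s2 are the offsets of the values in their buffers.
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        int thisValue = readInt(b1, s1);
        int thatValue = readInt(b2, s2);
        return Integer.compare(thisValue, thatValue);
    }

    // Produces the same 4 bytes that DataOutput.writeInt would write (ByteBuffer is big-endian by default).
    static byte[] serialize(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }
}
```

Decoding the int is still necessary here: comparing the raw bytes one by one as unsigned values would place negative numbers after positive ones, because the sign bit makes their first byte larger.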
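3. A custom data type generally has to implement the Writable interface, because data must be serialized and deserialized whenever it is sent over the network or written to persistent storage. If the type is to be used as a key, or needs to be compared for ordering, it must also implement the WritableComparable interface.
For example: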
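- import java.io.DataInput;
- import java.io.DataOutput;
- import java.io.IOException;
- import org.apache.hadoop.io.WritableComparable;
- public class Point3D implements WritableComparable<Point3D>
- {
-     private float x, y, z;
-     public float getX() { return x; }
-     public float getY() { return y; }
-     public float getZ() { return z; }
-     public void readFields(DataInput in) throws IOException
-     {
-         // Fields are read in the same order in which write() emits them.
-         x = in.readFloat();
-         y = in.readFloat();
-         z = in.readFloat();
-     }
-     public void write(DataOutput out) throws IOException
-     {
-         out.writeFloat(x);
-         out.writeFloat(y);
-         out.writeFloat(z);
-     }
-     public int compareTo(Point3D p)
-     {
-         // Compares the current point this(x,y,z) with the given point p(x,y,z)
-         // and returns a negative value (less than), 0 (equal), or a positive value (greater than).
-         // The ordering used here (x first, then y, then z) is an assumption; the original leaves it unspecified.
-         int c = Float.compare(x, p.x);
-         if (c == 0) c = Float.compare(y, p.y);
-         if (c == 0) c = Float.compare(z, p.z);
-         return c;
-     }
- }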
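To show how such a type might be exercised, the following sketch copies a point through its serialized form and compares two points. It is self-contained and does not use Hadoop: `PointSketch` is a hypothetical stand-in for Point3D with an added constructor, and the coordinate-by-coordinate ordering (x, then y, then z) is an assumption made for this illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for Point3D that needs no Hadoop classes.
class PointSketch implements Comparable<PointSketch> {
    float x, y, z;

    PointSketch() {} // empty instance to be filled by readFields

    PointSketch(float x, float y, float z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    void write(DataOutput out) throws IOException {
        out.writeFloat(x);
        out.writeFloat(y);
        out.writeFloat(z);
    }

    void readFields(DataInput in) throws IOException {
        // Must read in the same order as write().
        x = in.readFloat();
        y = in.readFloat();
        z = in.readFloat();
    }

    @Override
    public int compareTo(PointSketch p) {
        // Assumed ordering: compare x, then y, then z.
        int c = Float.compare(x, p.x);
        if (c == 0) c = Float.compare(y, p.y);
        if (c == 0) c = Float.compare(z, p.z);
        return c;
    }

    // Serializes p to bytes and reads those bytes into a new instance.
    static PointSketch copyViaBytes(PointSketch p) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            p.write(new DataOutputStream(buffer));
            PointSketch copy = new PointSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If write and readFields disagreed on the field order, the copy would come back with its coordinates swapped, so keeping the two methods symmetric is the main thing to check in a custom type.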