标签:
以IntWritable为例介绍,定制writable的步骤
//WritableComparable接口继承了writable接口和comparable接口 public class IntWritable implements WritableComparable<IntWritable> { //定义普通java类型的成员变量 private int value; //成员变量的set方法 public void set(int value) { this.value = value; } //成员变量的get方法 public int get() { return value; } //无参构造方法,为MR框架反射机制所调用 public IntWritable() {} //有参构造方法 public IntWritable(int value) { set(value); } //反序列化方法 public void readFields(DataInput in) throws IOException { value = in.readInt(); } //序列化方法 public void write(DataOutput out) throws IOException { out.writeInt(value); } //覆写equals()方法 public boolean equals(Object o) { if (!(o instanceof IntWritable)) return false; IntWritable other = (IntWritable)o; return this.value == other.value; } //覆写hashCode()方法 public int hashCode() { return value; } //覆写toString()方法 public String toString() { return Integer.toString(value); } //覆写comparable接口中的compareTo()方法【默认升序】 public int compareTo(IntWritable o) { int thisValue = this.value; int thatValue = o.value; return (thisValue<thatValue ? -1 : (thisValue==thatValue ? 0 : 1)); } // 1. 定义内部类Comparator【比较器】继承自WritableComparator类 public static class Comparator extends WritableComparator { // 2. 不可缺少的无参构造函数,反射机制调用 public Comparator() { super(IntWritable.class); } // 3. 覆写 字节流层面的比较排序 public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) { //返回 字符数组b1 的编码值 int thisValue = readInt(b1, s1); int thatValue = readInt(b2, s2); return (thisValue<thatValue ? -1 : (thisValue==thatValue ? 0 : 1)); } } //向WritableComparator类注册定制的writable类【Haoop自动调用上述的比较器】 static { WritableComparator.define(IntWritable.class, new Comparator()); } }
1) 在定制Writable类中实现字节流层面的比较时,一般不直接继承RawComparator类,而是继承其子类WritableComparator,因为子类为我们提供了一些有用的工具方法,比如从字节数组中读取int、long和vlong等值。覆写 public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2)方法。
2) 当然编写完compare()方法之后,不要忘了为定制的Writable类注册编写的RawComparator类。
3) 对于代码中的readInt()工具方法的具体实现:
/** Parse an integer from a byte array. */ public static int readInt(byte[] bytes, int start) { return (((bytes[start ] & 0xff) << 24) + ((bytes[start+1] & 0xff) << 16) + ((bytes[start+2] & 0xff) << 8) + ((bytes[start+3] & 0xff))); }
标签:
原文地址:http://www.cnblogs.com/skyl/p/4732281.html