Tags: serialization, Serializable, Writable, Hadoop serialization
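Serialization is the process of converting an object's state into a form that can be stored or transmitted (a byte stream). During serialization, the object writes its current state to temporary or persistent storage. Later, the object can be recreated by reading its state back from that storage, that is, by deserializing it.
Serialization generally has three uses: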
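    public interface Serializable { }

To serialize an object, you only need to wrap an OutputStream in an ObjectOutputStream (an output stream) and call writeObject(). During serialization, the class of the object, the class signature, and the values of all non-transient and non-static fields of the class and of all its superclasses are written.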
    Date d = new Date();
    OutputStream out = new ByteArrayOutputStream();
    ObjectOutputStream objOut = new ObjectOutputStream(out);
    objOut.writeObject(d);

To serialize a primitive value, ObjectOutputStream also provides methods such as writeBoolean and writeByte.
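The snippet above only writes the object. A minimal sketch of the full round trip, using an in-memory byte array, might look like the following. The class name SerializationRoundTrip and its helper methods are made up for illustration; only the java.io calls come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

// Hypothetical helper class, for illustration only.
class SerializationRoundTrip {
    // Serialize an object and a primitive into one in-memory byte array.
    static byte[] serialize(Date d, boolean flag) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ObjectOutputStream objOut = new ObjectOutputStream(out)) {
            objOut.writeObject(d);
            objOut.writeBoolean(flag);
        }
        return out.toByteArray();
    }

    // Read the values back in the same order they were written.
    static Object[] deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream objIn = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            Date d = (Date) objIn.readObject();
            boolean flag = objIn.readBoolean();
            return new Object[] { d, flag };
        }
    }

    public static void main(String[] args) throws Exception {
        Date original = new Date();
        Object[] restored = deserialize(serialize(original, true));
        System.out.println(original.equals(restored[0]) + " " + restored[1]);
    }
}
```

Values must be read back in the order they were written, because the stream carries no field names for the individual writeObject and writeBoolean calls.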
    public interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

For example, if we need a class that represents a time period, we can write it like this:
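    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.Date;
    import org.apache.hadoop.io.Writable;

    public class StartEndDate implements Writable {
        private Date startDate;
        private Date endDate;

        // Write both dates as epoch milliseconds (8 bytes each).
        @Override
        public void write(DataOutput out) throws IOException {
            out.writeLong(startDate.getTime());
            out.writeLong(endDate.getTime());
        }

        // Read the fields back in the same order they were written.
        @Override
        public void readFields(DataInput in) throws IOException {
            startDate = new Date(in.readLong());
            endDate = new Date(in.readLong());
        }

        public Date getStartDate() { return startDate; }
        public void setStartDate(Date startDate) { this.startDate = startDate; }
        public Date getEndDate() { return endDate; }
        public void setEndDate(Date endDate) { this.endDate = endDate; }
    }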
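To see what this write/readFields pair produces, here is a rough sketch that runs without Hadoop on the classpath. SimpleWritable is a hypothetical local stand-in that copies the two method signatures of Hadoop's Writable, and DateRange and WritableRoundTrip are illustrative names, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Date;

// Hypothetical stand-in with the same two methods as Hadoop's Writable,
// declared locally so the sketch compiles without Hadoop.
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Illustrative counterpart of StartEndDate.
class DateRange implements SimpleWritable {
    Date start;
    Date end;

    DateRange() { }

    DateRange(Date start, Date end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(start.getTime());
        out.writeLong(end.getTime());
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        start = new Date(in.readLong());
        end = new Date(in.readLong());
    }
}

class WritableRoundTrip {
    // Serialize into an in-memory buffer.
    static byte[] toBytes(SimpleWritable w) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        w.write(new DataOutputStream(buffer));
        return buffer.toByteArray();
    }

    // Populate an existing instance from serialized bytes.
    static void fromBytes(SimpleWritable w, byte[] bytes) throws IOException {
        w.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
    }

    public static void main(String[] args) throws IOException {
        DateRange original = new DateRange(new Date(1000L), new Date(2000L));
        byte[] bytes = toBytes(original);
        DateRange copy = new DateRange();
        fromBytes(copy, bytes);
        System.out.println(bytes.length + " bytes, start=" + copy.start.getTime());
    }
}
```

The serialized form is exactly 16 bytes (two longs), with no class name or signature, which is the main difference from the ObjectOutputStream approach.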
    public interface RawComparator<T> extends Comparator<T> {
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }
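The byte-level compare method works directly on serialized records, so keys can be ordered without deserializing them. The following is a rough sketch of that idea using only the JDK (Java 9 or later). RawCompareSketch and its methods are hypothetical names, and Arrays.compareUnsigned is used here as a stand-in that approximates a lexicographic unsigned byte comparison; it is not Hadoop's own implementation.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical class, for illustration only.
class RawCompareSketch {
    // Encode two timestamps the way DataOutput.writeLong does:
    // 8 bytes each, most significant byte first.
    static byte[] encode(long start, long end) {
        return ByteBuffer.allocate(16).putLong(start).putLong(end).array();
    }

    // Compare only the first 8 bytes (the start time) of two serialized
    // records, treating bytes as unsigned, without deserializing either one.
    static int compareStart(byte[] b1, int s1, byte[] b2, int s2) {
        return Arrays.compareUnsigned(b1, s1, s1 + 8, b2, s2, s2 + 8);
    }

    public static void main(String[] args) {
        byte[] a = encode(1000L, 5000L);
        byte[] b = encode(2000L, 3000L);
        System.out.println(Integer.signum(compareStart(a, 0, b, 0)));
    }
}
```

One caveat of this design: unsigned byte order matches numeric order only for non-negative longs. A negative timestamp (a date before 1970) has its high bit set and would sort after every positive one.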
    class MyGrouper implements RawComparator<StartEndDate> {
        // Object-level comparison: order by start time only,
        // consistent with the byte-level comparison below.
        @Override
        public int compare(StartEndDate o1, StartEndDate o2) {
            return Long.compare(o1.getStartDate().getTime(), o2.getStartDate().getTime());
        }

        // Byte-level comparison: the first 8 bytes of each serialized
        // record hold the start time.
        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            return WritableComparator.compareBytes(b1, s1, 8, b2, s2, 8);
        }
    }

Then set it on the job.
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof StartEndDate))
            return false;
        StartEndDate s = (StartEndDate) obj;
        return startDate.getTime() == s.startDate.getTime()
            && endDate.getTime() == s.endDate.getTime();
    }

    @Override
    public int hashCode() {
        int result = 17; // any prime number
        result = 31 * result + startDate.hashCode();
        result = 31 * result + endDate.hashCode();
        return result;
    }

P.S. The equals and hashCode methods should also null-check the member fields; this still needs to be fixed later.
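One possible way to add those null checks is to delegate to java.util.Objects. This is only a sketch: NullSafeRange is a hypothetical class used for illustration, and Objects.hash produces different hash values than the hand-written 17/31 formula above, although the equals/hashCode contract still holds.

```java
import java.util.Date;
import java.util.Objects;

// Hypothetical class mirroring the two fields of StartEndDate.
class NullSafeRange {
    private final Date startDate;
    private final Date endDate;

    NullSafeRange(Date startDate, Date endDate) {
        this.startDate = startDate;
        this.endDate = endDate;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof NullSafeRange)) return false;
        NullSafeRange other = (NullSafeRange) obj;
        // Objects.equals returns true when both sides are null and
        // false when exactly one is, so no explicit null checks are needed.
        return Objects.equals(startDate, other.startDate)
            && Objects.equals(endDate, other.endDate);
    }

    @Override
    public int hashCode() {
        // Objects.hash treats a null field as 0 instead of throwing.
        return Objects.hash(startDate, endDate);
    }
}
```

Note that this only covers equals and hashCode. The write method would still throw a NullPointerException on a null date, so a real fix would need to decide how a missing date is serialized.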
Original article: http://blog.csdn.net/zq602316498/article/details/45190175