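To be honest, before reading the Hadoop YARN source code, my own use of Java enums paled in comparison. The way YARN implements events is worth learning from in terms of readability, maintainability, and extensibility.
Before analyzing the source code, let's work up to how YARN defines an event. Take the job-start event as an example: many people would define it as a constant in a class file, like this: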
class Constants {
  public static final String JOB_START_EVENT = "jobStart";
}
Or simply use an enum, like this:
enum Enums {
  JOB_START_EVENT("jobStart");
  private String name;
  private Enums(String name) {
    this.name = name;
  }
}
Later, when a job-end event is added, the code becomes:
class Constants {
  public static final String JOB_START_EVENT = "jobStart";
  public static final String JOB_END_EVENT = "jobEnd";
}
Or:
enum Enums {
  JOB_START_EVENT("jobStart"),
  JOB_END_EVENT("jobEnd");
  private String name;
  private Enums(String name) {
    this.name = name;
  }
}
Then, once task events are added as well, the code becomes:
class Constants {
  public static final String JOB_START_EVENT = "jobStart";
  public static final String JOB_END_EVENT = "jobEnd";
  public static final String TASK_START_EVENT = "taskStart";
  public static final String TASK_END_EVENT = "taskEnd";
  // constants for various other concepts
}
Or:
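enum Enums {
  JOB_START_EVENT("jobStart"),
  JOB_END_EVENT("jobEnd"),
  // enum constants for various other concepts
  TASK_START_EVENT("taskStart"),
  TASK_END_EVENT("taskEnd");
  private String name;
  private Enums(String name) {
    this.name = name;
  }
}
As more and more constants are added, you will find that this approach becomes less and less maintainable. All sorts of concepts are mixed together and the result is a mess. You might say: I would not be that naive. I would divide and conquer, putting the constants for jobs, tasks, and other concepts into separate files, one per business concept, like this: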
class JobConstants {
  public static final String JOB_START_EVENT = "jobStart";
  public static final String JOB_END_EVENT = "jobEnd";
}

class TaskConstants {
  public static final String TASK_START_EVENT = "taskStart";
  public static final String TASK_END_EVENT = "taskEnd";
}
Or:
enum JobEnums {
  JOB_START_EVENT("jobStart"),
  JOB_END_EVENT("jobEnd");
  private String name;
  private JobEnums(String name) {
    this.name = name;
  }
}

enum TaskEnums {
  TASK_START_EVENT("taskStart"),
  TASK_END_EVENT("taskEnd");
  private String name;
  private TaskEnums(String name) {
    this.name = name;
  }
}
If each event also needs a numeric code, the enums become:
enum JobEnums {
  JOB_START_EVENT(10, "jobStart"),
  JOB_END_EVENT(20, "jobEnd");
  private int code;
  private String name;
  private JobEnums(int code, String name) {
    this.code = code;
    this.name = name;
  }
}

enum TaskEnums {
  TASK_START_EVENT(110, "taskStart"),
  TASK_END_EVENT(120, "taskEnd");
  private int code;
  private String name;
  private TaskEnums(int code, String name) {
    this.code = code;
    this.name = name;
  }
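}
And once different events need different attributes and methods (here, a description for jobs and a timestamp for tasks), the enums drift further apart:
enum JobEnums {
  JOB_START_EVENT(10, "jobStart", "job start description"),
  JOB_END_EVENT(20, "jobEnd", "job end description");
  private int code;
  private String name;
  private String description;
  private JobEnums(int code, String name, String description) {
    this.code = code;
    this.name = name;
    this.description = description;
  }
  // Enum.hashCode() is final and cannot be overridden, so the custom hash gets its own name.
  public int customHashCode() {
    return this.name.hashCode() + this.description.hashCode();
  }
}

enum TaskEnums {
  // The L suffix is required: these values do not fit in an int literal.
  TASK_START_EVENT(110, "taskStart", 1460977775087L),
  TASK_END_EVENT(120, "taskEnd", 1460977775088L);
  private int code;
  private String name;
  private long timestamp;
  private TaskEnums(int code, String name, long timestamp) {
    this.code = code;
    this.name = name;
    this.timestamp = timestamp;
  }
  // Enum.hashCode() is final and cannot be overridden, so the custom hash gets its own name.
  public int customHashCode() {
    return this.name.hashCode();
  }
}
Event = event name + event type
For example: job-start event = job event + job event type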
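Hadoop 2.6.0 has a wide variety of events. The most common include ContainerEvent, ApplicationEvent, JobEvent, RMAppEvent, RMAppAttemptEvent, TaskEvent, and TaskAttemptEvent. To address the problems that enums and constants have with readability, maintainability, reusability, and extensibility, Hadoop abstracts events as follows: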
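/**
 * Interface defining events api.
 *
 */
@Public
@Evolving
public interface Event<TYPE extends Enum<TYPE>> {
  TYPE getType();
  long getTimestamp();
  String toString();
}
The interface above says that every concrete event carries an event type that must be an enum (the generic parameter TYPE), a timestamp, and a toString() method. Note that the event itself is an ordinary object; only its type is an enum.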
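To make the effect of the bound TYPE extends Enum<TYPE> more concrete, here is a minimal sketch. It uses a simplified local copy of the Event interface with the Hadoop annotations omitted. DemoEventType, DemoEvent, and EventTypeCounter are hypothetical names invented for this illustration and are not part of Hadoop.

```java
import java.util.EnumMap;
import java.util.Map;

// Simplified local copy of the interface shown above (Hadoop annotations omitted).
interface Event<TYPE extends Enum<TYPE>> {
  TYPE getType();
  long getTimestamp();
  String toString();
}

// Hypothetical event type, for illustration only.
enum DemoEventType { STARTED, FINISHED }

// Hypothetical event class: the event is a plain object, its type is the enum.
class DemoEvent implements Event<DemoEventType> {
  private final DemoEventType type;

  DemoEvent(DemoEventType type) {
    this.type = type;
  }

  @Override
  public DemoEventType getType() {
    return type;
  }

  @Override
  public long getTimestamp() {
    return -1L; // no real timestamp in this sketch
  }

  @Override
  public String toString() {
    return "EventType: " + type;
  }
}

class EventTypeCounter {
  // Because TYPE is bounded by Enum<TYPE>, event types can be used as EnumMap keys.
  static <T extends Enum<T>> Map<T, Integer> countByType(
      Class<T> typeClass, Iterable<? extends Event<T>> events) {
    Map<T, Integer> counts = new EnumMap<>(typeClass);
    for (Event<T> event : events) {
      counts.merge(event.getType(), 1, Integer::sum);
    }
    return counts;
  }
}
```

The point of the sketch is only that generic code such as countByType can rely on every event type being an enum, without knowing which event family it is dealing with.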
All events share a base implementation, AbstractEvent, which looks like this:
/**
 * Parent class of all the events. All events extend this class.
 */
@Public
@Evolving
public abstract class AbstractEvent<TYPE extends Enum<TYPE>>
    implements Event<TYPE> {
  private final TYPE type;
  private final long timestamp;
  // use this if you DON'T care about the timestamp
  public AbstractEvent(TYPE type) {
    this.type = type;
    // We're not generating a real timestamp here. It's too expensive.
    timestamp = -1L;
  }
  // use this if you care about the timestamp
  public AbstractEvent(TYPE type, long timestamp) {
    this.type = type;
    this.timestamp = timestamp;
  }
  @Override
  public long getTimestamp() {
    return timestamp;
  }
  @Override
  public TYPE getType() {
    return type;
  }
  @Override
  public String toString() {
    return "EventType: " + getType();
  }
}
JobEvent represents job events. Its implementation is as follows:
/**
 * This class encapsulates job related events.
 *
 */
public class JobEvent extends AbstractEvent<JobEventType> {
  private JobId jobID;
  public JobEvent(JobId jobID, JobEventType type) {
    super(type);
    this.jobID = jobID;
  }
  public JobId getJobId() {
    return jobID;
  }
}
TaskEvent represents task events. Its implementation is as follows:
/**
 * this class encapsulates task related events.
 *
 */
public class TaskEvent extends AbstractEvent<TaskEventType> {
  private TaskId taskID;
  public TaskEvent(TaskId taskID, TaskEventType type) {
    super(type);
    this.taskID = taskID;
  }
  public TaskId getTaskID() {
    return taskID;
  }
}
For task events, the event type (the generic parameter TYPE) is TaskEventType, implemented as follows:
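/**
 * Event types handled by Task.
 */
public enum TaskEventType {
  //Producer:Client, Job
  T_KILL,
  //Producer:Job
  T_SCHEDULE,
  T_RECOVER,
  //Producer:Speculator
  T_ADD_SPEC_ATTEMPT,
  //Producer:TaskAttempt
  T_ATTEMPT_LAUNCHED,
  T_ATTEMPT_COMMIT_PENDING,
  T_ATTEMPT_FAILED,
  T_ATTEMPT_SUCCEEDED,
  T_ATTEMPT_KILLED
}
This design decouples the enum from the differences between event kinds (differences in their attributes and methods). The enum only lists the event types, each event class carries its own data, and the shared logic lives once in AbstractEvent. The result is much better readability and maintainability, with code reuse preserved for the common parts.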
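As a rough illustration of how a new event family could follow the same pattern, here is a minimal, self-contained sketch. Event and AbstractEvent are simplified local copies of the Hadoop types shown above (annotations omitted). StageEventType, StageEvent, and StageEventDemo are hypothetical names made up for this example and do not exist in Hadoop.

```java
// Simplified local copies of the Hadoop types shown above (annotations omitted).
interface Event<TYPE extends Enum<TYPE>> {
  TYPE getType();
  long getTimestamp();
}

abstract class AbstractEvent<TYPE extends Enum<TYPE>> implements Event<TYPE> {
  private final TYPE type;
  private final long timestamp;

  AbstractEvent(TYPE type) {
    this(type, -1L); // -1 means "no timestamp recorded"
  }

  AbstractEvent(TYPE type, long timestamp) {
    this.type = type;
    this.timestamp = timestamp;
  }

  @Override
  public TYPE getType() {
    return type;
  }

  @Override
  public long getTimestamp() {
    return timestamp;
  }

  @Override
  public String toString() {
    return "EventType: " + getType();
  }
}

// Hypothetical event family: the enum only enumerates the types.
enum StageEventType { STAGE_START, STAGE_END }

// The event class carries the family-specific attribute (stageId).
class StageEvent extends AbstractEvent<StageEventType> {
  private final int stageId;

  StageEvent(int stageId, StageEventType type, long timestamp) {
    super(type, timestamp);
    this.stageId = stageId;
  }

  int getStageId() {
    return stageId;
  }
}

class StageEventDemo {
  // Since the type is an enum, handlers can switch on it directly.
  static String handle(StageEvent event) {
    switch (event.getType()) {
      case STAGE_START:
        return "stage " + event.getStageId() + " started";
      case STAGE_END:
        return "stage " + event.getStageId() + " ended";
      default:
        throw new IllegalArgumentException("Unknown type: " + event.getType());
    }
  }

  public static void main(String[] args) {
    StageEvent event = new StageEvent(7, StageEventType.STAGE_START, System.currentTimeMillis());
    System.out.println(event);         // prints "EventType: STAGE_START"
    System.out.println(handle(event)); // prints "stage 7 started"
  }
}
```

Adding a new attribute to stage events touches only StageEvent, and adding a new stage event type touches only StageEventType. Neither change affects the job or task event classes.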
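Postscript: my book 《深入理解Spark:核心思想与源码分析》 (Understanding Spark in Depth: Core Ideas and Source Code Analysis) has now been officially published and is on sale at JD, Dangdang, Tmall, and other sites. Interested readers are welcome to buy a copy.

JD (currently offering 50 off orders over 150): http://item.jd.com/11846120.html
Dangdang: http://product.dangdang.com/23838168.html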
Original article: http://blog.csdn.net/beliefer/article/details/51180261