Tags: hive hadoop 超人学院
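Two implementation approaches are available:
a. Extend the org.apache.hadoop.hive.ql.exec.UDF class
Code package: package org.apache.hadoop.hive.ql.udf
Implement an evaluate method. Based on the argument types and the return type, the system automatically resolves each call to the matching evaluate implementation.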
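The type-based dispatch described above can be illustrated in plain Java. This is only a minimal sketch and not Hive's actual resolver: the class LengthLike and the method call are hypothetical names, and the lookup uses exact-type reflection, whereas Hive also applies its own type conversion rules.

```java
import java.lang.reflect.Method;

public class EvaluateDispatchSketch {

    // Hypothetical stand-in for a UDF class with two overloaded evaluate methods.
    public static class LengthLike {
        public Integer evaluate(String s) {
            return s == null ? null : s.length();
        }

        public Integer evaluate(Integer n) {
            return n == null ? null : String.valueOf(n).length();
        }
    }

    // Looks up the public evaluate overload whose single parameter type equals
    // the runtime class of the argument, then invokes it.
    // Assumption: arg is non-null and an exactly matching overload exists.
    public static Object call(Object udf, Object arg) {
        try {
            Method m = udf.getClass().getMethod("evaluate", arg.getClass());
            return m.invoke(udf, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no matching evaluate method", e);
        }
    }

    public static void main(String[] args) {
        LengthLike udf = new LengthLike();
        System.out.println(call(udf, "hive")); // 4, via evaluate(String)
        System.out.println(call(udf, 12345));  // 5, via evaluate(Integer)
    }
}
```

The point of the sketch is that the caller names only the function and supplies arguments; the overload is chosen from the argument types, which is why a UDF class may define several evaluate methods.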
For example, UDFTestLength.java:
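package org.apache.hadoop.hive.ql.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class UDFTestLength extends UDF {
    // Reused across calls so that no new object is allocated per row.
    private final IntWritable result = new IntWritable();

    public IntWritable evaluate(Text s) {
        // A NULL input yields a NULL result.
        if (s == null) {
            return null;
        }
        // countUDF8Characters is a helper that the original does not show.
        result.set(countUDF8Characters(s));
        return result;
    }
}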
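The example above calls countUDF8Characters without showing it. Assuming it is meant to count the characters of a UTF-8 encoded value, one possible shape is sketched below in plain Java, working on a byte array and a valid length instead of a Hadoop Text object. The class and method names here are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Utf8CharCount {

    // Counts characters in UTF-8 encoded bytes by counting every byte that is
    // not a continuation byte. Continuation bytes have the bit pattern 10xxxxxx.
    // Only the first `length` bytes are examined.
    public static int countUtf8Characters(byte[] data, int length) {
        int count = 0;
        for (int i = 0; i < length; i++) {
            if ((data[i] & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] ascii = "hive".getBytes(StandardCharsets.UTF_8);
        // "hive" followed by two CJK characters, each 3 bytes in UTF-8.
        byte[] mixed = "hive\u51fd\u6570".getBytes(StandardCharsets.UTF_8);
        System.out.println(countUtf8Characters(ascii, ascii.length)); // 4
        System.out.println(countUtf8Characters(mixed, mixed.length)); // 6
    }
}
```

The explicit length parameter matters when adapting this to Text: Text.getBytes() returns the backing array, and only the first getLength() bytes of it are valid.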
b. Extend the org.apache.hadoop.hive.ql.udf.generic.GenericUDF class
Code package: package org.apache.hadoop.hive.ql.udf.generic
Implement the initialize, evaluate, and getDisplayString methods.
For example:
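package org.apache.hadoop.hive.ql.udf.generic;

import java.util.HashMap;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

// The rest of the description string is cut off in the original.
@Description(name = "url_to_map", value = "_FUNC_(text,delimiter1, delimiter2) - ")
public class GenericUDFUrlToMap extends GenericUDF {
    // Reused across calls; cleared at the start of every evaluate.
    HashMap<Object, Object> ret = new HashMap<Object, Object>();

    // Validates the arguments and declares the return type: map<string, string>.
    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments)
            throws UDFArgumentException {
        … …
        return ObjectInspectorFactory.getStandardMapObjectInspector(
                PrimitiveObjectInspectorFactory.javaStringObjectInspector,
                PrimitiveObjectInspectorFactory.javaStringObjectInspector);
    }

    // Computes the result for one row.
    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        ret.clear();
        … …
        return ret;
    }

    // Returns the text shown for this call, e.g. in EXPLAIN output.
    @Override
    public String getDisplayString(String[] children) {
        StringBuilder sb = new StringBuilder();
        sb.append("url_to_map(");
        assert (children.length <= 3);
        boolean firstChild = true;
        for (String child : children) {
            if (firstChild) {
                firstChild = false;
            } else {
                sb.append(",");
            }
            sb.append(child);
        }
        sb.append(")");
        return sb.toString();
    }
}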
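The body of evaluate is elided in the example above, so its real logic is unknown. Purely as an illustration of what a function with the signature _FUNC_(text, delimiter1, delimiter2) might do, the plain-Java sketch below assumes that delimiter1 separates entries and delimiter2 separates each key from its value, similar in spirit to Hive's built-in str_to_map. The class and method names are hypothetical, and nothing here is taken from the original implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class UrlToMapSketch {

    // Splits text into entries on delimiter1, then splits each entry into a
    // key and a value on the first occurrence of delimiter2.
    // Assumptions: delimiters are literal strings (not regular expressions),
    // empty entries are skipped, and a key without a value maps to null.
    public static Map<String, String> toMap(String text, String delimiter1, String delimiter2) {
        Map<String, String> ret = new HashMap<>();
        if (text == null || text.isEmpty()) {
            return ret;
        }
        for (String entry : text.split(Pattern.quote(delimiter1))) {
            if (entry.isEmpty()) {
                continue;
            }
            String[] kv = entry.split(Pattern.quote(delimiter2), 2);
            ret.put(kv[0], kv.length == 2 ? kv[1] : null);
        }
        return ret;
    }

    public static void main(String[] args) {
        Map<String, String> m = toMap("a=1&b=2&c", "&", "=");
        System.out.println(m.get("a"));         // 1
        System.out.println(m.get("b"));         // 2
        System.out.println(m.containsKey("c")); // true
    }
}
```

Unlike the GenericUDF example, which clears and reuses a single map per instance, this sketch returns a fresh map on every call to keep it free of shared state.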
Once extension functions are written, they must all be registered in FunctionRegistry; after release they become system functions.
Hive extension function development conventions
Original article: http://blog.csdn.net/crxy2014/article/details/46427825