Tags: mapreduce
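This post summarizes a problem I ran into at work.

    // Map side: pick a bucket count by record type, then hash the record into a bucket.
    // Masking with 0x7FFFFFFF clears the sign bit so the modulo result is never negative.
    if (ttype.equals("other")) {
        file = (result.toString().hashCode() & 0x7FFFFFFF) % 400;
    } else if (ttype.equals("client")) {
        file = (result.toString().hashCode() & 0x7FFFFFFF) % 260;
    } else {
        file = (result.toString().hashCode() & 0x7FFFFFFF) % 60;
    }
    // The composite key carries "<type>_<bucket>" plus the full record.
    tp = new TextPair(ttype + "_" + file, result.toString());
    context.write(tp, valuet);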
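The bucket computation in the mapper can be tried on its own, without Hadoop. The following is a minimal sketch: the class name `BucketSketch` and its helper methods are hypothetical, and only the bucket counts (400, 260, 60) and the `& 0x7FFFFFFF` mask come from the code above.

```java
class BucketSketch {
    // Bucket counts per record type, as used in the mapper.
    static int bucketCount(String ttype) {
        if (ttype.equals("other")) return 400;
        if (ttype.equals("client")) return 260;
        return 60;
    }

    // Clear the sign bit before the modulo so the bucket index is never negative.
    static int bucketFor(String ttype, String record) {
        return (record.hashCode() & 0x7FFFFFFF) % bucketCount(ttype);
    }

    // First half of the composite key: "<type>_<bucket>".
    static String firstKey(String ttype, String record) {
        return ttype + "_" + bucketFor(ttype, record);
    }

    public static void main(String[] args) {
        // "a".hashCode() is 97, and 97 % 260 is 97.
        System.out.println(firstKey("client", "a")); // prints client_97
    }
}
```

Because identical records always hash to the same bucket, duplicates of a record end up under the same key, and each type gets a bounded number of output files.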
    // Reduce side: each distinct key is written exactly once; the values are ignored.
    public void reduce(TextPair key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        rcfileCols = getRcfileCols(key.getSecond().toString().split("\001"));
        context.write(key.getFirst(), rcfileCols);
    }
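The deduplication comes from the reducer writing once per key and never iterating over `values`. The sketch below imitates that with plain collections, under the assumption that `TextPair` compares on both of its fields and that no grouping comparator restricts grouping to the first field (the post does not show either). `DedupSketch` and its method are hypothetical names, and a `TreeSet` stands in for the shuffle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class DedupSketch {
    // Each input element is {firstKey, record}, mirroring TextPair(first, second).
    static List<String> dedup(List<String[]> mapOutput) {
        // The sorted set stands in for the shuffle: identical composite keys
        // collapse into a single group, like a single reduce() call.
        TreeSet<String> groups = new TreeSet<>();
        for (String[] kv : mapOutput) {
            groups.add(kv[0] + "\t" + kv[1]);
        }
        // One output line per group, regardless of how many duplicates arrived.
        return new ArrayList<>(groups);
    }

    public static void main(String[] args) {
        List<String[]> in = new ArrayList<>();
        in.add(new String[] {"client_97", "a"});
        in.add(new String[] {"client_97", "a"}); // duplicate record
        in.add(new String[] {"other_5", "b"});
        System.out.println(dedup(in).size()); // prints 2
    }
}
```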
    // Driver: use the custom multiple-output format.
    job.setOutputFormatClass(WapApacheMutiOutputFormat.class);

    public class WapApacheMutiOutputFormat
            extends RCFileMultipleOutputFormat<Text, BytesRefArrayWritable> {
        Random r = new Random();

        // Route each record to "<type>/<type>_<bucket>" under the job output directory.
        protected String generateFileNameForKeyValue(Text key, BytesRefArrayWritable value,
                Configuration conf) {
            String typedir = key.toString().split("_")[0];
            return typedir + "/" + key.toString();
        }
    }
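The path logic in `generateFileNameForKeyValue` is plain string handling, so it can be checked in isolation. This is a small sketch with a hypothetical class `OutputPathSketch`; only the `split("_")[0]` rule and the `typedir + "/" + key` layout come from the code above.

```java
class OutputPathSketch {
    // Mirrors the output format: the directory is the text before the first "_".
    static String fileNameFor(String key) {
        String typedir = key.split("_")[0];
        return typedir + "/" + key;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("client_97")); // prints client/client_97
    }
}
```

One thing to watch: since the directory is taken from the text before the first underscore, a type name that itself contains `_` would be cut short (for example, `my_type_3` would land in the directory `my`).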
Deduplication and multi-directory output in a single MapReduce job
Original article: http://blog.csdn.net/cuirong1986/article/details/37654951