Tags: big data, different, default, more, get, map, create, sum
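Distributed application development: move the computation to the data
Approach:
	1. What the client does
		Job
	2. What the framework does
		MapTask
		ReduceTask
	3. MR semantics:
		Records with the same key form one group, and reduce is called once per group
		"Same" is guaranteed by sorting: after the sort, equal keys are adjacent
		The comparator implementation defines the sort order, so different comparators give different orderings
	Moving computation to the data (the ideal case)
		Data is read locally, on the node that stores it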
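The MR semantics in point 3 can be mimicked in plain Java without a cluster. What follows is only a minimal sketch, not how Hadoop implements its shuffle: the class `MrSemanticsSketch` and its method `run` are hypothetical names, everything is held in memory, and a `Comparator<String>` stands in for the job's key comparator. It sorts the (word, 1) pairs by key, treats adjacent keys that compare equal as one group, and performs one reduce (here a sum) per group.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical in-memory illustration of map -> sort -> group -> reduce.
class MrSemanticsSketch {

	// Returns one (key, sum) entry per group, in sorted key order.
	static Map<String, Integer> run(List<String> lines, Comparator<String> keyOrder) {
		// "Map" phase: emit a (word, 1) pair for every token.
		List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
		for (String line : lines) {
			StringTokenizer itr = new StringTokenizer(line);
			while (itr.hasMoreTokens()) {
				pairs.add(new AbstractMap.SimpleEntry<String, Integer>(itr.nextToken(), 1));
			}
		}

		// Sort by key so that keys the comparator considers equal become adjacent.
		pairs.sort((a, b) -> keyOrder.compare(a.getKey(), b.getKey()));

		// "Reduce" phase: one reduce call (a sum) per run of equal keys.
		Map<String, Integer> result = new LinkedHashMap<>();
		int i = 0;
		while (i < pairs.size()) {
			String groupKey = pairs.get(i).getKey();
			int sum = 0;
			while (i < pairs.size() && keyOrder.compare(groupKey, pairs.get(i).getKey()) == 0) {
				sum += pairs.get(i).getValue();
				i++;
			}
			result.put(groupKey, sum);
		}
		return result;
	}

	public static void main(String[] args) {
		List<String> lines = Arrays.asList("hello world", "Hello hadoop hello");
		// Exact comparison: "Hello" and "hello" are different keys.
		System.out.println(run(lines, Comparator.<String>naturalOrder()));
		// Case-insensitive comparison: they fall into the same group.
		System.out.println(run(lines, String.CASE_INSENSITIVE_ORDER));
	}
}
```

Swapping the comparator changes both the order of the output and which keys count as "the same", which is the point of the third line under MR semantics.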
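import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyWordCount {
	public static void main(String[] args) throws Exception {
		// Load the default Hadoop configuration files
		Configuration conf = new Configuration(true);
		// Create a new Job
		Job job = Job.getInstance(conf);
		job.setJarByClass(MyWordCount.class);
		// Specify various job-specific parameters
		job.setJobName("MG_wordcount");
		Path input = new Path("/mg/test/test.text");
		Path output = new Path("/mg/output");
		// Delete the output directory if it already exists; the job fails otherwise
		FileSystem fs = output.getFileSystem(conf);
		if (fs.exists(output)) {
			fs.delete(output, true);
		}
		FileInputFormat.addInputPath(job, input);
		FileOutputFormat.setOutputPath(job, output);
		job.setMapperClass(MyMapper.class);
		job.setReducerClass(MyReducer.class);
		// Declare the output key/value types; the defaults (LongWritable/Text) do not match what MyMapper emits
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);
		// Submit the job, then poll for progress until the job is complete
		System.exit(job.waitForCompletion(true) ? 0 : 1);
	}
}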
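import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<Object, Text, Text, IntWritable> {
	// Reused across calls to avoid allocating a new object per token
	private final static IntWritable one = new IntWritable(1);
	private Text word = new Text();

	@Override
	public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
		// Split the line into tokens; each token becomes an output key (whitespace-delimited by default, customizable)
		StringTokenizer itr = new StringTokenizer(value.toString());
		// Loop while more tokens remain
		while (itr.hasMoreTokens()) {
			word.set(itr.nextToken());
			context.write(word, one);
		}
	}
}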
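The mapper's comment notes that the delimiter can be customized. As a small sketch of what that means using `java.util.StringTokenizer` alone (no Hadoop involved): the one-argument constructor splits on space, tab, newline, carriage return, and form feed, while the two-argument constructor takes an explicit set of delimiter characters. The class `TokenizeSketch` and its `tokens` helper are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper showing default versus custom delimiters.
class TokenizeSketch {

	// Collects the tokens of a line; a null delims means "use the default whitespace set".
	static List<String> tokens(String line, String delims) {
		StringTokenizer itr = (delims == null)
				? new StringTokenizer(line)
				: new StringTokenizer(line, delims);
		List<String> out = new ArrayList<>();
		while (itr.hasMoreTokens()) {
			out.add(itr.nextToken());
		}
		return out;
	}

	public static void main(String[] args) {
		// Default delimiters: the comma stays attached to "hello,".
		System.out.println(tokens("hello, world\thadoop", null));
		// Space, comma, and tab as delimiters: the comma is dropped.
		System.out.println(tokens("hello, world\thadoop", " ,\t"));
	}
}
```

With the default delimiters, punctuation stays attached to the word, so "hello," and "hello" would be counted as different keys by the word count above.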
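import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
	// Reused output value holding the total for the current key
	private IntWritable result = new IntWritable();

	@Override
	public void reduce(Text key, Iterable<IntWritable> values, Context context)
			throws IOException, InterruptedException {
		// Called once per key: add up all the counts emitted for that key
		int sum = 0;
		for (IntWritable val : values) {
			sum += val.get();
		}
		result.set(sum);
		context.write(key, result);
	}
}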
Original source: https://www.cnblogs.com/lkoooox/p/11026417.html