首页 > 移动开发 > 详细

MapReduce源码分析：Mapper和Reducer类

时间：2015-08-07 14:50:29 阅读：187 评论：0 收藏：0 [点我收藏+]

标签：源码 hadoop mapreduce

一：Mapper类

在Hadoop的mapper类中，有4个主要的函数，分别是：setup，clearup，map，run。代码如下：

protected void setup(Context context) throws IOException, InterruptedException {
// NOTHING
}
protected void map(KEYIN key, VALUEIN value,
Context context) throws IOException, InterruptedException {
context.write((KEYOUT) key, (VALUEOUT) value);
}
protected void cleanup(Context context) throws IOException, InterruptedException {
// NOTHING
}
public void run(Context context) throws IOException, InterruptedException {
setup(context);
while (context.nextKeyValue()) {
map(context.getCurrentKey(), context.getCurrentValue(), context);
}
cleanup(context);
}
}

由上面的代码，我们可以了解到，当调用到map时，通常会先执行一个setup函数，最后会执行一个cleanup函数。而默认情况下，这两个函数的内容都是nothing。因此，当map方法不符合应用要求时，可以试着通过增加setup和cleanup的内容来满足应用的需求。

二：Reducer类

在Hadoop的reducer类中，有3个主要的函数，分别是：setup，clearup，reduce。代码如下：

/**
* Called once at the start of the task.
*/
protected void setup(Context context
) throws IOException, InterruptedException {
// NOTHING
}

/**
* This method is called once for each key. Most applications will define
* their reduce class by overriding this method. The default implementation
* is an identity function.
*/
@SuppressWarnings("unchecked")
protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context
) throws IOException, InterruptedException {
for(VALUEIN value: values) {
context.write((KEYOUT) key, (VALUEOUT) value);
}
}

/**
* Called once at the end of the task.
*/
protected void cleanup(Context context
) throws IOException, InterruptedException {
// NOTHING
}

在用户的应用程序中调用到reducer时，会直接调用reducer里面的run函数，其代码如下：

/*
* control how the reduce task works.
*/
@SuppressWarnings("unchecked")
public void run(Context context) throws IOException, InterruptedException {
setup(context);
while (context.nextKey()) {
reduce(context.getCurrentKey(), context.getValues(), context);
// If a back up store is used, reset it
((ReduceContext.ValueIterator)
(context.getValues().iterator())).resetBackupStore();
}
cleanup(context);
}
}

由上面的代码，我们可以了解到，当调用到reduce时，通常会先执行一个setup函数，最后会执行一个cleanup函数。而默认情况下，这两个函数的内容都是nothing。因此，当reduce不符合应用要求时，可以试着通过增加setup和cleanup的内容来满足应用的需求。

版权声明：本文为博主原创文章，未经博主允许不得转载。

MapReduce源码分析：Mapper和Reducer类

标签：源码 hadoop mapreduce

原文地址：http://blog.csdn.net/gamer_gyt/article/details/47338053

踩

(0)

赞

(0)

举报

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行

更多

友情链接

兰亭集智国之画百度统计站长统计阿里云 chrome插件新版天听网

关于我们 - 联系我们 - 留言反馈

© 2014 mamicode.com 版权所有联系我们:gaon5@hotmail.com

迷上了代码！