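InputFormat (an abstract class in the code below) declares two methods: getSplits() and createRecordReader(). The first defines how the job's input is divided into input splits; the second defines how each split is read.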
import java.io.IOException;
import java.util.List;

public abstract class InputFormat<K, V> {

  /**
   * Logically split the set of input files for the job.
   *
   * <p>Each {@link InputSplit} is then assigned to an individual {@link Mapper}
   * for processing.</p>
   *
   * <p><i>Note</i>: The split is a <i>logical</i> split of the inputs and the
   * input files are not physically split into chunks. For e.g. a split could
   * be <i>&lt;input-file-path, start, offset&gt;</i> tuple. The InputFormat
   * also creates the {@link RecordReader} to read the {@link InputSplit}.
   *
   * @param context job configuration.
   * @return an array of {@link InputSplit}s for the job.
   */
  public abstract List<InputSplit> getSplits(JobContext context)
      throws IOException, InterruptedException;

  /**
   * Create a record reader for a given split. The framework will call
   * {@link RecordReader#initialize(InputSplit, TaskAttemptContext)} before
   * the split is used.
   *
   * @param split the split to be read
   * @param context the information about the task
   * @return a new record reader
   * @throws IOException
   * @throws InterruptedException
   */
  public abstract RecordReader<K, V> createRecordReader(InputSplit split,
      TaskAttemptContext context) throws IOException, InterruptedException;

}
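The Javadoc above stresses that a split is logical: it only describes a range of the input, and the file itself is not cut into pieces. The sketch below illustrates that idea in plain Java with no Hadoop dependency. It is not Hadoop's implementation; the names LogicalSplitSketch, Split, and readSplit are hypothetical, and the in-memory byte array merely stands in for a file. Here getSplits() plays the role of the method of the same name, and readSplit() is a much simplified stand-in for what a RecordReader does.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of logical input splits (not Hadoop code).
class LogicalSplitSketch {

    // Stand-in for an InputSplit: describes a byte range, holds no data.
    static final class Split {
        final String path;
        final long start;
        final long length;

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return "<" + path + ", " + start + ", " + length + ">";
        }
    }

    // Plays the role of getSplits(): cover [0, fileLength) with ranges of
    // at most splitSize bytes. Only offsets are computed; nothing is copied.
    static List<Split> getSplits(String path, long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            splits.add(new Split(path, start, length));
        }
        return splits;
    }

    // Simplified stand-in for a record reader: returns the bytes the split
    // points at. A real reader would also handle records that cross a
    // split boundary, which this sketch ignores.
    static String readSplit(byte[] data, Split split) {
        return new String(data, (int) split.start, (int) split.length,
                StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = "abcdefghij".getBytes(StandardCharsets.US_ASCII);
        for (Split split : getSplits("in-memory", data.length, 4)) {
            System.out.println(split + " -> " + readSplit(data, split));
        }
    }
}
```

With a 250-byte input and a split size of 100, getSplits() in this sketch yields three ranges: (0, 100), (100, 100), and (200, 50). Each range could then be handed to a separate reader, which mirrors how the framework assigns each InputSplit to one Mapper.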
Original article: http://www.cnblogs.com/gwgyk/p/3997734.html