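The combine function merges the <key, value> pairs produced by one map function (several values under one key) into a new <key2, value2> pair. The new <key2, value2> pairs are then passed as input to the reduce function. The combine function has the same format as the reduce function.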
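As a minimal sketch of this idea outside Hadoop, the plain-Java snippet below merges the pairs emitted by one map task so that each key is left with a single value. The class name `LocalCombineSketch` and the method `combine` are hypothetical names chosen for illustration and are not part of the Hadoop API; summation is assumed as the merge operation.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a local combine step; this is not Hadoop code.
public class LocalCombineSketch {

    // Merge all values that share a key into one sum per key,
    // keeping the keys in the order in which they first appear.
    public static Map<String, Long> combine(List<Map.Entry<String, Long>> mapOutput) {
        Map<String, Long> merged = new LinkedHashMap<String, Long>();
        for (Map.Entry<String, Long> pair : mapOutput) {
            merged.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Output of a single (imaginary) map task: key "a" appears twice.
        List<Map.Entry<String, Long>> mapOutput = new ArrayList<Map.Entry<String, Long>>();
        mapOutput.add(new SimpleEntry<String, Long>("a", 1L));
        mapOutput.add(new SimpleEntry<String, Long>("b", 10L));
        mapOutput.add(new SimpleEntry<String, Long>("a", 2L));

        // Three pairs go in, two pairs come out.
        System.out.println(combine(mapOutput)); // prints {a=3, b=10}
    }
}
```

The reduce step would then receive one pair per key from each map task instead of every individual pair, which is where the saving in transferred data comes from.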
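For example: add up the numbers in three files. Each number sits on its own line, because the mapper parses every input line as one integer.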
file1: 1 2 3
file2: 4 5 6
file3: 7 8 9
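import java.io.IOException;
import java.math.BigInteger;
import java.util.Iterator;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class MyMapre06 {
    // Mapper: emits every input line under the single key "1",
    // so that all numbers end up in the same group.
    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, Text, Text> {
        private Text word = new Text();
        private Text val = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            // Trim so that stray whitespace does not break BigInteger parsing later.
            String bignum = line.trim();
            word.set("1");
            val.set(bignum);
            output.collect(word, val);
        }
    }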
    // Reducer: adds up all values that arrive under the same key.
    public static class Reduce extends MapReduceBase implements
            Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterator<Text> values,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            BigInteger num = BigInteger.valueOf(0);
            Text v = new Text();
            while (values.hasNext()) { // sum all values that share this key
                String tmp = values.next().toString();
                num = num.add(new BigInteger(tmp));
            }
            v.set(num.toString());
            output.collect(key, v); // collect the reduce output
        }
    }
    // Combiner: same logic as Reduce, applied to the output of a single map task.
    public static class Combiner extends MapReduceBase implements
            Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterator<Text> values,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            BigInteger num = BigInteger.valueOf(0);
            Text v = new Text();
            while (values.hasNext()) { // sum all values that share this key
                String tmp = values.next().toString();
                num = num.add(new BigInteger(tmp));
            }
            v.set(num.toString());
            output.collect(key, v); // collect the combiner output
        }
    }
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MyMapre06.class);
        conf.setJobName("Sum");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Combiner.class); // use the combiner function
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));  // input directory
        FileOutputFormat.setOutputPath(conf, new Path(args[1])); // output directory
        JobClient.runJob(conf);
    }
}
After the Combiner function, file1 gives 6, file2 gives 15, and file3 gives 24.
After the Reduce function, the output is key 1 with value 45.
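The data flow of this example can be traced with a small plain-Java sketch that does not depend on Hadoop. It assumes that each file is handled by its own map task, so the combine step runs once per file. `CombinerTrace`, `sum`, and `combinePerFile` are hypothetical names used only for illustration.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical trace of the three-file example; this is not Hadoop code.
public class CombinerTrace {

    // Same logic as the Combiner and Reduce classes: add up all values of one key.
    public static BigInteger sum(List<String> values) {
        BigInteger total = BigInteger.ZERO;
        for (String v : values) {
            total = total.add(new BigInteger(v));
        }
        return total;
    }

    // Assumption: one map task per file, so the combine step runs once per file
    // and produces one partial sum for each of them.
    public static List<String> combinePerFile(List<List<String>> files) {
        List<String> partials = new ArrayList<String>();
        for (List<String> file : files) {
            partials.add(sum(file).toString());
        }
        return partials;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("1", "2", "3"),   // file1
                Arrays.asList("4", "5", "6"),   // file2
                Arrays.asList("7", "8", "9"));  // file3

        List<String> partials = combinePerFile(files);
        System.out.println("combiner output: " + partials);       // [6, 15, 24]
        System.out.println("reduce output: 1\t" + sum(partials)); // 1	45
    }
}
```

In this trace the reduce step receives three partial sums instead of nine separate numbers, and 6 + 15 + 24 = 45. Hadoop treats the combiner as an optimization and may run it zero or more times; that is harmless here because addition gives the same total whichever way the values are grouped.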