MapReduce Multiple Inputs

Posted by lcz393537415 on 2020-11-03

https://www.cnblogs.com/skyl/p/4753703.html


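import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Input paths in01 and in02
String in01 = "hdfs://RS5-112:9000/cs01/path01";
String in02 = "hdfs://RS5-112:9000/cs02/path02";

// Option 1: call addInputPath() once per input path.
// All paths then share the job's single InputFormat and Mapper.
FileInputFormat.addInputPath(job, new Path(in01));
FileInputFormat.addInputPath(job, new Path(in02));

// Option 2: use MultipleInputs so each path gets its own InputFormat and Mapper.
MultipleInputs.addInputPath(job,
        new Path(in01),
        TextInputFormat.class,
        Mapper01.class);
MultipleInputs.addInputPath(job,
        new Path(in02),
        KeyValueTextInputFormat.class,
        Mapper02.class);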
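
The practical reason to choose MultipleInputs is that each path can be parsed differently before it reaches its mapper. The sketch below is plain Java with no Hadoop classes; `InputFormatSketch`, `textRecords` and `keyValueRecords` are hypothetical names used only for illustration. It mimics, under simplifying assumptions (`\n` line endings, the default tab separator), the record shapes the two formats produce: Hadoop's TextInputFormat passes the byte offset of each line as the key and the line as the value, while KeyValueTextInputFormat splits each line at the first tab.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration only; nothing here is a Hadoop class. */
final class InputFormatSketch {

    /** Mimics TextInputFormat: key = byte offset of the line start, value = the line. */
    static List<String[]> textRecords(String contents) {
        List<String[]> records = new ArrayList<>();
        if (contents.isEmpty()) {
            return records;
        }
        long offset = 0;
        for (String line : contents.split("\n")) {
            records.add(new String[] {Long.toString(offset), line});
            // Advance past the line's bytes plus the '\n' terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    /** Mimics KeyValueTextInputFormat's default: split each line at the first tab. */
    static List<String[]> keyValueRecords(String contents) {
        List<String[]> records = new ArrayList<>();
        if (contents.isEmpty()) {
            return records;
        }
        for (String line : contents.split("\n")) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                // No separator: the whole line is the key, the value is empty.
                records.add(new String[] {line, ""});
            } else {
                records.add(new String[] {line.substring(0, tab), line.substring(tab + 1)});
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String contents = "k1\tv1\nk2\tv2\n";
        for (String[] r : textRecords(contents)) {
            System.out.println("text     key=" + r[0] + " value=" + r[1]);
        }
        for (String[] r : keyValueRecords(contents)) {
            System.out.println("keyvalue key=" + r[0] + " value=" + r[1]);
        }
    }
}
```

For a line such as `k1<TAB>v1` at the start of a file, the first method yields key `0` with the whole line as the value, and the second yields key `k1` with value `v1`. In a real job this difference means Mapper01 and Mapper02 would declare different input key types, which is why each path is registered with its own mapper class.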


Source: "ITPUB Blog", link: http://blog.itpub.net/31347383/viewspace-2731910/. Please credit the source when reposting; otherwise legal liability will be pursued.
