On to the code:
Set up the test environment:
Create a seq (SequenceFile) file:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.GzipCodec;
import org.junit.Test;

/**
 * Write operation: create a block-compressed SequenceFile with the gzip codec,
 * inserting sync points so the file can later be split.
 */
@Test
public void zipGzip() throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "file:///");
    FileSystem fs = FileSystem.get(conf);
    Path p = new Path("d:/seq/1.seq");
    SequenceFile.Writer writer = SequenceFile.createWriter(fs,
            conf,
            p,
            IntWritable.class,
            Text.class,
            SequenceFile.CompressionType.BLOCK,
            new GzipCodec());
    for (int i = 0; i < 10; i++) {
        writer.append(new IntWritable(i), new Text("tom" + i));
        // add a sync point after every record
        writer.sync();
    }
    for (int i = 0; i < 10; i++) {
        writer.append(new IntWritable(i), new Text("tom" + i));
        // add a sync point after every other record
        if (i % 2 == 0) {
            writer.sync();
        }
    }
    writer.close();
}
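The written file can be read back to check its contents. The original text does not include a reader; the following is a minimal sketch using the same deprecated constructor generation as the writer above, assuming the same d:/seq/1.seq path:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SeqRead {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "file:///");
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("d:/seq/1.seq");

        SequenceFile.Reader reader = new SequenceFile.Reader(fs, p, conf);
        IntWritable key = new IntWritable();
        Text val = new Text();
        // next() transparently decompresses each block written by the gzip codec
        while (reader.next(key, val)) {
            System.out.println(key.get() + " -> " + val);
        }
        // reader.sync(position) would instead advance to the first sync marker
        // at or after `position` -- the sync points added above are what make
        // the file splittable for MapReduce.
        reader.close();
    }
}
```

Requires the Hadoop client jars on the classpath, just like the writer.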
Write text files:
Create 1.txt and 2.txt under the txt directory.
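The two text files can also be created programmatically; a minimal sketch (file contents are assumptions, only the names come from the text):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MakeTxt {
    public static void main(String[] args) throws IOException {
        // Relative "txt" directory; the tutorial's actual location is on d:/
        Path dir = Paths.get("txt");
        Files.createDirectories(dir);
        // Sample contents -- any text works for the word-count style job being run
        Files.write(dir.resolve("1.txt"), "hello world\n".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("2.txt"), "hello hadoop\n".getBytes(StandardCharsets.UTF_8));
    }
}
```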
Run (note in the log below that the job reports 3 input splits: one from the seq file plus the two text files):
19/01/16 10:25:52 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
19/01/16 10:25:52 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
19/01/16 10:25:54 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
19/01/16 10:25:54 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
19/01/16 10:25:54 INFO input.FileInputFormat: Total input paths to process : 1
19/01/16 10:25:54 INFO input.FileInputFormat: Total input paths to process : 2
19/01/16 10:25:54 INFO mapreduce.JobSubmitter: number of splits:3
19/01/16 10:25:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1578362493_0001
19/01/16 10:25:55 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
19/01/16 10:25:55 INFO mapreduce.Job: Running job: job_local1578362493_0001
19/01/16 10:25:55 INFO mapred.LocalJobRunner: OutputCommitter set in config null
19/01/16 10:25:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/01/16 10:25:55 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
19/01/16 10:25:55 INFO mapred.LocalJobRunner: Waiting for map tasks
19/01/16 10:25:55 INFO mapred.LocalJobRunner: Starting task: attempt_local1578362493_0001_m_000000_0
19/01/16 10:25:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/01/16 10:25:55 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
19/01/16 10:25:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@68701d3d
19/01/16 10:25:55 INFO mapred.MapTask: Processing split: file:/d:/mr/seq/1.seq:0+928
19/01/16 10:25:55 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
19/01/16 10:25:55 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
19/01/16 10:25:55 INFO mapred.MapTask: soft limit at 83886080
19/01/16 10:25:55 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
19/01/16 10:25:55 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
19/01/16 10:25:55 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
19/01/16 10:25:55 WARN zlib.ZlibFactory: Failed to load/initialize native-zlib library
19/01/16 10:25:55 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
19/01/16 10:25:55 INFO mapred.LocalJobRunner:
19/01/16 10:25:55 INFO mapred.MapTask: Starting flush of map output
19/01/16 10:25:55 INFO mapred.MapTask: Spilling map output
19/01/16 10:25:55 INFO mapred.MapTask: bufstart = 0; bufend = 180; bufvoid = 104857600
19/01/16 10:25:55 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214320(104857280); length = 77/6553600
19/01/16 10:25:55 INFO mapred.MapTask: Finished spill 0
19/01/16 10:25:55 INFO mapred.Task: Task:a
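The first WARN in the log suggests implementing the Tool interface and launching through ToolRunner, which also enables command-line option parsing (e.g. -D properties). A minimal sketch of that pattern (class name hypothetical, job setup omitted):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJob extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // Job setup would go here; getConf() already reflects any
        // -D options that ToolRunner parsed off the command line.
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyJob(), args));
    }
}
```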