Remote Debugging Hadoop from Eclipse, Continued from the Previous Post

Article link: https://blog.youkuaiyun.com/wujindou/article/details/18034039

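This post walks through getting the WordCount program running against a Hadoop cluster, including fixes for permission and JobTracker startup problems, and gives a concrete Java implementation of WordCount.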


After the problems described last time, I ran into several more. I am noting them down briefly here for future reference:

(1) Permission problem: turn off HDFS permission checking (in hdfs-site.xml)

<property>
    <name>dfs.permissions</name>
    <value>false</value>
    <description>
        If "true", enable permission checking in HDFS.
        If "false", permission checking is turned off,
        but all other behavior is unchanged.
        Switching from one parameter value to the other does not change the mode,
        o</description>
</property>

(2) Job Tracker is not yet Running

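The log shows the actual error. It can usually be fixed by changing hadoop.tmp.dir to a new directory and then reformatting the NameNode (hadoop namenode -format). Keep in mind that reformatting erases the existing HDFS metadata.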

(3) Configuration problem:

Keep the settings consistent: the hadoop.tmp.dir value used on the Eclipse side must match the one in core-site.xml on the cluster.
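One way to check this consistency is to read hadoop.tmp.dir straight out of a copy of core-site.xml and compare it with the value used on the client. The following is only a minimal sketch using the JDK's XML parser. The class name TmpDirCheck, the helper readProperty, and the sample path /home/hadoop/tmp are hypothetical names made up for illustration, not part of Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class TmpDirCheck {

  // Returns the trimmed <value> of the named <property> in Hadoop-style
  // configuration XML, or null if the property is not present.
  public static String readProperty(String xml, String name) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc = builder.parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      NodeList props = doc.getElementsByTagName("property");
      for (int i = 0; i < props.getLength(); i++) {
        Element prop = (Element) props.item(i);
        NodeList names = prop.getElementsByTagName("name");
        NodeList values = prop.getElementsByTagName("value");
        if (names.getLength() > 0 && values.getLength() > 0
            && name.equals(names.item(0).getTextContent().trim())) {
          return values.item(0).getTextContent().trim();
        }
      }
      return null;
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse configuration XML", e);
    }
  }

  public static void main(String[] args) {
    // Hypothetical contents of core-site.xml, inlined for illustration.
    String coreSite = "<configuration><property><name>hadoop.tmp.dir</name>"
        + "<value>/home/hadoop/tmp</value></property></configuration>";
    // Hypothetical value configured on the Eclipse side.
    String clientValue = "/home/hadoop/tmp";

    String serverValue = readProperty(coreSite, "hadoop.tmp.dir");
    System.out.println(clientValue.equals(serverValue)
        ? "hadoop.tmp.dir matches"
        : "hadoop.tmp.dir mismatch, core-site.xml has: " + serverValue);
  }
}
```

In practice the XML string would be replaced by the contents of the cluster's real core-site.xml.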

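(4) For debugging, I pointed the project at the Hadoop jar files: add the jars in the hadoop-1.2.1 directory and the jars under its lib directory to the Eclipse build path.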
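A quick way to confirm that the jars actually made it onto the build path is to try loading a few of the Hadoop classes that WordCount imports. This is just a sketch. ClasspathCheck and isOnClasspath are hypothetical names, and the list of classes is taken from the import statements of the WordCount program in this post.

```java
final class ClasspathCheck {

  // Returns true if the named class can be located on the current classpath.
  // The class is looked up without being initialized.
  public static boolean isOnClasspath(String className) {
    try {
      Class.forName(className, false, ClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Classes imported by WordCount.java.
    String[] required = {
      "org.apache.hadoop.conf.Configuration",
      "org.apache.hadoop.mapreduce.Job",
      "org.apache.hadoop.util.GenericOptionsParser"
    };
    for (String name : required) {
      System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
    }
  }
}
```

If any entry prints MISSING when run from the Eclipse project, the corresponding jar is probably not on the build path yet.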

(5) The final WordCount.java:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper 
       extends Mapper<Object, Text, Text, IntWritable>{
    
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }   
    }   
  }
  
  public static class IntSumReducer 
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapred.job.tracker", "172.16.89.85:9001"); // remote JobTracker address; the failure without it looked like a permission problem
    conf.set("mapred.jar", "D:\\software\\hadoop-1.2.1\\wordcount.jar"); // the project exported as a jar
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

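The two lines I added (shown in red in the original post) are the conf.set(...) calls for mapred.job.tracker and mapred.jar.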
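To check what the job should produce without a cluster, the same counting logic can be run locally in plain Java. The sketch below is not Hadoop code. LocalWordCount and countWords are hypothetical names, and the sample lines are made up. It tokenizes with StringTokenizer exactly as TokenizerMapper does and sums per word as IntSumReducer does, using Map.merge (Java 8 or later).

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

final class LocalWordCount {

  // Counts whitespace-separated tokens across all lines.
  // A TreeMap keeps the words in sorted order.
  public static Map<String, Integer> countWords(Iterable<String> lines) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String line : lines) {
      StringTokenizer itr = new StringTokenizer(line); // same tokenizer as the mapper
      while (itr.hasMoreTokens()) {
        counts.merge(itr.nextToken(), 1, Integer::sum); // same summing as the reducer
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    // Made-up sample input for illustration.
    Map<String, Integer> counts =
        countWords(Arrays.asList("hello hadoop", "hello eclipse"));
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      System.out.println(e.getKey() + "\t" + e.getValue());
    }
  }
}
```

Comparing this local result with the job's output file for a small input is a simple way to tell a logic error from a cluster configuration problem.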

(6) Program arguments (input and output paths) given at run time: hdfs://172.16.89.85:9000/input hdfs://172.16.89.85:9000/outputbu
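Since these two arguments are typed by hand into the Eclipse run configuration, a small sanity check before submitting can catch typos. The sketch below uses only java.net.URI. HdfsArgsCheck and check are hypothetical names, and the rules it applies (both paths use the hdfs scheme, both point at the same host and port, and the paths differ) are assumptions of this sketch rather than Hadoop requirements. Separately, FileOutputFormat refuses to run if the output directory already exists, which this local check cannot detect.

```java
import java.net.URI;
import java.util.Objects;

final class HdfsArgsCheck {

  // Returns null when the two paths look consistent,
  // otherwise a short description of the first problem found.
  public static String check(String input, String output) {
    URI in = URI.create(input);
    URI out = URI.create(output);
    if (!"hdfs".equals(in.getScheme()) || !"hdfs".equals(out.getScheme())) {
      return "both paths should start with hdfs://";
    }
    if (!Objects.equals(in.getHost(), out.getHost()) || in.getPort() != out.getPort()) {
      return "input and output point at different host:port pairs";
    }
    if (Objects.equals(in.getPath(), out.getPath())) {
      return "input and output paths are identical";
    }
    return null;
  }

  public static void main(String[] args) {
    // The arguments used in this post.
    String problem = check("hdfs://172.16.89.85:9000/input",
                           "hdfs://172.16.89.85:9000/outputbu");
    System.out.println(problem == null ? "arguments look consistent" : problem);
  }
}
```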

(7) Finally, a screenshot of the result, to encourage myself and everyone who has not gotten it working yet...