Debug MapReduce (of Hadoop 2.2 or higher) in Eclipse

I am able to debug MapReduce (of Hadoop 1.2.1) in Eclipse by following the steps in http://www.thecloudavenue.com/2012/10/debugging-hadoop-mapreduce-program-in.html. But how do I debug MapReduce (of Hadoop 2.2 or higher) in Eclipse?

Tags: eclipse debugging hadoop mapreduce

2 Answers

干净又极端

#2 · 2019-07-17 01:37

You can debug it the same way. Just run your MapReduce code in standalone (local) mode and use Eclipse to debug it like any other Java code.


老娘就宠你

#3 · 2019-07-17 01:47

Here are the steps I used to set this up in Eclipse. Environment: Ubuntu 16.04.2, Eclipse Neon.3 Release (4.6.3RC2), jdk1.8.0_121. I did a fresh hadoop-2.7.3 installation under /j01/srv/hadoop, which is my $HADOOP_HOME. Replace $HADOOP_HOME with your actual path wherever it is referenced below. To run Hadoop from Eclipse you do not need any Hadoop configuration; all that is really needed is to pull the right set of Hadoop jars into Eclipse.

Step 1 Create new Java Project
File > New > Project...
Select Java Project, Next

Enter Project name: hadoopmr

Click Configure default...

Source folder name: src/main/java
Output folder name: target/classes
Click Apply, OK, then Next
Click the Libraries tab

Click Add External JARs...
Browse to the Hadoop installation folder and add the following jars; when done, click Finish

$HADOOP_HOME/share/hadoop/common/hadoop-common-2.7.3.jar
$HADOOP_HOME/share/hadoop/common/hadoop-nfs-2.7.3.jar

$HADOOP_HOME/share/hadoop/common/lib/avro-1.7.4.jar
$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar
$HADOOP_HOME/share/hadoop/common/lib/commons-collections-3.2.2.jar
$HADOOP_HOME/share/hadoop/common/lib/commons-configuration-1.6.jar
$HADOOP_HOME/share/hadoop/common/lib/commons-io-2.4.jar
$HADOOP_HOME/share/hadoop/common/lib/commons-lang-2.6.jar
$HADOOP_HOME/share/hadoop/common/lib/commons-logging-1.1.3.jar
$HADOOP_HOME/share/hadoop/common/lib/hadoop-auth-2.7.3.jar
$HADOOP_HOME/share/hadoop/common/lib/httpclient-4.2.5.jar
$HADOOP_HOME/share/hadoop/common/lib/httpcore-4.2.5.jar
$HADOOP_HOME/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar
$HADOOP_HOME/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar
$HADOOP_HOME/share/hadoop/common/lib/log4j-1.2.17.jar
$HADOOP_HOME/share/hadoop/common/lib/slf4j-api-1.7.10.jar
$HADOOP_HOME/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar

$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.3.jar
$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.3.jar
$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3.jar
$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.3.jar
$HADOOP_HOME/share/hadoop/mapreduce/lib-examples/hsqldb-2.0.0.jar

$HADOOP_HOME/share/hadoop/tools/lib/guava-11.0.2.jar
$HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-api-2.7.3.jar
$HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-common-2.7.3.jar
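
A missing jar usually only shows up later as a ClassNotFoundException or NoClassDefFoundError, so it can help to check the paths before adding them. Below is a minimal sketch in plain Java, assuming HADOOP_HOME is set in your environment. The class name HadoopJarCheck and the short jar list in main are only illustrative; extend the list with the full set of jars above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: reports which expected jars are absent under a Hadoop home.
class HadoopJarCheck {

  // Returns the relative paths that do not exist as regular files under hadoopHome.
  static List<String> missing(Path hadoopHome, List<String> relativeJars) {
    List<String> result = new ArrayList<>();
    for (String jar : relativeJars) {
      if (!Files.isRegularFile(hadoopHome.resolve(jar))) {
        result.add(jar);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String home = System.getenv("HADOOP_HOME");
    if (home == null) {
      System.err.println("HADOOP_HOME is not set");
      System.exit(1);
    }
    // Sample entries taken from the list above; add the remaining jars as needed.
    List<String> jars = Arrays.asList(
        "share/hadoop/common/hadoop-common-2.7.3.jar",
        "share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.3.jar",
        "share/hadoop/yarn/hadoop-yarn-api-2.7.3.jar");
    List<String> notFound = missing(Paths.get(home), jars);
    if (notFound.isEmpty()) {
      System.out.println("All listed jars found under " + home);
    } else {
      notFound.forEach(jar -> System.out.println("Missing: " + jar));
    }
  }
}
```

If you installed a different Hadoop version, the version suffixes in the jar names will differ, so adjust the list accordingly.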

Step 2 Create a MapReduce example
Create a new package: org.apache.hadoop.examples
Create WordCount.java under package org.apache.hadoop.examples with the following contents:

/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper 
       extends Mapper<Object, Text, Text, IntWritable>{

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, 
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length < 2) {
      System.err.println("Usage: wordcount <in> [<in>...] <out>");
      System.exit(2);
    }
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    for (int i = 0; i < otherArgs.length - 1; ++i) {
      FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
    }
    FileOutputFormat.setOutputPath(job,
      new Path(otherArgs[otherArgs.length - 1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Create input.txt under /home/hadoop/input/ (or your path) with the following contents:

What do you mean by Object
What is Java Virtual Machine
How to create Java Object
How Java enabled High Performance

Step 3 Set up Debug Configuration
In Eclipse, open WordCount.java and set breakpoints wherever you like.
Right-click WordCount.java, Debug As > Debug Configurations...
Select Java Application, then click the New launch configuration icon at the top left

Enter org.apache.hadoop.examples.WordCount in the Main class box
Click the Arguments tab

Enter

/home/hadoop/input/input.txt /home/hadoop/output

into Program arguments
Click Apply, then Debug
The program starts along with Hadoop and should hit the breakpoints you set.
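
When you debug repeatedly, keep in mind that the job fails at submission if the output directory (here /home/hadoop/output) already exists, because FileOutputFormat refuses to write into an existing directory. You can delete the folder by hand between runs, or use a small helper like the sketch below. It uses only java.nio; the class name CleanOutputDir is made up for this example, and it deletes everything under the path you pass, so use it with care.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper: removes a local job output directory before a re-run.
class CleanOutputDir {

  // Deletes dir and everything below it. Returns false if dir did not exist,
  // true if it existed and is gone afterwards.
  static boolean deleteRecursively(Path dir) {
    if (!Files.exists(dir)) {
      return false;
    }
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order visits children before their parent directories.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return !Files.exists(dir);
  }

  public static void main(String[] args) {
    if (args.length != 1) {
      System.err.println("Usage: CleanOutputDir <output-dir>");
      System.exit(2);
    }
    Path dir = Paths.get(args[0]);
    System.out.println(deleteRecursively(dir)
        ? "Deleted " + dir
        : "Nothing deleted for " + dir);
  }
}
```

Running it with /home/hadoop/output as the only argument before each debug session keeps the launch configuration from step 3 unchanged.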

Check the results:

ls -l /home/hadoop/output
-rw-r--r-- 1 hadoop hadoop 131 Apr  5 22:59 part-r-00000
-rw-r--r-- 1 hadoop hadoop   0 Apr  5 22:59 _SUCCESS
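
To judge whether part-r-00000 looks right, it helps to know roughly what to expect. The sketch below recomputes the counts for the sample input.txt in plain Java, without any Hadoop classes. It mirrors the tokenizing in TokenizerMapper and the summing in IntSumReducer; the class name ExpectedWordCount is my own, and the sorted word<TAB>count layout is an assumption based on the job using the default text output.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java reference for the WordCount job; no Hadoop classes involved.
class ExpectedWordCount {

  // Splits each line on whitespace (as TokenizerMapper does) and sums the
  // occurrences per word (as IntSumReducer does). TreeMap keeps keys sorted.
  static Map<String, Integer> count(List<String> lines) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String line : lines) {
      StringTokenizer itr = new StringTokenizer(line);
      while (itr.hasMoreTokens()) {
        counts.merge(itr.nextToken(), 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    // The four lines of the sample input.txt from step 2.
    List<String> input = Arrays.asList(
        "What do you mean by Object",
        "What is Java Virtual Machine",
        "How to create Java Object",
        "How Java enabled High Performance");
    // Print one word<TAB>count pair per line.
    count(input).forEach((word, n) -> System.out.println(word + "\t" + n));
  }
}
```

For this input the sketch counts Java three times, and What, How and Object twice each. If the job output differs, the breakpoints in map and reduce are a good place to start looking.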

Notes:

1) If the program does not run, make sure Project > Build Automatically is checked.
Use Project > Clean... to force a build

2) You can get more examples from

jar xvf $HADOOP_HOME/share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.3-sources.jar

Copy them into this project to continue exploring

3) You can download this Eclipse project from

git clone https://github.com/drachenrio/hadoopmr

In Eclipse, File > Import... > Existing Projects into Workspace > Next
Browse to the cloned project and import it
Open .classpath and replace /j01/srv/hadoop-2.7.3 with your Hadoop installation home
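
If you prefer not to edit .classpath by hand, the replacement can be scripted. This is only a sketch: the class name ClasspathRewriter is hypothetical, and it assumes the file references the jars with the literal prefix /j01/srv/hadoop-2.7.3 as described above. Close the project in Eclipse (or refresh it afterwards) so the change is picked up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: swaps the Hadoop home prefix inside an Eclipse .classpath file.
class ClasspathRewriter {

  // Replaces every literal occurrence of oldHome with newHome.
  static String rewrite(String classpathXml, String oldHome, String newHome) {
    return classpathXml.replace(oldHome, newHome);
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("Usage: ClasspathRewriter <path-to-.classpath> <your-hadoop-home>");
      System.exit(2);
    }
    Path file = Paths.get(args[0]);
    String xml = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    // The old prefix is the one the cloned project is said to use.
    String updated = rewrite(xml, "/j01/srv/hadoop-2.7.3", args[1]);
    Files.write(file, updated.getBytes(StandardCharsets.UTF_8));
    System.out.println("Updated " + file);
  }
}
```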
