i am trying to run a wordcount job in hadoop.but always getting a class not found exception.I am posting the class that i wrote and the command i using to run the job
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
public class WordCount {
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
String line = value.toString();
StringTokenizer tokenizer = new StringTokenizer(line);
while (tokenizer.hasMoreTokens()) {
word.set(tokenizer.nextToken());
context.write(word, one);
}
}
}
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
public void reduce(Text key, Iterable<IntWritable> values, Context context)
throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
context.write(key, new IntWritable(sum));
}
}
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
Job job = new Job(conf, "WordCount");
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.waitForCompletion(true);
job.setJarByClass(WordCount.class);
}
}
the wordcount.jar is exported to my downloads folder And this is the command i use to run the job
jeet@jeet-Vostro-2520:~/Downloads$ hadoop jar wordcount.jar org.gamma.WordCount /user/jeet/getty/gettysburg.txt /user/jeet/getty/out
in this case my mapreduce job is started but it is ending in the middle of the process.Printing the exception tree.
14/01/27 13:16:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/27 13:16:02 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/01/27 13:16:02 INFO input.FileInputFormat: Total input paths to process : 1
14/01/27 13:16:02 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/01/27 13:16:02 WARN snappy.LoadSnappy: Snappy native library not loaded
14/01/27 13:16:03 INFO mapred.JobClient: Running job: job_201401271247_0001
14/01/27 13:16:04 INFO mapred.JobClient: map 0% reduce 0%
14/01/27 13:16:11 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847)
... 8 more
14/01/27 13:16:16 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847)
... 8 more
14/01/27 13:16:20 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847)
... 8 more
14/01/27 13:16:26 INFO mapred.JobClient: Job complete: job_201401271247_0001
14/01/27 13:16:26 INFO mapred.JobClient: Counters: 7
14/01/27 13:16:26 INFO mapred.JobClient: Job Counters
14/01/27 13:16:26 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=20953
14/01/27 13:16:26 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/01/27 13:16:26 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/01/27 13:16:26 INFO mapred.JobClient: Launched map tasks=4
14/01/27 13:16:26 INFO mapred.JobClient: Data-local map tasks=4
14/01/27 13:16:26 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
14/01/27 13:16:26 INFO mapred.JobClient: Failed map tasks=1
somebody please please help i think i am very close of it
I got this to work using
JobConf#setJar(String)
I suspect this :
14/01/27 13:16:02 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
I got the same error when using CDH4.6 and it got solved after resolving the above warning.
I also got the same issue and fixed it by removing same WordCount.class file in the same directory from where I am executing my jar. Looks like it is taking the class out side the jar. Try
try
job.setJar("wordcount.jar");
, where wordcount.jar is the jar file that you are going to package to. This method works for me, but NOTsetJarByClass
!You have to add this method
before invoking the method
As like following:
job.setJarByClass(WordCount.class); job.waitForCompletion(true);