I have a jar file at /home/ubuntu/libs/javacv-0.9.jar on all my Hadoop nodes, along with some other jar files.
When my MapReduce application runs on the Hadoop nodes, I get this exception:
java.io.FileNotFoundException: File does not exist hdfs://192.168.0.18:50000/home/ubuntu/libs/javacv-0.9.jar
How can I resolve this exception? How can my MapReduce job access third-party libraries from the local file system of the Hadoop nodes?
The path has no scheme, so Hadoop resolves it against the default filesystem (HDFS). You need to copy the file to HDFS rather than leave it only on the local filesystem.
To copy files to HDFS, use:
hadoop fs -put localfile hdfsPath
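For example, to copy the jar from the question into an HDFS directory (the /libs target path is just an example):
hadoop fs -mkdir /libs
hadoop fs -put /home/ubuntu/libs/javacv-0.9.jar /libs/javacv-0.9.jar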
Another option is to change the file path to a local-filesystem URI:
file:///home/ubuntu/libs/javacv-0.9.jar
To add jar files to the classpath, take a look at DistributedCache:
DistributedCache.addFileToClassPath(new Path("file:///home/ubuntu/libs/javacv-0.9.jar"), job.getConfiguration());
You may need to iterate over all the jar files in that directory; see the sketch below.
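A minimal sketch of that loop, assuming the newer org.apache.hadoop.mapreduce API and that /home/ubuntu/libs exists at the same path on every node (the helper class name is made up for illustration):

import java.io.File;
import java.io.IOException;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class LocalLibClasspath {
    // Adds every .jar under libDir (a local directory that must exist on
    // every node) to the classpath of the map and reduce tasks.
    public static void addLocalJars(Job job, String libDir) throws IOException {
        File[] files = new File(libDir).listFiles();
        if (files == null) {
            throw new IOException("Not a readable directory: " + libDir);
        }
        for (File jar : files) {
            if (jar.getName().endsWith(".jar")) {
                DistributedCache.addFileToClassPath(
                        new Path("file://" + jar.getAbsolutePath()),
                        job.getConfiguration());
            }
        }
    }
}

Call it from your driver before submitting the job, e.g. LocalLibClasspath.addLocalJars(job, "/home/ubuntu/libs");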
Another option is to keep the jars in HDFS and use the distributed cache's addFileToClassPath(new Path("/myapp/mylib.jar"), job.getConfiguration()); to submit the jar files that should be added to the classpath of your mapper and reducer tasks.
Note: Make sure you copy the jar file to HDFS first.
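A minimal sketch of that variant, using the same imports as the sketch above (/myapp/mylib.jar is the example path from the line above, assumed to be already uploaded to HDFS):

// A path without a scheme is resolved against the default filesystem (HDFS),
// so the jar must already have been copied there with hadoop fs -put.
DistributedCache.addFileToClassPath(new Path("/myapp/mylib.jar"), job.getConfiguration());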
You can also add jar files to the classpath with the hadoop command-line option -libjars <jar_file>.
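For example (the application jar and driver class names are placeholders; multiple jars can be given as a comma-separated list, and -libjars must come before the application's own arguments):
hadoop jar my-mr-app.jar com.example.MyDriver -libjars /home/ubuntu/libs/javacv-0.9.jar /input /output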
Note: Make sure your MapReduce application is run through ToolRunner (i.e., its driver implements Tool) so that the -libjars option is parsed from the command line.
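A minimal driver skeleton, assuming the newer org.apache.hadoop.mapreduce API (the class name and job details are placeholders):

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already reflects the generic options (-libjars, -D, ...)
        // that ToolRunner parsed from the command line.
        Job job = Job.getInstance(getConf(), "my job");
        job.setJarByClass(MyDriver.class);
        // ... set mapper, reducer, and input/output paths from args here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options before passing the
        // remaining arguments to run().
        System.exit(ToolRunner.run(new MyDriver(), args));
    }
}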