I am trying to run the word count example in Eclipse. Normally, when I click the "Run on Hadoop" option in Eclipse, a new window opens asking me to select a server location. Now, however, it runs the program directly without asking me to choose an existing server from the list.
I think this is why I am getting the following exception:
13/04/21 08:46:31 ERROR security.UserGroupInformation: PriviledgedActionException as:hduser1 cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/hduser1/gutenbergIP/pg4300.txt
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/hduser1/gutenbergIP/pg4300.txt
My code works if I change the line from:
FileInputFormat.setInputPaths(conf, "/home/hduser1/gutenbergIP/pg4300.txt");
to:
FileInputFormat.setInputPaths(conf, "hdfs://localhost:54310/home/hduser1/gutenbergIP/pg4300.txt");
If I explicitly specify the file with its full URL, it works. How can I make the short path (without the hdfs://localhost:54310 prefix) work instead of giving the full URL? I have to submit this as a school assignment.
Add the following two lines to your code:
If you don't specify this, your client falls back to the local file system (note the file: scheme in the error message), which doesn't contain the specified path, hence the error.
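To make the reasoning concrete, here is a minimal sketch of why the same path string ends up pointing at two different places. It uses only java.net.URI, not Hadoop's own classes, so it illustrates the idea rather than Hadoop's actual implementation. The class name DefaultFsSketch and the helper qualify are hypothetical; the first argument stands in for the client's default filesystem setting (fs.default.name in Hadoop 1.x, fs.defaultFS in later releases).

```java
import java.net.URI;

public class DefaultFsSketch {

    // Hypothetical helper: a path that already carries a scheme is left alone;
    // a path without one is resolved against the default filesystem URI.
    public static String qualify(String defaultFs, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p.toString();
        }
        return URI.create(defaultFs).resolve(p).toString();
    }

    public static void main(String[] args) {
        String input = "/home/hduser1/gutenbergIP/pg4300.txt";

        // No cluster configuration loaded: the default is the local file system.
        System.out.println(qualify("file:///", input));

        // Default filesystem pointing at the NameNode from the question.
        System.out.println(qualify("hdfs://localhost:54310", input));
    }
}
```

With the HDFS default, the second call yields the same full URL that already works in the question, which is why loading the cluster configuration lets the short path stay in the code.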