I've searched for some time now and none of the solutions seem to work for me.
Pretty straightforward - I want to upload data from my local file system to HDFS using the Java API. The Java program will be run on a host that has been configured to talk to a remote Hadoop cluster through the shell (i.e. hdfs dfs -ls, etc.).
I have included the following dependencies in my project:
hadoop-core:1.2.1
hadoop-common:2.7.1
hadoop-hdfs:2.7.1
I have code that looks like the following:
File localDir = ...;
File hdfsDir = ...;
Path localPath = new Path(localDir.getCanonicalPath());
Path hdfsPath = new Path(hdfsDir.getCanonicalPath());
Configuration conf = new Configuration();
conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
FileSystem fs = FileSystem.get(conf);
fs.copyFromLocalFile(localPath, hdfsPath);
The local data is not being copied to the Hadoop cluster, but no errors are reported and no exceptions are thrown. I've enabled TRACE logging for the org.apache.hadoop package. I see the following output:
DEBUG Groups:139 - Creating new Groups object
DEBUG Groups:139 - Creating new Groups object
DEBUG Groups:59 - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
DEBUG Groups:59 - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
DEBUG UserGroupInformation:147 - hadoop login
DEBUG UserGroupInformation:147 - hadoop login
DEBUG UserGroupInformation:96 - hadoop login commit
DEBUG UserGroupInformation:96 - hadoop login commit
DEBUG UserGroupInformation:126 - using local user:UnixPrincipal: willra05
DEBUG UserGroupInformation:126 - using local user:UnixPrincipal: willra05
DEBUG UserGroupInformation:558 - UGI loginUser:<username_redacted>
DEBUG UserGroupInformation:558 - UGI loginUser:<username_redacted>
DEBUG FileSystem:1441 - Creating filesystem for file:///
DEBUG FileSystem:1441 - Creating filesystem for file:///
DEBUG FileSystem:1290 - Removing filesystem for file:///
DEBUG FileSystem:1290 - Removing filesystem for file:///
DEBUG FileSystem:1290 - Removing filesystem for file:///
DEBUG FileSystem:1290 - Removing filesystem for file:///
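The only filesystem I see being created in that output is file:///. As a sanity check, something like the following could be added after the copy call to confirm what FileSystem.get(conf) actually resolved (just a sketch using the standard FileSystem API; the expectation of an hdfs:// URI is my assumption about what a correctly configured client should report):
// Sketch: confirm which filesystem was resolved and whether the destination exists.
System.out.println("Filesystem URI: " + fs.getUri());              // I would expect hdfs://<namenode>:<port>
System.out.println("fs.defaultFS: " + conf.get("fs.defaultFS"));   // called fs.default.name in the old hadoop-core 1.x line
System.out.println("Destination exists: " + fs.exists(hdfsPath));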
Can anyone help me resolve this issue?
EDIT 1: (09/15/2015)
I've removed 2 of the Hadoop dependencies - I'm only using one now:
hadoop-core:1.2.1
My code is now the following:
File localDir = ...;
File hdfsDir = ...;
Path localPath = new Path(localDir.getCanonicalPath());
Path hdfsPath = new Path(hdfsDir.getCanonicalPath());
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
fs.copyFromLocalFile(localPath, hdfsPath);
I was previously executing my application with the following command:
$ java -jar <app_name>.jar <app_arg1> <app_arg2> ...
Now I'm executing it with this command:
$ hadoop jar <app_name>.jar <app_arg1> <app_arg2> ...
With these changes, my application now interacts with HDFS as intended. To my knowledge, the hadoop jar command is meant only for MapReduce jobs packaged as an executable JAR, but these changes did the trick for me.
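If I understand the difference correctly, hadoop jar puts the cluster's configuration directory and the Hadoop libraries on the classpath, so Configuration picks up core-site.xml and hdfs-site.xml automatically, whereas my plain java -jar launch did not. Presumably the same effect could be achieved under java -jar by loading the cluster configuration explicitly, roughly like this (the /etc/hadoop/conf location is only a guess; the files may live wherever $HADOOP_CONF_DIR points on the host):
// Sketch: load the cluster's configuration files explicitly instead of relying
// on the classpath that `hadoop jar` sets up. The paths below are assumptions.
Configuration conf = new Configuration();
conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

FileSystem fs = FileSystem.get(conf); // should now resolve to the cluster rather than file:///
fs.copyFromLocalFile(localPath, hdfsPath);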