Copying a directory from the local system to HDFS with Java code

Posted 2019-01-25 12:22

Question:

I'm having a problem copying a directory from my local system to HDFS using Java code. I can move individual files, but I can't figure out how to move an entire directory with its sub-folders and files. Can anyone help me with that? Thanks in advance.

Answer 1:

Just use the FileSystem's copyFromLocalFile method. If the source Path is a local directory, it is copied to the HDFS destination recursively, sub-folders and files included:

...
Configuration conf = new Configuration();
conf.addResource(new Path("/home/user/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/home/user/hadoop/conf/hdfs-site.xml"));

FileSystem fs = FileSystem.get(conf);
fs.copyFromLocalFile(new Path("/home/user/directory/"), 
  new Path("/user/hadoop/dir"));
...   


Answer 2:

Here is complete working code that writes to HDFS. It copies a single file, not a directory, and takes two arguments:

  1. Input path (local file; the code reads it with FileInputStream)

  2. Output path (HDFS)

I used the Cloudera sandbox.

 package hdfsread;

 import java.io.BufferedInputStream;
 import java.io.FileInputStream;
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;
 import java.net.URI;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.IOUtils;

 public class ReadingAFileFromHDFS {

     public static void main(String[] args) throws IOException {
         String uri = args[0];                   // local source file
         Path outputPath = new Path(args[1]);    // HDFS destination file

         Configuration myConf = new Configuration();
         myConf.set("fs.defaultFS","hdfs://quickstart.cloudera:8020");
         // A plain local path has no scheme, so this resolves to the
         // default file system configured above.
         FileSystem fSystem = FileSystem.get(URI.create(uri),myConf);

         // try-with-resources closes both streams, even if the copy fails.
         try (InputStream is = new BufferedInputStream(new FileInputStream(uri));
              OutputStream os = fSystem.create(outputPath)) {
             IOUtils.copyBytes(is, os, 4096, false);
         }
         catch(IOException e){
             e.printStackTrace();
         }
     }
 }
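
The stream-based code above handles one file per run. To apply it to a directory with sub-folders, the local tree has to be walked first. Below is a minimal sketch of that walk using only java.nio.file; the class name DirectoryCopyPlan and the method plan are hypothetical helpers made up for illustration, not part of Hadoop. It only works out which HDFS path each local file should get. Each resulting pair could then be passed to a single-file copy like the one above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper (not a Hadoop class): plans a recursive upload.
final class DirectoryCopyPlan {

    // Maps every regular file under localRoot to the path it should get
    // under hdfsRoot, preserving the sub-folder structure.
    static Map<Path, String> plan(Path localRoot, String hdfsRoot) throws IOException {
        Map<Path, String> targets = new TreeMap<>();
        String base = hdfsRoot.endsWith("/") ? hdfsRoot : hdfsRoot + "/";
        String separator = localRoot.getFileSystem().getSeparator();
        try (Stream<Path> walk = Files.walk(localRoot)) {
            walk.filter(Files::isRegularFile).forEach(file -> {
                // Path relative to the root, rewritten with '/' separators.
                String relative = localRoot.relativize(file).toString()
                        .replace(separator, "/");
                targets.put(file, base + relative);
            });
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: local directory, args[1]: HDFS directory
        for (Map.Entry<Path, String> e : plan(Paths.get(args[0]), args[1]).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Empty directories do not appear in the plan because only regular files are listed. If a plain recursive copy is all that is needed, copyFromLocalFile from Answer 1 is the shorter route.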


Tags: java hadoop hdfs