Copy files from FTP to HDFS

Published 2019-06-25 12:57

Question:

I need to copy files from an FTP server outside my cluster into Hadoop, i.e. HDFS.

Thanks in advance.

Answer 1:

Have you tried this? FTP TO HDFS ... You can modify this code so that the FTP file is passed as args[0] and the HDFS path as args[1], and then run it with hadoop jar. Hope this helps.



Answer 2:

Have you looked at WebHDFS (http://hadoop.apache.org/docs/r1.0.4/webhdfs.html) or HttpFS (http://hadoop.apache.org/docs/r2.2.0/hadoop-hdfs-httpfs/index.html)?

These services need access to the Hadoop cluster. You could then expose the HttpFS port to a server that also has access to the FTP server.
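As a rough sketch of the HttpFS route: a machine that can reach both the FTP server and the HttpFS port could stream the file straight through with curl, without storing it locally. The function names, host, port and user below are placeholders I made up for illustration. The URL shape (`/webhdfs/v1<path>?op=CREATE&data=true`) follows the HttpFS/WebHDFS REST API, and I am assuming simple `user.name` authentication rather than Kerberos.

```shell
# Sketch only: helper names and parameters are illustrative, not part of Hadoop.

# Build the HttpFS "create file" URL.
# $1 = HttpFS host, $2 = HttpFS port, $3 = absolute HDFS path, $4 = HDFS user
httpfs_create_url() {
    printf 'http://%s:%s/webhdfs/v1%s?op=CREATE&data=true&user.name=%s\n' \
        "$1" "$2" "$3" "$4"
}

# Stream one FTP file into HDFS through HttpFS.
# $1 = FTP URL (ftp://uid:password@server_url/file_path), $2 = URL from httpfs_create_url
ftp_to_httpfs() {
    # First curl writes the FTP file to stdout; second curl uploads stdin
    # ("-T -") with an HTTP PUT. HttpFS expects this Content-Type on uploads.
    # Note: a plain pipeline reports only the uploader's exit status, so a
    # failed download can still leave an empty or partial file in HDFS.
    curl -sS --fail "$1" |
        curl -sS --fail -T - -H 'Content-Type: application/octet-stream' "$2"
}
```

Usage would look something like `ftp_to_httpfs "ftp://uid:password@server_url/file_path" "$(httpfs_create_url httpfs_host 14000 /hadoop_path/dest_file hdfs_user)"`, where 14000 is the usual HttpFS default port; check what your cluster actually uses.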



Answer 3:

Try this:

hadoop fs -get ftp://uid:password@server_url/file_path temp_file && hadoop fs -moveFromLocal temp_file hadoop_path/dest_file
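If you need to do this more than once, the same two steps (download to a local file, then move it into HDFS) can be wrapped in a small function that stops when the download fails and cleans up its staging file. This is only a sketch: `ftp_to_hdfs` and `HADOOP_CMD` are names I invented, and it assumes, as the command above does, that `hadoop fs` can read `ftp://` URLs in your installation.

```shell
# Sketch only: ftp_to_hdfs and HADOOP_CMD are illustrative names, not Hadoop features.
# $1 = FTP URL, e.g. ftp://uid:password@server_url/file_path
# $2 = HDFS destination, e.g. hadoop_path/dest_file
ftp_to_hdfs() {
    hadoop_cmd="${HADOOP_CMD:-hadoop}"   # override HADOOP_CMD to use another binary

    # Stage the file in a private temporary directory.
    stage_dir=$(mktemp -d) || return 1
    stage_file="$stage_dir/staged"

    # Step 1: download from FTP to local disk; give up if this fails.
    if ! "$hadoop_cmd" fs -get "$1" "$stage_file"; then
        rm -rf "$stage_dir"
        return 1
    fi

    # Step 2: move the local copy into HDFS, then clean up either way.
    "$hadoop_cmd" fs -moveFromLocal "$stage_file" "$2"
    status=$?
    rm -rf "$stage_dir"
    return "$status"
}
```

Call it as `ftp_to_hdfs "ftp://uid:password@server_url/file_path" hadoop_path/dest_file`. Keep in mind that the whole file passes through local disk, so the machine running it needs enough free space for the largest file.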



Tags: hadoop ftp