Hadoop REST API for upload/download

Published 2019-09-18 10:51

Question:

I am trying to upload and download files to/from a Hadoop cluster from a C# app, but I couldn't find APIs for upload and download in the documentation.

Can you please let me know how to upload and download files from Hadoop using the REST APIs?

Thanks

Answer 1:

You can use the WebHDFS REST API, as described here: http://hadoop.apache.org/docs/r1.0.4/webhdfs.html

Edit:

Create and Write to a File

Step 1:

Submit an HTTP PUT request to the namenode, without automatically following redirects and without sending the file data.

curl -i -X PUT "http://:/webhdfs/v1/?op=CREATE [&overwrite=][&blocksize=][&replication=] [&permission=][&buffersize=]"

The request is redirected to a datanode where the file data is to be written:

HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE...
Content-Length: 0
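For the C# side of the question, here is a minimal sketch of this first step, assuming .NET's HttpClient; the <HOST>, <PORT> and <PATH> values are placeholders you would substitute for your own cluster, and op=CREATE's optional parameters are omitted:

// Step 1: ask the namenode where to write; do NOT follow the redirect automatically.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class WebHdfsUploadStep1
{
    static async Task<Uri> GetDataNodeUploadUri()
    {
        var handler = new HttpClientHandler { AllowAutoRedirect = false };
        using var client = new HttpClient(handler);

        // Placeholder namenode address and HDFS path; replace before running.
        var createUri = "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE&overwrite=true";

        // Empty body: the file data is only sent in step 2.
        var response = await client.PutAsync(createUri, new ByteArrayContent(Array.Empty<byte>()));

        // Expect 307 TEMPORARY_REDIRECT; the datanode URL is in the Location header.
        return response.Headers.Location;
    }
}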

Step 2:

Submit another HTTP PUT request using the URL in the Location header with the file data to be written.

curl -i -X PUT -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..."

The client receives a 201 Created response with zero content length and the WebHDFS URI of the file in the Location header:

HTTP/1.1 201 Created
Location: webhdfs://<HOST>:<PORT>/<PATH>
Content-Length: 0
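Continuing the C# sketch for step 2: dataNodeUri is the Location value returned in step 1, and localPath is a hypothetical local file to upload; both names are mine, not part of the API.

// Step 2: PUT the actual file bytes to the datanode URL from the Location header.
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class WebHdfsUploadStep2
{
    static async Task UploadFile(Uri dataNodeUri, string localPath)
    {
        using var client = new HttpClient();
        using var fileStream = File.OpenRead(localPath);

        var response = await client.PutAsync(dataNodeUri, new StreamContent(fileStream));

        // A successful write returns 201 Created with the webhdfs:// URI in the Location header.
        response.EnsureSuccessStatusCode();
    }
}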

Note that the reason for the two-step create/append is to prevent clients from sending data before the redirect. This issue is addressed by the "Expect: 100-continue" header in HTTP/1.1; see RFC 2616, Section 8.2.3. Unfortunately, some software libraries (e.g. the Jetty 6 HTTP server and the Java 6 HTTP client) do not correctly implement "Expect: 100-continue", so the two-step create/append is a temporary workaround for those bugs.
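For the download half of the question, the same WebHDFS documentation page describes op=OPEN for reading a file. A minimal C# sketch, again with placeholder host, port and path, and with the redirect to the datanode simply followed automatically:

// Download: GET with op=OPEN; HttpClient follows the 307 redirect to the datanode by default.
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class WebHdfsDownload
{
    static async Task DownloadFile(string localTarget)
    {
        using var client = new HttpClient();

        // Placeholder namenode address and HDFS path of the file to read; replace before running.
        var openUri = "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN";

        var bytes = await client.GetByteArrayAsync(openUri);
        File.WriteAllBytes(localTarget, bytes);
    }
}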