I'm trying to upload a binary file to a Flask endpoint without using any type of multipart/form-data. I'd like to simply POST or PUT the data inside the file to the endpoint, and save it to a file on the server. The only examples I can find, and the only method discussed in other questions, use multipart/form-data.
The following "works", but the SHA256 hashes usually don't match, whereas uploading as form-data works fine.
@application.route("/rupload/<filename>", methods=['POST', 'PUT'])
def rupload(filename):
    # Sanity checks and setup skipped.
    filename = secure_filename(filename)
    fileFullPath = os.path.join(UPLOAD_FOLDER, filename)
    with open(fileFullPath, 'wb') as f:
        f.write(request.get_data())
    return jsonify({
        'filename': filename,
        'size': os.path.getsize(fileFullPath)
    })
Additionally, the above is very inefficient with memory. Is there a way to write it to the output file via some type of buffered stream? Thanks!
Edit: This is how I'm testing this:
curl -v -H 'Content-Type: application/octet-stream' -X POST --data @test.zip https://example.com/test/rupload/test.zip
Edit: --data-binary makes no difference.
Have you tried using hashlib?
import hashlib
...
@application.route("/rupload/<filename>", methods=['POST', 'PUT'])
def rupload(filename):
    # Sanity checks and setup skipped.
    filename = secure_filename(filename)
    fileFullPath = os.path.join(UPLOAD_FOLDER, filename)
    file_hash = hashlib.sha256()
    with open(fileFullPath, 'wb+') as f:
        data = request.get_data()  # raw request body as bytes
        f.write(data)
        file_hash.update(data)
    ...
    fileDigest = file_hash.hexdigest()
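To compare the server-side digest with the original file, the same SHA256 can be computed locally. The following is only a minimal sketch: the helper name sha256_of_file, the chunk size, and the temporary demo file are assumptions for illustration, not part of the code above.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=64 * 1024):
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo with a throwaway file standing in for the file being uploaded.
demo_payload = b"\x00\x01\r\n" * 50000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(demo_payload)
    demo_path = tmp.name

local_digest = sha256_of_file(demo_path)
os.remove(demo_path)
```

If the value returned for the local file differs from the fileDigest computed on the server, the bytes were changed somewhere between the client and the write to disk.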
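The issue might be the curl command you are using. The man page recommends --data-binary: "This posts data exactly as specified with no extra processing whatsoever." The --data parameter is a synonym for --data-ascii, and when it reads from a file via @, curl strips carriage returns and newlines, which would change the bytes of a binary file. You probably don't need the -X POST parameter either, since curl defaults to POST when a data option is given.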
curl -v -H 'Content-Type: application/octet-stream' -X POST --data-binary @test.zip https://example.com/test/rupload/test.zip
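To see why processed data would break the hash, here is a rough simulation in Python. It only approximates the documented behavior of --data with @file (carriage returns and newlines removed); it is not curl's actual implementation, and the function name and sample payload are made up for illustration.

```python
import hashlib


def simulate_curl_data_from_file(payload: bytes) -> bytes:
    """Approximate `curl --data @file`: drop CR and LF bytes.

    Illustration only; this is not how curl is implemented.
    """
    return payload.replace(b"\r", b"").replace(b"\n", b"")


# Hypothetical binary payload that happens to contain CR/LF bytes.
original = b"PK\x03\x04\r\n\x00binary\ndata\x00"
sent = simulate_curl_data_from_file(original)

original_digest = hashlib.sha256(original).hexdigest()
sent_digest = hashlib.sha256(sent).hexdigest()

hashes_match = original_digest == sent_digest
size_difference = len(original) - len(sent)  # number of bytes dropped
```

Any binary file containing 0x0D or 0x0A bytes would arrive shorter than the original, so the size reported by the endpoint is a quick first check before comparing hashes.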
The request.get_data call has additional options that could help in case the defaults were overridden somewhere, although the server side looks like it should work as written. Disabling the cache in particular might benefit your use case.
f.write(request.get_data(cache=False, as_text=False, parse_form_data=False))
If the problem is on the server side, you might have to dig deeper into Werkzeug's get_input_stream, which is what feeds the request object.
It's good that the curl command sets the Content-Type header to 'application/octet-stream'. Flask doesn't appear to do anything with it in this case, but in more general usage it can help keep the data from being mangled by proxies or other intermediaries.
Regarding handling large files efficiently, you might want to look at the request.stream property, which is what get_data reads from internally.
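A chunked copy from the stream could look roughly like the sketch below. The helper name save_stream and the 64 KiB chunk size are assumptions, and the helper accepts any file-like object so it can be tried without a running Flask app; here an in-memory io.BytesIO stands in for the request body.

```python
import hashlib
import io
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # assumed value; tune for your workload


def save_stream(stream, dest_path, chunk_size=CHUNK_SIZE):
    """Copy a file-like object to dest_path in fixed-size chunks.

    Returns (bytes_written, sha256_hexdigest) without ever holding
    the whole body in memory.
    """
    digest = hashlib.sha256()
    total = 0
    with open(dest_path, "wb") as f:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # empty bytes means end of stream
                break
            f.write(chunk)
            digest.update(chunk)
            total += len(chunk)
    return total, digest.hexdigest()


# Demo: an in-memory stream stands in for the request body.
payload = os.urandom(200_000)
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "upload.bin")
    size, digest = save_stream(io.BytesIO(payload), out_path, chunk_size=4096)
    with open(out_path, "rb") as fh:
        written = fh.read()
```

Inside the rupload view this would be called as save_stream(request.stream, fileFullPath) in place of request.get_data(), since the stream can only be consumed once. The returned digest can then be included in the JSON response for the client to compare.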