Goal
Download a file from an S3 bucket to the user's computer.
Context
I am working on a Python/Flask API for a React app. When the user clicks the Download button on the Front-End, I want to download the appropriate file to their machine.
What I've tried
import boto3
s3 = boto3.resource('s3')
s3.Bucket('mybucket').download_file('hello.txt', '/tmp/hello.txt')
I am currently using some code that finds the path of the downloads folder and then passes that path to download_file() as the second parameter, along with the key of the file on the bucket that the user is trying to download.
This worked locally, and tests ran fine, but I run into a problem once it is deployed. The code will find the downloads path of the SERVER, and download the file there.
Question
What is the best way to approach this? I have researched and cannot find a good solution for downloading a file from the S3 bucket to the user's downloads folder. Any help/advice is greatly appreciated.
You should not need to save the file to the server. You can just download the file into memory, and then build a Response object containing the file.
from flask import Flask, Response
from boto3 import client

app = Flask(__name__)


def get_client():
    return client(
        's3',
        'us-east-1',
        aws_access_key_id='id',
        aws_secret_access_key='key'
    )


@app.route('/blah', methods=['GET'])
def index():
    s3 = get_client()
    file = s3.get_object(Bucket='blah-test1', Key='blah.txt')
    return Response(
        file['Body'].read(),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )


app.run(debug=True, port=8800)
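The route above hardcodes the MIME type and the download filename. If the endpoint has to serve different keys, one possible way to derive both from the S3 key is sketched below. The helper name download_headers is hypothetical (it is not part of Flask or boto3), and the sketch assumes the last path segment of the key is a plain ASCII filename without quotes.

```python
import mimetypes
import posixpath


def download_headers(key):
    """Return (mimetype, headers) for serving an S3 key as an attachment.

    Hypothetical helper for illustration; not part of Flask or boto3.
    """
    # S3 keys use '/' as a separator regardless of the server's OS.
    filename = posixpath.basename(key)
    # guess_type returns (type, encoding); fall back to a generic binary type.
    mimetype = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
    headers = {
        'Content-Disposition': 'attachment; filename="{}"'.format(filename)
    }
    return mimetype, headers
```

The two return values could be passed to Response as mimetype= and headers= in place of the hardcoded ones.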
Reading the whole object into memory is fine for small files; there won't be any meaningful wait time for the user. With larger files, however, it will hurt the UX: the file has to be downloaded completely to the server before any of it is sent to the user. To fix this, use the Range keyword argument of the get_object method:
from flask import Flask, Response
from boto3 import client

app = Flask(__name__)

CHUNK_SIZE = 1000000  # bytes fetched from S3 per request


def get_client():
    return client(
        's3',
        'us-east-1',
        aws_access_key_id='id',
        aws_secret_access_key='key'
    )


def get_total_bytes(s3):
    # head_object returns the object's metadata without downloading its body.
    return s3.head_object(Bucket='blah-test1', Key='blah.txt')['ContentLength']


def get_object(s3, total_bytes):
    # Large files are streamed in chunks; small files are read in one request.
    if total_bytes > CHUNK_SIZE:
        return get_object_range(s3, total_bytes)
    return s3.get_object(Bucket='blah-test1', Key='blah.txt')['Body'].read()


def get_object_range(s3, total_bytes):
    offset = 0
    while offset < total_bytes:
        # Byte ranges are inclusive on both ends.
        end = min(offset + CHUNK_SIZE, total_bytes) - 1
        byte_range = 'bytes={offset}-{end}'.format(offset=offset, end=end)
        yield s3.get_object(Bucket='blah-test1', Key='blah.txt', Range=byte_range)['Body'].read()
        offset = end + 1


@app.route('/blah', methods=['GET'])
def index():
    s3 = get_client()
    total_bytes = get_total_bytes(s3)
    return Response(
        get_object(s3, total_bytes),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )


app.run(debug=True, port=8800)
This will download the file in 1MB chunks and send them to the user as they are downloaded. Both of these have been tested with a 40MB .txt file.
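To make the chunking concrete, here is a small sketch of how an object size maps to the Range values sent to S3. The function name byte_ranges and its chunk_size parameter are hypothetical and used only for illustration; the one fact it relies on is that HTTP byte ranges are inclusive on both ends.

```python
def byte_ranges(total_bytes, chunk_size=1000000):
    """Yield HTTP Range header values that cover total_bytes in order.

    Hypothetical helper for illustration; it makes no S3 calls.
    """
    offset = 0
    while offset < total_bytes:
        # The end index is inclusive, so a full chunk ends at offset + chunk_size - 1.
        end = min(offset + chunk_size, total_bytes) - 1
        yield 'bytes={}-{}'.format(offset, end)
        offset = end + 1


for value in byte_ranges(2500000):
    print(value)
# bytes=0-999999
# bytes=1000000-1999999
# bytes=2000000-2499999
```

The last range is shorter than the others because it stops at the final byte of the object.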