Django StreamingHttpResponse format error

Posted 2019-07-25 04:05

Question:

I have a simple Django view for downloading a file from Amazon S3. A test that saves the file locally worked fine:

def some_view(request):

    res = s3.get_object(...)

    try:
        s3_file_content = res['Body'].read()
        with open("/Users/yanik/ololo.jpg", 'wb') as f:
            f.write(s3_file_content)
            # file saved and I can view it
    except:
        pass

When I switch to StreamingHttpResponse, I get a file in an incorrect format (it can't be opened) and even the wrong size (if the original is a 317 KB image, the output is around 620 KB):

def some_view(request):

    res = s3.get_object(...)

    response = StreamingHttpResponse(res['Body'].read(), content_type=res['ContentType'])
    response['Content-Disposition'] = 'attachment;filename=' + 'ololo.jpg'
    response['ContentLength'] = res['ContentLength']
    return response

I tried many different settings, but so far nothing has worked. The output file is broken.

UPDATE

I managed to get more debugging information. If I change the file mode in the first sample from 'wb' to 'w', I get the same output as with StreamingHttpResponse (the first view generates the same broken file). So it looks like I must tell the HTTP header that my output is in binary format.

UPDATE (getting to the core of the problem)

Now I understand the problem, but I still don't have a solution. res['Body'].read() returns a bytes object, and StreamingHttpResponse iterates over it, which yields integer byte values rather than bytes. So my incoming bytes '...\x05cgr\xb8=:\xd0\xc3\x97U\xf4\xf3\xdc\xf0*\xd4@\xff\xd9' are converted to a sequence like [ ... , 195, 151, 85, 244, 243, 220, 240, 42, 212, 64, 255, 217], and each number is then written out as text and concatenated. Screenshot: http://take.ms/JQztk As you can see, the list elements appear at the end.

StreamingHttpResponse.make_bytes
"""Turn a value into a bytestring encoded in the output charset."""
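The effect described above can be reproduced without Django or S3. The sketch below is only an illustration: the make_bytes function here is a simplified stand-in written for this example (an assumption about the behavior, not Django's actual source), and tail is a made-up two-byte payload taken from the end of the byte string quoted above.

```python
def make_bytes(value, charset="utf-8"):
    """Simplified stand-in for the conversion step (assumption, not Django's code)."""
    if isinstance(value, (bytes, memoryview)):
        return bytes(value)
    if isinstance(value, str):
        return value.encode(charset)
    # Anything else, including int, goes through str() first.
    return str(value).encode(charset)


tail = b"\xff\xd9"  # last two bytes of the payload quoted in the question

# Iterating over a bytes object yields integers, not one-byte bytes objects.
items = list(tail)

# Each integer is turned into its decimal text form and concatenated.
broken = b"".join(make_bytes(item) for item in items)

# Wrapping the payload in a list makes the iterator yield the bytes object itself.
intact = b"".join(make_bytes(chunk) for chunk in [tail])

print(items, broken, intact)
```

Each original byte ends up as one to three ASCII digits, which would explain why the downloaded file is larger than the original and cannot be opened as an image.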

Answer 1:

I'm still not sure what is going on, but FileWrapper from https://stackoverflow.com/a/8601118/2576817 works fine for the boto3 StreamingBody response type.

An explanation of this odd behavior is welcome, if anyone has the courage to attempt it.
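A possible reason FileWrapper helps: it reads the file-like object in fixed-size blocks, so every item the response iterates over is already a bytes object. The sketch below assumes the standard-library wsgiref.util.FileWrapper (the linked answer may import it from elsewhere) and uses an in-memory io.BytesIO as a stand-in for res['Body'].

```python
import io
from wsgiref.util import FileWrapper

# Stand-in for res['Body']: any object with a read(size) method will do.
payload = bytes(range(256)) * 4  # 1024 bytes of binary data
body = io.BytesIO(payload)

# FileWrapper calls body.read(blksize) repeatedly until it returns b"".
chunks = list(FileWrapper(body, blksize=256))

print(len(chunks), type(chunks[0]))
```

In a view, the wrapper would presumably be passed in place of the raw bytes, for example StreamingHttpResponse(FileWrapper(res['Body']), content_type=res['ContentType']).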



Answer 2:

https://docs.djangoproject.com/en/1.9/ref/request-response/#streaminghttpresponse-objects

StreamingHttpResponse needs an iterator. I think if your file is binary (an image), then StreamingHttpResponse is not the best solution, or you should split the file into chunks.

A bytes object is iterable, but iterating over it yields individual bytes; you probably want to iterate over lines or chunks rather than single bytes/characters.

I'm not sure whether your file is line-based text data, but if it is, you could create a generator to iterate over the file-like object:

def line_generator(file_like_obj):
    for line in file_like_obj:
        yield line

and feed that generator to the StreamingHttpResponse:

def some_view(request):
    res = s3.get_object(...)
    response = StreamingHttpResponse(line_generator(res['Body']), ...)
    return response
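
For binary data such as an image, splitting on lines is not meaningful, so a chunk-based variant of the generator may be a better fit. This is a minimal sketch: chunk_generator and its chunk_size default are hypothetical names chosen for this example, and io.BytesIO stands in for res['Body'].

```python
import io


def chunk_generator(file_like_obj, chunk_size=8192):
    """Yield successive bytes chunks read from a file-like object."""
    while True:
        chunk = file_like_obj.read(chunk_size)
        if not chunk:
            # read() returns b"" at end of stream.
            break
        yield chunk


# Stand-in for res['Body']; 20000 bytes splits into 8192 + 8192 + 3616.
data = b"\xff\xd8" + b"\x00" * 19996 + b"\xff\xd9"
sizes = [len(chunk) for chunk in chunk_generator(io.BytesIO(data))]
rebuilt = b"".join(chunk_generator(io.BytesIO(data), chunk_size=1000))

print(sizes)
```

The generator would then be passed to the response in the same way as line_generator above. As far as I know, recent botocore versions also offer an iter_chunks method on StreamingBody that serves the same purpose.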