Fetching images from URL and saving on server and/

2019-09-20 09:09发布

问题:

I'm not seeing much documentation on this. I'm trying to get an image uploaded onto server from a URL. Ideally I'd like to make things simple but I'm in two minds as to whether using an ImageField is the best way or simpler to simply store the file on the server and display it as a static file. I'm not uploading anyfiles so I need to fetch them in. Can anyone suggest any decent code examples before I try and re-invent the wheel?

Given an URL say http://www.xyx.com/image.jpg, I'd like to download that image to the server, put it into a suitable location after renaming. My question is general as I'm looking for examples of what people have already done. So far I just see examples relating to uploading images, but that doesn't apply. This should be a simple case and I'm looking for a canonical example that might help.

This is for uploading an image from the user: Django: Image Upload to the Server

So are there any examples out there that just deal with the process of fetching and image and storing on the server and/or ImageField.

回答1:

Well, just fetching an image and storing it into a file is straightforward:

import urllib2
with open('/path/to/storage/' + make_a_unique_name(), 'w') as f:
    f.write(urllib2.urlopen(your_url).read())

Then you need to configure your Web server to serve files from that directory.

But this comes with security risks.

A malicious user could come along and type a URL that points nowhere. Or that points to their own evil server, which accepts your connection but never responds. This would be a typical denial of service attack.

A naive fix could be:

urllib2.urlopen(your_url, timeout=5)

But then the adversary could build a server that accepts a connection and writes out a line every second indefinitely, never stopping. The timeout doesn’t cover that.

So a proper solution is to run a task queue, also with timeouts, and a carefully chosen number of workers, all strictly independent of your Web-facing processes.

Another kind of attack is to point your server at something private. Suppose, for the sake of example, that you have an internal admin site that is running on port 8000, and it is not accessible to the outside world, but it is accessible to your own processes. Then I could type http://localhost:8000/path/to/secret/stats.png and see all your valuable secret graphs, or even modify something. This is known as server-side request forgery or SSRF, and it’s not trivial to defend against. You can try parsing the URL and checking the hostname against a blacklist, or explicitly resolving the hostname and making sure it doesn’t point to any of your machines or networks (including 127.0.0.0/8).

Then of course, there is the problem of validating that the file you receive is actually an image, not an HTML file or a Windows executable. But this is common to the upload scenario as well.