Question:
I want to allow registered users of a PHP website to upload files (documents) that will be publicly available for download. In this context, is keeping the file's original name a vulnerability? If it is, I would like to know why, and how to get rid of it.
Answer 1:
That depends on where you store the filename. If you store the name in a database, in a strictly typed variable, and then HTML-encode it before you display it on a web page, there won't be any issues.
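As a minimal sketch of the "HTML-encode before you display it" step, assuming the name has already been read back from the database (the helper name `escapeFilenameForHtml` and the sample value are made up for illustration):

```php
<?php
// Hypothetical helper: escape a stored filename before printing it into HTML.
function escapeFilenameForHtml(string $filename): string
{
    // ENT_QUOTES escapes both quote styles, so the result is also usable inside attributes.
    // ENT_SUBSTITUTE replaces invalid UTF-8 sequences instead of returning an empty string.
    return htmlspecialchars($filename, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Sample value standing in for a name fetched from the database.
$stored = '<script>alert(1)</script>.doc';

echo '<li>' . escapeFilenameForHtml($stored) . '</li>', PHP_EOL;
```

The escaping happens at output time, so the database can keep the name exactly as the user supplied it.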
Answer 2:
The file names themselves could reveal potentially sensitive information. Some companies and people follow naming conventions for their documents, so you might end up with:
- Author name (court-order-john.smith.doc)
- Company name (sensitive-information-enterprisename.doc)
- File creation date (letter.2012-03-29.pdf)
You get the point, and you can probably think of other information people put in their filenames.
Depending on what your site is about, this could become an issue (consider what would happen if WikiLeaks published leaked documents that had the original source somewhere inside the filename).
If you decide to hide the filename, you must consider the problem of somebody submitting an executable as a document, and how you make sure people know what they are downloading.
Answer 3:
While this is an old question, it's surprisingly high on the list of search results when looking for 'security file names', so I'd like to expand on the existing answers:
Yes, it's almost surely a vulnerability.
There are several possible problems you might encounter if you try to store a file using its original filename:
- the filename could be a reserved or special file name. What happens if a user uploads a file called `.htaccess` that tells the webserver to parse all `.gif` files as PHP, then uploads a `.gif` file with a GIF comment of `<?php /* ... */ ?>`?
- the filename could contain `../`. What happens if a user uploads a file with the 'name' `../../../../../etc/cron.d/foo`? (This particular example should be caught by system permissions, but do you know all locations that your system reads configuration files from?)
  - if the user the web server runs as (let's call it `www-data`) is misconfigured and has a shell, how about `../../../../../home/www-data/.ssh/authorized_keys`? (Again, this particular example should be guarded against by SSH itself (and possibly the folder not existing), since the `authorized_keys` file needs very particular file permissions; but if your system is set up to give restrictive file permissions by default (tricky!), then that won't be the problem.)
- the filename could contain the `\x00` byte, or control characters. System programs may not respond to these as expected - e.g. a simple `ls -al | cat` (not that I know why you'd want to execute that, but a more complex script might contain a sequence that ultimately boils down to this) might execute commands.
- the filename could end in `.php` and be executed once someone tries to download the file. (Don't try blacklisting extensions.)
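To make the `../` case concrete, here is a small sketch that resolves a path purely as a string, the way the filesystem would, without touching the disk. The helper names `resolvePath` and `staysInside` and the upload directory are assumptions for illustration only.

```php
<?php
// Hypothetical helper: lexically resolve "." and ".." segments of an absolute path.
function resolvePath(string $path): string
{
    $parts = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;               // empty and "." segments change nothing
        }
        if ($segment === '..') {
            array_pop($parts);      // ".." climbs one level (no-op at the root)
            continue;
        }
        $parts[] = $segment;
    }
    return '/' . implode('/', $parts);
}

// Hypothetical helper: does the resolved target still live under $dir?
function staysInside(string $dir, string $target): bool
{
    return strncmp($target, $dir . '/', strlen($dir) + 1) === 0;
}

$uploadDir = '/var/www/site/uploads';            // assumed upload location
$evilName  = '../../../../../etc/cron.d/foo';    // the 'name' from the list above

$target = resolvePath($uploadDir . '/' . $evilName);
echo $target, PHP_EOL;                           // the path the write would really hit
var_dump(staysInside($uploadDir, $target));
```

Naive concatenation of the upload directory and the client-supplied name ends up outside the upload directory, which is why the name should never be used as a path component as-is.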
The way to handle this is to generate the filenames yourself (e.g. `md5()` on the file contents or on the original filename). If you absolutely must keep the original filename, then, to the best of your ability, whitelist the file extension, check the file's MIME type, and whitelist the characters that can be used in the filename.
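The following is a rough sketch of both options, under stated assumptions: the whitelist contents, the helper names `storageNameFor` and `sanitizeOriginalName`, and the use of SHA-256 via `hash_file()` instead of the `md5()` mentioned above are all illustrative choices, and the `fileinfo` extension is assumed to be enabled.

```php
<?php
// Assumed whitelist: allowed extension => MIME types accepted for it.
const ALLOWED_TYPES = [
    'pdf' => ['application/pdf'],
    'txt' => ['text/plain'],
];

// Hypothetical helper: return a server-chosen storage name, or null if the upload is rejected.
function storageNameFor(string $tmpPath, string $originalName): ?string
{
    // Whitelist the extension taken from the client-supplied name.
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    if (!array_key_exists($ext, ALLOWED_TYPES)) {
        return null;
    }

    // Check the MIME type detected from the file contents, not the one the client sent.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime  = $finfo->file($tmpPath);
    if ($mime === false || !in_array($mime, ALLOWED_TYPES[$ext], true)) {
        return null;
    }

    // The stored name is derived from the contents, so the client has no control over it.
    return hash_file('sha256', $tmpPath) . '.' . $ext;
}

// Hypothetical helper: character whitelist for cases where the original name must be kept.
function sanitizeOriginalName(string $name): string
{
    $base = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($name));
    $base = ltrim($base, '.');          // no leading dots, so no .htaccess
    return $base === '' ? 'file' : $base;
}
```

In an upload handler, `$tmpPath` would typically be `$_FILES['doc']['tmp_name']` and `$originalName` would be `$_FILES['doc']['name']` (the field name `doc` is hypothetical), with the result passed to `move_uploaded_file()`.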
Alternatively, you can generate the filename yourself when you store the file and for use in the URL that people use to download the file (although if this is a file-serving script, you should avoid letting people specify filenames here anyway, so that no one downloads your `../../../../../etc/passwd` or other files of interest), but keep the original filename stored in the database for display somewhere. In this case, you only have SQL injection and XSS to worry about, which is ground that the other answers have already covered.
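A file-serving script along those lines might look roughly like the sketch below. The array stands in for a database table, and the ids, stored names, directory, and helper names (`pathForId`, `contentDisposition`) are all made up for illustration.

```php
<?php
// Stand-in for a database table: id => stored (generated) name and original name.
$files = [
    17 => ['stored' => '3f2a9c.pdf', 'original' => 'report "final".pdf'],
];

// Hypothetical helper: map a requested id to a path; the client never supplies a filename.
function pathForId(array $files, string $uploadDir, $id): ?string
{
    $id = filter_var($id, FILTER_VALIDATE_INT);
    if ($id === false || !isset($files[$id])) {
        return null;    // not an integer, or unknown id
    }
    return $uploadDir . '/' . $files[$id]['stored'];
}

// Hypothetical helper: build a download header that carries the original name.
function contentDisposition(string $originalName): string
{
    // ASCII fallback: replace control/non-ASCII bytes, quotes and backslashes.
    $fallback = preg_replace('/[^\x20-\x7E]|["\\\\]/', '_', $originalName);
    // filename* carries the full name, percent-encoded.
    return 'Content-Disposition: attachment; filename="' . $fallback
        . "\"; filename*=UTF-8''" . rawurlencode($originalName);
}

var_dump(pathForId($files, '/srv/uploads', '../../../../../etc/passwd')); // rejected: NULL
echo contentDisposition($files[17]['original']), PHP_EOL;
```

In a real script the id would come from the request (for example `$_GET['id']`), and the header string would be passed to `header()` before the file is sent with `readfile()`.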