I'm new to PHP, and I'm trying to upload files to a file server and store the file information in a MySQL database. The upload and database parts are done, but I need to retrieve the information for a specific file in my file server folder when I click on that file, and I'm trying to work out the logic for that. Please help me if there is a solid solution for this. (Correct me if I'm wrong: my idea was to store the file path in the database along with the info. Will this give me a solution? The filename can be duplicated, though.)
Answer 1:
I figured I would write a short (for me this is short) "answer" just so I could summarize my points.
Some "best practices" when creating a file storage system. File storage is a broad category, so your mileage may vary for some of these. Take them just as suggestions of what I have found works well.
Filenames
Don't store the file under the name given to it by an end user. They can and will use all kinds of crappy characters that will make your life miserable. Some can be as bad as ' single quotes, which on Linux make the file awkward to read or even delete directly from the shell unless you quote or escape the name. Some things can seem simple, like a space, but depending on where you use it and the OS on your server you could wind up with
one%20two.txt
or one+two.txt
or one two.txt
which may or may not create all kinds of issues in your links.
The best thing to do is create a hash, something like sha1. The input can be as simple as {user_id}{original_name}. Including the user ID makes collisions with other users' filenames less likely.
I prefer doing hash_file('sha1', $path) on the uploaded file, so that if someone uploads the same file more than once you can catch it (if the contents are the same, the hash is the same). But if you expect to have large files you may want to do some benchmarking to see what kind of performance it has. I mostly handle small files, so it works fine for that.
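Here is a minimal sketch of the content-hash idea. The helper name contentHash and the temporary files are just for illustration. They stand in for two uploads that have different names but the same bytes.

```php
<?php
// Hypothetical helper: SHA-1 of a file's contents as 40 lowercase hex characters.
function contentHash(string $path): string
{
    $hash = hash_file('sha1', $path);
    if ($hash === false) {
        throw new RuntimeException("Could not hash {$path}");
    }
    return $hash;
}

// Two temporary files with identical bytes under different names.
$a = tempnam(sys_get_temp_dir(), 'up');
$b = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($a, "same bytes\n");
file_put_contents($b, "same bytes\n");

$hashA = contentHash($a);
$hashB = contentHash($b);
$isDuplicate = ($hashA === $hashB); // identical contents give an identical hash

unlink($a);
unlink($b);
```

Because the hash depends only on the contents, a lookup on the hash column is enough to spot a repeat upload before you store a second copy.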
Regardless of what you do, I would prefix it with a timestamp: time().'-'.$filename. This is useful information to have, because it's the absolute time the file was created.
-note- with the timestamp a duplicate file can still be saved, because the full name is different, but the duplicate is quite easy to see, and it can be verified in the database.
As for the name the user gave the file, just store that in the database record. This way you can show them the name they expect, but use a name you know is always safe for links.
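// A deliberately awkward user-supplied name
$filename = 'some crapy^ fileane.jpg';
// Everything from the last dot onward, e.g. ".jpg"
$ext = strrchr($filename, '.');
echo "\nExt: {$ext}\n";
// The name is hashed here to keep the demo short; use hash_file() to hash the contents
$hash = sha1($filename);
echo "Hash: {$hash}\n";
$time = time();
echo "Timestamp: {$time}\n";
$hashname = $time.'-'.$hash.$ext;
echo "Hashname: $hashname\n";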
Outputs
Ext: .jpg
Hash: bb9d2c2c7c73bb8248537a701870e35742b41c02
Timestamp: 1511853063
Hashname: 1511853063-bb9d2c2c7c73bb8248537a701870e35742b41c02.jpg
Paths
Never store the full path to the file. All you need in the database is the hash you generated when creating the hashed name. The "root" path to the folder the file is stored in should be defined in PHP. This has several benefits.
- Prevents directory traversal. Because you're not passing any part of the path around, you don't have to worry as much about someone slipping a ../.. (or ..\.. on Windows) in there and going places they shouldn't. A poor example of this would be someone overwriting a .htpasswd file by uploading a file with that name and a directory traversal sequence in it. https://en.wikipedia.org/wiki/Directory_traversal_attack
- More uniform-looking links: uniform size, uniform set of characters.
- Maintenance. Paths change. Servers change. Demands on your system change. If you need to relocate those files but you stored the absolute full path to them in the DB, you're stuck gluing everything together with symlinks or updating all your records.
There are some exceptions to this. If you want to store them in a monthly folder or by username, you could save that part of the path in a separate field. But even in that case, you could build it dynamically based on data saved in the record. I have found it's best to save as little path info as possible, and then make a config value or a constant you can use in all the places you need the path to the file.
Also, the path and the link are very different, so by saving only the name you can link to it from whatever PHP page you want without having to subtract data from the path. I've always found it easier to add to the filename than to subtract from a path.
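A rough sketch of keeping the path in one place. The constants UPLOAD_ROOT and DOWNLOAD_BASE_URL, their values, and the two helper names are assumptions for illustration, not part of any library.

```php
<?php
// Assumption: a single constant holds the storage root, ideally outside the webroot.
const UPLOAD_ROOT = '/var/www/storage/uploads';
// Assumption: downloads are routed through a /download/{hash} URL.
const DOWNLOAD_BASE_URL = 'http://www.example.com/download';

// Build the on-disk path from the stored hashname only.
// basename() drops directory components, so "../" sequences cannot climb out of UPLOAD_ROOT.
function storagePath(string $hashname): string
{
    return UPLOAD_ROOT . '/' . basename($hashname);
}

// Build the public link from the hash only; no part of the disk path leaks into it.
function downloadLink(string $hash): string
{
    return DOWNLOAD_BASE_URL . '/' . rawurlencode($hash);
}

$path = storagePath('1511853063-bb9d2c2c7c73bb8248537a701870e35742b41c02.jpg');
$link = downloadLink('bb9d2c2c7c73bb8248537a701870e35742b41c02');
```

If the files ever move, only UPLOAD_ROOT changes. The records and the links stay as they are.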
Database (just some suggestions, your use may vary)
As always with data, ask yourself: who, what, where, when.
- id - int - primary key, auto increment
- user_id - int - foreign key, who uploaded it
- hash - char(40) (sha1), unique - what: the hash
- hashname - varchar - {timestamp}-{hash}.{ext}, where: the file's name on the hard drive
- filename - varchar - the original name given by the user, so that we can show them the name they expect (if that is important)
- status - enum(public, private, deleted, pending, etc.) - status of the file. Depending on your use case, you may have to review the files, or maybe some are private so only the user can see them, maybe some are public, etc.
- status_date - timestamp|datetime - time the status was changed
- create_date - timestamp|datetime - when: the time the file was created. A timestamp is preferred as it makes some things easier, but in that case it should be the same timestamp used in the hashname.
- type - varchar - MIME type, can be useful for setting the MIME type when downloading, etc.
If you expect different users to upload the same file and you use hash_file, you can make the hash field part of a combined unique index on user_id and hash. This way it would only conflict if the same user uploaded the same file. You could also do it based on the timestamp and hash, depending on your needs.
That's the basic stuff I could think of. This isn't absolute, just some fields I thought would be useful.
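To make the combined unique index concrete, here is a sketch under a few assumptions. It uses an in-memory SQLite database through PDO so it can run without a server, where the question uses MySQL. The table name attachments, the column sizes and the helper insertAttachment are illustrative. In MySQL you would use ENUM for status and a foreign key on user_id.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch runs without a server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec("
    CREATE TABLE attachments (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id     INTEGER      NOT NULL,
        hash        CHAR(40)     NOT NULL,
        hashname    VARCHAR(64)  NOT NULL,
        filename    VARCHAR(255) NOT NULL,
        status      VARCHAR(16)  NOT NULL DEFAULT 'pending',
        status_date INTEGER      NOT NULL,
        create_date INTEGER      NOT NULL,
        type        VARCHAR(100) NOT NULL,
        UNIQUE (user_id, hash)
    )
");

// Hypothetical helper: returns false when this user already uploaded the same contents.
function insertAttachment(PDO $pdo, int $userId, string $hash, string $ext,
                          string $filename, string $type, int $time): bool
{
    $stmt = $pdo->prepare(
        "INSERT INTO attachments
            (user_id, hash, hashname, filename, status_date, create_date, type)
         VALUES
            (:user_id, :hash, :hashname, :filename, :status_date, :create_date, :type)"
    );
    try {
        $stmt->execute([
            ':user_id'     => $userId,
            ':hash'        => $hash,
            ':hashname'    => $time . '-' . $hash . $ext,
            ':filename'    => $filename,
            ':status_date' => $time,
            ':create_date' => $time,
            ':type'        => $type,
        ]);
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint (here the unique index) was violated.
        if (strpos((string) $e->getCode(), '23') === 0) {
            return false;
        }
        throw $e;
    }
}

$hash  = sha1('example contents');
$first = insertAttachment($pdo, 1, $hash, '.jpg', 'holiday.jpg', 'image/jpeg', 1511853063);
$again = insertAttachment($pdo, 1, $hash, '.jpg', 'copy.jpg', 'image/jpeg', 1511853999);
$other = insertAttachment($pdo, 2, $hash, '.jpg', 'holiday.jpg', 'image/jpeg', 1511854000);
```

The second insert is rejected because user 1 already has that hash. The third goes through because it belongs to a different user.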
It's useful to have the hash by itself. If you store it by itself you can keep it in a CHAR(40) column for sha1 (takes up slightly less space in the DB than VARCHAR) and set the collation to utf8_bin, which is binary. This makes searches on it case sensitive. PHP's sha1() and hash_file() return lowercase hexadecimal, so the binary collation simply guarantees exact, byte-for-byte comparisons.
You can always build the hashname on the fly if you store the extension and the timestamp separately. If you find yourself building it time and time again, you may just want to store it in the DB to simplify the work in PHP.
I like just putting the hash in the link, no extension, no anything, so my links look like this.
http://www.example.com/download/ad87109bfff0765f4dd8cf4943b04d16a4070fea
Real simple, real generic, safe in URLs, always the same size, etc.
The hashname for this "file" would be like this
1511848005-ad87109bfff0765f4dd8cf4943b04d16a4070fea.jpg
If you do have conflicts with the same file and different users (which I mentioned above), you can always add the timestamp part to the link, the user_id, or both. If you use the user_id, it might be useful to left pad it with zeros. For example, some users may have ID:1 and some may have ID:234, so you could left pad it to 4 places and make them 0001 and 0234. Then add that to the hash, which is almost unnoticeable:
1511848005-ad87109bfff0765f4dd8cf4943b04d16a4070fea0234.jpg
The important thing here is that because sha1 is always 40 characters and the padded ID is always 4, we can separate the two accurately and easily. And this way, you can still look it up uniquely. There are a lot of different options, but so much depends on your needs.
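A small sketch of gluing the padded ID onto the hash and pulling the two apart again. The helper names makeToken and splitToken are made up for this example. It assumes user IDs never exceed 9999, so widen USER_ID_WIDTH if yours can.

```php
<?php
// Assumption: user IDs never exceed 9999, so 4 digits is always enough.
const USER_ID_WIDTH = 4;

// Append the zero-padded user ID to the 40-character sha1 hash.
function makeToken(string $hash, int $userId): string
{
    return $hash . str_pad((string) $userId, USER_ID_WIDTH, '0', STR_PAD_LEFT);
}

// Split a token back into [hash, userId]; returns null if the shape is wrong.
function splitToken(string $token): ?array
{
    if (!preg_match('/\A([a-f0-9]{40})(\d{' . USER_ID_WIDTH . '})\z/', $token, $m)) {
        return null;
    }
    return [$m[1], (int) $m[2]];
}

$token = makeToken('ad87109bfff0765f4dd8cf4943b04d16a4070fea', 234);
[$hash, $userId] = splitToken($token);
```

The fixed widths are what make the split reliable. The same pattern also works as a cheap validity check on whatever arrives in the link.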
Access
Such as downloading. You should always output the file with PHP; don't give users direct access to the file. The best way is to store the files outside of the webroot (above the public_html or www folder). Then in PHP you can set the headers to the correct type and basically read out the file. This works for pretty much everything except video. I don't handle videos, so that's a topic outside of my experience. But I find it best to think of all file data as just bytes; it's the headers that make those bytes into an image, an Excel file, or a PDF.
The big advantage of not giving them direct access to the file is that if you have a membership site, or don't want your content accessible without a login, you can easily check in PHP whether they are logged in before giving them the content. And, as the file is outside the webroot, they can't access it any other way.
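Roughly, serving a file through PHP could look like the sketch below. The function names, the $isLoggedIn flag (which would come from your own session check) and the shape of $row (with the type and filename fields from the list above) are assumptions for illustration.

```php
<?php
// Hypothetical helper: build the response headers for a stored file record.
function downloadHeaders(array $row, int $size): array
{
    return [
        'Content-Type: ' . $row['type'],
        'Content-Length: ' . $size,
        // filename* carries the user's original name, percent-encoded as UTF-8
        "Content-Disposition: attachment; filename*=UTF-8''" . rawurlencode($row['filename']),
    ];
}

// Hypothetical download handler: check access first, then stream the file.
function sendDownload(array $row, string $path, bool $isLoggedIn): void
{
    if (!$isLoggedIn) {
        http_response_code(403); // not logged in, no content
        return;
    }
    if (!is_file($path)) {
        http_response_code(404); // record exists but the file is gone
        return;
    }
    foreach (downloadHeaders($row, filesize($path)) as $header) {
        header($header);
    }
    readfile($path); // writes the file's bytes to the response
}

$headers = downloadHeaders(['type' => 'image/jpeg', 'filename' => 'my photo.jpg'], 123);
```

The user sees the original filename in the save dialog, while the name on disk stays the safe hashname.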
The most important thing is to pick something consistent, that is still flexible enough to handle all your needs.
I'm sure I can come up with more, but if you have any suggestions feel free to comment.
BASIC PROCESS FLOW
- User submits the form ( enctype="multipart/form-data" ) https://www.w3schools.com/tags/att_form_enctype.asp
- Server receives the post from the form in the superglobals $_POST and $_FILES http://php.net/manual/en/reserved.variables.files.php
$_FILES = [
    'fieldname' => [
        'name'     => "MyFile.txt", // comes from the browser, so treat as tainted
        'type'     => "text/plain", // sent by the browser and not checked by PHP, so treat as tainted
        'tmp_name' => "/tmp/php/php1h4j1o", // could be anywhere on your system, depending on your config settings, but the user has no control, so this isn't tainted
        'error'    => 0, // UPLOAD_ERR_OK (= 0)
        'size'     => 123, // the size in bytes
    ]
];
Check for errors
if ($_FILES['fieldname']['error'] === UPLOAD_ERR_OK) { /* no error, carry on */ }
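If you want something friendlier than a bare check, a sketch like this maps the error codes to messages. The helper name uploadError and the message wording are my own. The UPLOAD_ERR_* constants are PHP's.

```php
<?php
// Hypothetical helper: returns null when the upload is fine, otherwise a message.
function uploadError(array $files, string $field): ?string
{
    // A missing entry or an array of errors means the request is not a single-file upload.
    if (!isset($files[$field]['error']) || is_array($files[$field]['error'])) {
        return 'No file was submitted, or several files were sent for a single-file field.';
    }
    switch ($files[$field]['error']) {
        case UPLOAD_ERR_OK:
            return null;
        case UPLOAD_ERR_INI_SIZE:
        case UPLOAD_ERR_FORM_SIZE:
            return 'The file is too large.';
        case UPLOAD_ERR_PARTIAL:
            return 'The file was only partially uploaded.';
        case UPLOAD_ERR_NO_FILE:
            return 'No file was uploaded.';
        default:
            return 'The upload failed on the server.';
    }
}

$ok      = uploadError(['fieldname' => ['error' => UPLOAD_ERR_OK]], 'fieldname');
$tooBig  = uploadError(['fieldname' => ['error' => UPLOAD_ERR_INI_SIZE]], 'fieldname');
$missing = uploadError([], 'fieldname');
```

In the real handler you would pass $_FILES as the first argument and stop processing whenever the result is not null.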
Sanitize display name
$filename = htmlentities($_FILES['fieldname']['name'], ENT_NOQUOTES, "UTF-8");
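A variation some people prefer, which is not what the step above does: keep the raw name in the database and escape it only when printing it into HTML. This is a sketch of that alternative, and the helper name escapeForHtml is made up.

```php
<?php
// Hypothetical helper: escape a stored display name for use in HTML text or attributes.
function escapeForHtml(string $displayName): string
{
    return htmlspecialchars($displayName, ENT_QUOTES, 'UTF-8');
}

$stored  = '<b>my "best" photo</b>.jpg'; // raw name as the user supplied it
$escaped = escapeForHtml($stored);
```

Either way, the point is the same: the user's name never reaches the page or the filesystem unescaped.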
Save file, create DB record ( PSEUDO-CODE )
Like this:
$path = __DIR__.'/uploads/'; // for example
$time = time();
$hash = hash_file('sha1', $_FILES['fieldname']['tmp_name']);
$type = $_FILES['fieldname']['type']; // browser-supplied, so treat as tainted
$hashname = $time.'-'.$hash.strrchr($_FILES['fieldname']['name'], '.');
$status = 'pending';
if (!move_uploaded_file($_FILES['fieldname']['tmp_name'], $path.$hashname)) {
    // failed
    // do something for errors.
    die();
}
// store record in db
http://php.net/manual/en/function.move-uploaded-file.php
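Since both the type and the extension come from the browser, one option is to detect the type from the file's bytes and pick the extension yourself. This sketch assumes the fileinfo extension is available. The allowlist ALLOWED_TYPES and the helper detectAllowedType are illustrative, so adjust the list to what you actually accept.

```php
<?php
// Assumption: only these MIME types are accepted; the map also decides the stored extension.
const ALLOWED_TYPES = [
    'image/jpeg' => '.jpg',
    'image/png'  => '.png',
    'text/plain' => '.txt',
];

// Detect the MIME type from the file's bytes instead of trusting the browser.
// Returns [type, extension], or null when the detected type is not on the allowlist.
function detectAllowedType(string $tmpPath): ?array
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $type  = $finfo->file($tmpPath);
    if ($type === false || !array_key_exists($type, ALLOWED_TYPES)) {
        return null;
    }
    return [$type, ALLOWED_TYPES[$type]];
}

// A temporary plain-text file stands in for $_FILES['fieldname']['tmp_name'].
$tmp = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($tmp, "just some plain text\n");
$detected = detectAllowedType($tmp);
unlink($tmp);
```

With this, the hashname can use the extension from the allowlist, so an upload named something.php never lands on disk with a .php extension.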
Create the link ( varies based on routing ). The simple way is to do your link like this
http://www.example.com/download?file={$hash}
but it's uglier than http://www.example.com/download/{$hash}
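Whichever link style you pick, it helps to accept only something that looks like a sha1 hash. A sketch, assuming the /download route shown above. The helper name hashFromUri is made up.

```php
<?php
// Hypothetical helper: pull a sha1 hash out of either link style, or return null.
function hashFromUri(string $uri): ?string
{
    $path = (string) parse_url($uri, PHP_URL_PATH);
    // Pretty style: /download/{hash}
    if (preg_match('~\A/download/([a-f0-9]{40})\z~', $path, $m)) {
        return $m[1];
    }
    // Query style: /download?file={hash}
    parse_str((string) parse_url($uri, PHP_URL_QUERY), $query);
    $file = $query['file'] ?? '';
    if (is_string($file) && preg_match('~\A[a-f0-9]{40}\z~', $file)) {
        return $file;
    }
    return null;
}

$pretty = hashFromUri('http://www.example.com/download/ad87109bfff0765f4dd8cf4943b04d16a4070fea');
$query  = hashFromUri('/download?file=ad87109bfff0765f4dd8cf4943b04d16a4070fea');
$bad    = hashFromUri('/download/../../etc/passwd');
```

Anything that is not exactly 40 lowercase hex characters comes back as null, so it never reaches the database query or the filesystem.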
The user clicks the link and goes to the download page.
Get the input and look up the record
$hash = $_GET['file'];
$stmt = $PDO->prepare("SELECT * FROM attachments WHERE hash = :hash LIMIT 1");
$stmt->execute([":hash" => $hash]);
$row = $stmt->fetch(PDO::FETCH_ASSOC);
print_r($row);
http://php.net/manual/en/intro.pdo.php
Etc....
Cheers!