I'm trying to determine whether some files are actually images (using PHP). I've been advised to use finfo, and I'm trying to understand how it works.
What I don't get is: what is a magic numbers database, and how does it work? I'm a bit puzzled: does each file have a certain "magic number" that you compare against that database?
Also, I have it on my Debian squeeze, but will it also be available on Windows? Or would one have to ship that database along with the app?
<?php
$finfo = new finfo(FILEINFO_MIME, "/usr/share/misc/magic.mgc");
if (!$finfo) {
    echo "Opening fileinfo database failed";
    exit();
}
/* get mime-type for a specific file */
$filename = "/usr/local/something.txt";
echo $finfo->file($filename);
?>
Would an alternative solution be to check whether exif_imagetype returns false?
Most file formats have a header that helps identify what kind of file it is. For example, GIF files always begin with GIF87a or GIF89a. The magic number database is a list of known headers, and it allows finfo() to identify files.
Windows doesn't have this database installed by default, so you would need to bring it along for Windows. In fact, you should use the same database no matter where you deploy, to improve cross-platform compatibility. Imagine deploying to an old system that doesn't know about file types your dev platform understands.
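To make the header idea concrete, here is a minimal sketch of what a magic number lookup does in principle: compare the first bytes of a file against a table of known prefixes. The function name sniffImageType and the three-entry signature table are illustrative assumptions, a tiny subset and not the real magic database, which holds far more rules.

```php
<?php
// Hypothetical helper: identify a few image formats from their leading bytes.
// The table below is a small illustrative subset, not the real magic database.
function sniffImageType(string $bytes): ?string
{
    $signatures = [
        'image/gif'  => ["GIF87a", "GIF89a"],
        'image/png'  => ["\x89PNG\r\n\x1a\n"],
        'image/jpeg' => ["\xFF\xD8\xFF"],
    ];

    foreach ($signatures as $mime => $prefixes) {
        foreach ($prefixes as $prefix) {
            // strncmp is binary safe, so it works on raw file bytes.
            if (strncmp($bytes, $prefix, strlen($prefix)) === 0) {
                return $mime;
            }
        }
    }

    return null; // no known signature matched
}
```

In practice you would pass in only the start of the file, for example the result of file_get_contents($path, false, null, 0, 8), and treat a null result as "not one of the formats in the table".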
On my Ubuntu, it's in /usr/share/file/magic.mime. I don't know about Windows. And yes, typically various file formats have a specific prefix just for this purpose (even if there is no extension, you can recognise a GIF, for instance, by the fact that it always starts with the string "GIF").
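Because the database path differs between systems, one option is to avoid hard-coding it. The sketch below assumes the fileinfo extension is enabled. According to the PHP manual, when no database path is passed and the MAGIC environment variable is unset, PHP falls back to its bundled magic database. The helper name looksLikeImage is hypothetical.

```php
<?php
// Hypothetical helper: true when fileinfo reports an image/* MIME type.
function looksLikeImage(string $bytes): bool
{
    // No database path given: PHP uses the MAGIC environment variable or,
    // if that is unset, its bundled magic database.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->buffer($bytes);

    // buffer() returns false on failure, so check the type before comparing.
    return is_string($mime) && strncmp($mime, 'image/', 6) === 0;
}
```

For a file on disk, $finfo->file($path) can be used in place of buffer(). If you prefer to ship your own database for consistent results across platforms, pass its path as the second constructor argument, as in the question's code.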