How to check if the file is having the same type a

Posted 2020-05-02 15:55

I want to write JavaScript code that checks a file's type. The web application I am building lets users upload document files, namely doc, xls, ppt, docx, xlsx, pptx, txt, rar, zip, pdf, jpg, png, gif, jpeg and odt, and it should reject everything else. I can't just check the extension in the file name, because the user may change it.

I tried checking the content type, but that also changes every time. Suggestions are appreciated.

2 answers
神经病院院长
#2 · 2020-05-02 16:15

Unless you can actually parse the content and tell from the result whether the file is of a certain type, I don't see a good way of doing that with pure JS. You might want to upload the file to the server temporarily and perform the check there. The Unix file command is a very useful tool for that: it does not rely on file extensions, but uses the file's content to determine its type.
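
If the server happens to run Node.js, the check with the file command could look roughly like the sketch below. It assumes a Unix-like host with the file utility on the PATH; the allowlist and the function names (ALLOWED_MIME_TYPES, detectMimeType, isAllowedMimeType, isAllowedUpload) are made up for illustration, and the exact MIME strings that file reports can vary between versions, so check them against your own installation.

```javascript
// Sketch only: assumes Node.js on a Unix-like server with `file` installed.
const { execFileSync } = require("child_process");

// Hypothetical allowlist; extend it to cover every type you accept.
const ALLOWED_MIME_TYPES = new Set([
  "application/pdf",
  "application/zip",
  "image/png",
  "image/jpeg",
  "image/gif",
  "text/plain",
]);

// Ask `file` for the MIME type it infers from the content, not the name.
// --brief omits the file name from the output; --mime-type prints
// only the type, e.g. "image/png".
function detectMimeType(path) {
  return execFileSync("file", ["--brief", "--mime-type", path], {
    encoding: "utf8",
  }).trim();
}

// Pure check against the allowlist, kept separate so it is easy to test.
function isAllowedMimeType(mimeType) {
  return ALLOWED_MIME_TYPES.has(mimeType);
}

// Combines both steps for a file already saved to a temporary path.
function isAllowedUpload(path) {
  return isAllowedMimeType(detectMimeType(path));
}
```

Passing the path through execFileSync's argument array, rather than building a shell string, keeps an odd file name from being interpreted by a shell.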

走好不送
#3 · 2020-05-02 16:28

In "modern" browsers (IE10+, Firefox 4+, Chrome 7+, Safari 6.0.2+ etc.), you could use the File/FileReader API to read the contents of the file and parse it client-side. E.g. (example, not production code):

// Assumption: the page contains a single <input type="file"> element.
var fileInput = document.querySelector('input[type="file"]');

fileInput.addEventListener("change", function(e) {
    var file = e.currentTarget.files[0];
    if (!file) return; // No file selected (e.g. the dialog was cancelled).
    var reader = new FileReader();
    reader.onload = fileLoaded;
    reader.readAsArrayBuffer(file);
});

function fileLoaded(e)
{
   var arrayBuffer = e.currentTarget.result;

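   // We only want a look at the first 32 bytes of the buffer. The length is
   // clamped because asking for 32 bytes of a shorter buffer throws a RangeError.
   // If we don't specify a length, we get the entire buffer.
   var bytes = new Uint8Array(arrayBuffer, 0, Math.min(32, arrayBuffer.byteLength));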

   // Now we can check the content, comparing to whatever file signatures we
   // have, e.g.:

   if (bytes[0] === 0x50 &&
       bytes[1] === 0x4b &&
       bytes[2] === 0x03 &&
       bytes[3] === 0x04)
   {
      // This is most likely docx, xlsx, pptx or other zip file.
   }
}

http://jsfiddle.net/35XfG/

Note, however, that a .zip file, for example, doesn't have to start with 50 4b 03 04: an empty archive starts with 50 4b 05 06 and a spanned one with 50 4b 07 08. So, unless you spend quite a bit of time looking into different file signatures (or find a library that has already done this), you're likely to reject files that are actually valid. Of course, false positives are also possible.

False positives don't matter that much in this case, though, because this is only useful as a user-friendly measure to check that the user isn't uploading files that the server will reject anyway. The server should always validate whatever it ends up receiving.
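
To make the point about signatures a bit more concrete, here is a rough sketch of a table-driven check. The table is illustrative and far from complete, and the names SIGNATURES and detectType are invented for this example. It covers well-known magic numbers for some of the types in the question; plain .txt files have no signature at all, so they cannot be recognised this way.

```javascript
// Illustrative signature table; the selection is an assumption, not a complete list.
var SIGNATURES = [
  { type: "pdf",  bytes: [0x25, 0x50, 0x44, 0x46] },                         // "%PDF"
  { type: "png",  bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: "jpeg", bytes: [0xff, 0xd8, 0xff] },
  { type: "gif",  bytes: [0x47, 0x49, 0x46, 0x38] },                         // "GIF8"
  { type: "zip",  bytes: [0x50, 0x4b, 0x03, 0x04] },                         // also docx, xlsx, pptx, odt
  { type: "zip",  bytes: [0x50, 0x4b, 0x05, 0x06] },                         // empty archive
  { type: "zip",  bytes: [0x50, 0x4b, 0x07, 0x08] },                         // spanned archive
  { type: "rar",  bytes: [0x52, 0x61, 0x72, 0x21, 0x1a, 0x07] },             // "Rar!"
  { type: "ole2", bytes: [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1] }  // legacy doc, xls, ppt
];

// Returns the type of the first signature that the bytes start with,
// or null if nothing in the table matches.
function detectType(bytes) {
  for (var i = 0; i < SIGNATURES.length; i++) {
    var sig = SIGNATURES[i].bytes;
    if (bytes.length < sig.length) continue; // Too short to match this entry.
    var match = true;
    for (var j = 0; j < sig.length; j++) {
      if (bytes[j] !== sig[j]) {
        match = false;
        break;
      }
    }
    if (match) return SIGNATURES[i].type;
  }
  return null;
}
```

Inside fileLoaded you could then call detectType(bytes) and warn the user when it returns null. Note that a match only narrows things down to a container format: telling a docx from an xlsx or a plain zip would require looking inside the archive.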

Of course, reading the entire file to look at the first few bytes isn't all that efficient either. :-) See Ray Nicholus' comment about that.
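
One way to avoid reading the whole file is to slice off the header first. The sketch below is only an illustration: readHeader is a made-up name, and it assumes Blob.prototype.arrayBuffer is available, which needs newer browsers than the ones listed at the top of this answer. With plain FileReader, the same idea is to pass file.slice(0, 32) to readAsArrayBuffer instead of the whole file.

```javascript
// Sketch: read only the first `length` bytes of a File or Blob.
// Assumes Blob.prototype.slice and Blob.prototype.arrayBuffer exist.
function readHeader(file, length) {
  // slice() returns a new Blob covering just that byte range. The end
  // offset is clamped to the blob's size, so short files are fine.
  return file.slice(0, length).arrayBuffer().then(function (buffer) {
    return new Uint8Array(buffer);
  });
}
```

In the change handler this would be used as readHeader(file, 32).then(function (bytes) { /* compare against signatures */ });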
