How to detect if a JPG/JPEG image file is corrupt

Posted 2020-02-12 09:03

I have to show some images from a third-party image server on my website, but some of them display only partially, like the image below: [image]

The image includes width and height information, but only the very top part of it is shown. If I open the image in Chrome v61, it looks like the image below: [image]

Chrome v61 uses this color to represent transparency in PNG files, but what does it mean in a JPG/JPEG file?

Does anyone know how to detect this kind of corrupted (incomplete) image? I'm trying to avoid showing such images on my website.

2 Answers
时光不老,我们不散
#2 · 2020-02-12 09:49

If you need a "programmatic" approach rather than the command line approach suggested by @MarkSetchell, you could create a very quick test for this in pretty much any programming language. Note that this will only find the kind of truncating corruption you mention in your question. Mark's method may be more reliable for finding corruption in general.

As we know, any JPEG file or stream is written according to the JPEG Interchange Format. That means it must start with an SOI (Start-of-Image) marker (the two bytes 0xFF, 0xD8) and end with an EOI (End-of-Image) marker (the two bytes 0xFF, 0xD9). These byte pairs cannot occur inside the entropy-coded image data (a 0xFF byte there is always followed by 0x00 or a restart marker), though metadata segments such as an embedded Exif thumbnail may contain their own SOI/EOI pair.

If you first identify the file as a JPEG by inspecting the first two bytes and matching against the SOI marker, you could skip to the end and search backwards for the EOI marker. Most likely, this will either be the last two bytes or you will not find them at all. But it may be safer to do a search (perhaps for a limited length), as I think it may be allowed to place application-specific data in a JPEG file after the EOI (someone, please correct me if I'm wrong).
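
The check described above could be sketched in Python roughly as follows. This is only a minimal illustration, not a full validator: the function name `looks_truncated` and the 4096-byte search window `tail_window` are assumptions made for this example, not values taken from the JPEG specification.

```python
import os

SOI = b"\xff\xd8"  # Start-of-Image marker
EOI = b"\xff\xd9"  # End-of-Image marker


def looks_truncated(path, tail_window=4096):
    """Return True if `path` starts like a JPEG but has no EOI marker near its end.

    Raises ValueError if the file does not start with the SOI marker.
    `tail_window` is an arbitrary, assumed search limit (hypothetical default).
    """
    with open(path, "rb") as f:
        if f.read(2) != SOI:
            raise ValueError(f"{path} does not start with the JPEG SOI marker")
        size = f.seek(0, os.SEEK_END)
        # Read only the last `tail_window` bytes (or the whole file if it is smaller).
        f.seek(max(0, size - tail_window))
        tail = f.read()
    # rfind searches from the end, so extra bytes after the EOI are tolerated.
    return tail.rfind(EOI) == -1
```

Calling `looks_truncated("corrupt.jpg")` on a file cut off mid-stream should return `True` in the typical case. It is a heuristic: a very small truncated file whose remaining bytes still include an embedded thumbnail's EOI could slip through, so a decoder-based check is more reliable.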

Luminary・发光体
#3 · 2020-02-12 09:53

I created a JPEG to test this using ImageMagick as follows:

convert -size 1024x768 gradient: image.jpg

and it was 14kB. Your image looks like it is incomplete, so I chopped off everything after 3kB like this:

dd if=image.jpg bs=3000 count=1 > corrupt.jpg

Now, if I run ImageMagick's identify command and discard stdout, just retaining stderr, I get:

identify -verbose corrupt.jpg > /dev/null

Sample Output

identify: Premature end of JPEG file `corrupt.jpg' @ warning/jpeg.c/JPEGWarningHandler/364.
identify: Corrupt JPEG data: premature end of data segment `corrupt.jpg' @ warning/jpeg.c/JPEGWarningHandler/364.

Alternatively, you could discard stderr too and simply look at the exit code (0=success, anything else=error):

identify -regard-warnings -verbose corrupt.jpg > /dev/null 2>&1
echo $?
1

whereas for a complete image:

identify -regard-warnings -verbose image.jpg > /dev/null 2>&1
echo $?
0

ImageMagick is installed on most Linux distros and is available for macOS/OSX and Windows.
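
If you want to run the exit-code check from a script rather than by hand, a small Python wrapper might look like the sketch below. The function name `passes_identify` and the `command` parameter are hypothetical names chosen for this example. It assumes the `identify` binary is on your PATH; on ImageMagick 7 installations you may need to pass `("magick", "identify", "-regard-warnings", "-verbose")` instead.

```python
import subprocess

# Command taken from the answer above; the file path is appended as the last argument.
IDENTIFY_CMD = ("identify", "-regard-warnings", "-verbose")


def passes_identify(path, command=IDENTIFY_CMD):
    """Return True if `command` exits with status 0 for `path`, False otherwise.

    stdout and stderr are discarded, mirroring `> /dev/null 2>&1`.
    Raises FileNotFoundError if the command itself is not installed.
    """
    result = subprocess.run(
        [*command, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a non-zero exit code is an expected outcome, not an exception
    )
    return result.returncode == 0
```

You could then filter a list of downloaded files with something like `[p for p in paths if passes_identify(p)]` before serving them.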
