Is there a quality, file-size, or other benefit to

2019-01-25 11:33发布

问题:

The JPEG compression encoding process splits a given image into blocks of 8x8 pixels, working with these blocks in future lossy and lossless compressions. [source]

It is also mentioned that if the image is a multiple 1MCU block (defined as a Minimum Coded Unit, 'usually 16 pixels in both directions') that lossless alterations to a JPEG can be performed. [source]

I am working with product images and would like to know both if, and how much benefit can be derived from using multiples of 16 in my final image size (say, using an image with size 480px by 360px) vs. a non-multiple of 16 (such as 484x362). In this example I am not interested in further alterations, editing, or recompression of the final image.

To try to get closer to a specific answer where I know there must be largely generalities: Given a 480x360 image that is 64k and saved at maximum quality in Photoshop [example]:

  • Can I expect any quality loss from an image that is 484x362
  • What amount of file size addition can I expect (for this example, the additional space would be white pixels)
  • Are there any other disadvantages to growing larger than the 8px grid?

I know it's arbitrary to use that specific example, but it would still be helpful (for me and potentially any others pondering an image size) to understand what level of compromise I'd be dealing with in breaking the non-8px grid.

The key issue here is a debate I've had is whether 8-pixel divisible images are higher quality than images that are not divisible by 8-pixels.

回答1:

8 pixels is the cutoff. The reason is because JPEG images are simply an array of 8x8 DCT blocks; if the image resolution isn't mod8 in both directions, the encoder has to pad the sides up to the next mod8 resolution. This in practice is not very expensive bit-wise; what's much worse are the cases when an image has sharp black lines (such as a letterboxed image) that don't lie on block boundaries. This is especially problematic in video encoding. The reason for this being a problem is that the frequency transform of a sharp line is a Gaussian distribution of coefficients--resulting in an enormous number of bits to code.

For those curious, the most common method of padding edges in intra compression (such as JPEG images) is to mirror the lines of pixels before the edge. For example, if you need to pad three lines and line X is the edge, line X+1 is equal to line X, line X+2 is equal to line X-1, and line X+3 is equal to line X-2. This quite effectively minimizes the cost in transform coefficients of the extra lines.

In inter coding, however, the padding algorithms generally simply duplicate the last line, because the mirror method does not work well for inter compression, such as in video compression.



回答2:

Sometimes you need to use 16 pixel boundaries rather than 8 because of subsampling; every 2nd pixel is thrown away during the encoding process, and those 8x8 DCT blocks started out as 16x16 and will decode back to 16x16. This won't be a problem at the highest quality settings.



回答3:

The image dimensions being multiples of 8 or 16 is not going to affect the size on disk very much, but you can get dramatic savings if you can line up the visual contents to the 8x8 pixel grid, such as if there is a repeating pattern or texture in the image.



回答4:

A JPG with sizes being multiplies of 8 can also be rotated/flipped with no quality loss. For example gthumb can do this on Linux.



回答5:

What Tometzky said. If you don't have the correct multiple, the lossless flip and rotate algorithms don't work. That's because the padding on the right/bottom that can be safely ignored now ends up on the left/top, where it can't.