Ghostscript grayscale conversion still contains co

Posted 2020-02-26 12:45

Question:

I need to convert a PDF to grayscale if it contains colors. For this purpose I found a script that determines whether the PDF is already grayscale:

convert "source.pdf" -colorspace RGB -unique-colors txt:- 2> /dev/null \
   | egrep -m 2 -v "#([0-9|A-F][0-9|A-F])\1{3}" \
   | wc -l

This counts the colors in the document whose R, G and B values differ, i.e. colors that are not gray (because of -m 2, the count stops at two).

If the PDF is not already a grayscale document, I proceed with the conversion using Ghostscript:

gs \
  -sOutputFile=temp.pdf \
  -sDEVICE=pdfwrite \
  -sColorConversionStrategy=Gray \
  -dProcessColorModel=/DeviceGray \
  -dCompatibilityLevel=1.4 \
  -dNOPAUSE \
  -dBATCH \
   source.pdf < /dev/null

If I open the output document in a PDF viewer, it is displayed correctly without colors. But if I run the first script on the newly generated document, it reports that the document still contains some colors. How can I convert a document to true grayscale? I need this because when I print the document on a color printer, the printer uses color inks rather than black to print gray.

Answer 1:

I value ImageMagick in general very much -- but don't trust convert to count the colors correctly with the command you're using...

May I suggest a different method to discover if a PDF page uses color? It is based on a (relatively new) Ghostscript device called inkcov (you need Ghostscript v9.05 or newer). It displays the ink coverage of CMYK for each single page (for RGB colors, it does a silent conversion to CMYK internally).

First, generate an example PDF with the help of Ghostscript:

gs \
  -o test.pdf \
  -sDEVICE=pdfwrite \
  -g5950x2105 \
  -c "/F1 {100 100 moveto /Helvetica findfont 42 scalefont setfont} def" \
  -c "F1                         (100% 'pure' black)   show showpage" \
  -c "F1 .5 .5 .5   setrgbcolor  (50% 'rich' rgbgray)  show showpage" \
  -c "F1 .5 .5 .5 0 setcmykcolor (50% 'rich' cmykgray) show showpage" \
  -c "F1 .5         setgray      (50% 'pure' gray)     show showpage"

Although none of the pages appears to the human eye to use any color, pages 2 and 3 do in fact mix their apparent gray values from color.

Now check each page's ink coverage:

gs  -o - -sDEVICE=inkcov test.pdf 
 [...]
 Page 1
 0.00000  0.00000  0.00000  0.02230 CMYK OK
 Page 2
 0.02360  0.02360  0.02360  0.00000 CMYK OK
 Page 3
 0.02525  0.02525  0.02525  0.00000 CMYK OK
 Page 4
 0.00000  0.00000  0.00000  0.01982 CMYK OK

(A value of 1.00000 maps to 100% ink coverage for the respective color channel. So 0.02230 in the first line of the result means 2.23% of the page area is covered by black ink.) Hence the result given by Ghostscript's inkcov is exactly the expected one:

  • pages 1 + 4 don't use any of C (cyan), M (magenta), Y (yellow) colors, but only K (black).
  • pages 2 + 3 do use ink of C (cyan), M (magenta), Y (yellow) colors, but no K (black) at all.
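
If you want to automate this per-page check, a small filter over the inkcov output is enough. The following is only a minimal sketch: the function name color_pages is made up for illustration, and it assumes every result line has the "C M Y K CMYK OK" layout shown above, so pages are counted by those lines rather than by the "Page N" headers (which disappear with -q).

```shell
# Hypothetical helper: read inkcov output on stdin and print the number of
# every page whose C, M or Y coverage is non-zero.
# Assumption: each page produces exactly one line of the form
#   " C  M  Y  K CMYK OK", so field 5 is the literal string "CMYK".
color_pages() {
  awk '$5 == "CMYK" { page++; if ($1 + $2 + $3 > 0) print page }'
}

# Usage (needs Ghostscript 9.05 or newer and an existing test.pdf):
#   gs -q -o - -sDEVICE=inkcov test.pdf | color_pages
```

An empty result means that no page uses C, M or Y ink, so the file can be treated as gray-only.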

Now let's convert all pages of the original PDF to use the DeviceGray colorspace:

gs \
 -o temp.pdf \
 -sDEVICE=pdfwrite \
 -sColorConversionStrategy=Gray \
 -sProcessColorModel=DeviceGray \
  test.pdf

...and check for the ink coverage again:

gs -q  -o - -sDEVICE=inkcov temp.pdf
 0.00000  0.00000  0.00000  0.02230 CMYK OK
 0.00000  0.00000  0.00000  0.02360 CMYK OK
 0.00000  0.00000  0.00000  0.02525 CMYK OK
 0.00000  0.00000  0.00000  0.01982 CMYK OK

Again, exactly the expected result for a successful color conversion! (BTW, your convert command returns 2 for me for both files, the [original] test.pdf as well as the [gray-converted] temp.pdf -- so this command cannot be right...)
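
To get the "convert only if the PDF contains colors" workflow from the question, the inkcov check and the conversion could be combined roughly as follows. This is a sketch under assumptions, not a tested production script: has_color and gray_if_needed are hypothetical names, and copying the file unchanged when it is already gray is just one possible choice.

```shell
# Succeeds (exit status 0) when the inkcov output on stdin shows any
# C, M or Y ink on at least one page; fails (status 1) otherwise.
has_color() {
  awk '$5 == "CMYK" && ($1 + $2 + $3 > 0) { found = 1 } END { exit !found }'
}

# Hypothetical wrapper: write a gray version of $1 to $2, running the
# pdfwrite conversion only when inkcov reports color in the source.
gray_if_needed() {
  src=$1
  dst=$2
  if gs -q -o - -sDEVICE=inkcov "$src" | has_color; then
    gs -q -o "$dst" -sDEVICE=pdfwrite \
       -sColorConversionStrategy=Gray \
       -sProcessColorModel=DeviceGray \
       "$src"
  else
    # Assumption: an already-gray file is simply copied as-is.
    cp "$src" "$dst"
  fi
}

# Usage: gray_if_needed source.pdf temp.pdf
```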



Answer 2:

Maybe your document contains transparent figures. Try passing the option

-dHaveTransparency=false

to your Ghostscript conversion command. The full list of options for the pdfwrite device can be found at http://ghostscript.com/doc/current/Ps2pdf.htm#Options