I'd like to convert arbitrary PDF files to PDF/A with Ghostscript 9.15.
Is Ghostscript able to create PDF/A-3b conformant PDFs? There is no parameter which represents a PDF/A conformance level, so I assume there is no possibility. Or is there anything I have overlooked?
I was following a blog post where a Windows batch file is used to convert from PDF to PDF/A (see http://www.mcbsys.com/techblog/2013/04/batch-convert-pdf-to-pdfa/). The
gs
invokation in the batch is:"%gs_path%\gswin64c" ^ -dPDFA ^ -dNOOUTERSAVE ^ -sProcessColorModel=DeviceRGB ^ -sDEVICE=pdfwrite ^ -o "GS_%file1%" ^ -dPDFACompatibilityPolicy=1 ^ "%currentdir%\PDFA_def.ps" ^ %inputfilelist%
The PDFA_def.ps
is an adjusted version of the official one:
%!
% This prefix file for creating a PDF/A document is derived from
% the sample included with Ghostscript 9.07, released under the
% GNU Affero General Public License.
% Modified 4/15/2013 by MCB Systems.
% Feel free to modify entries marked with "Customize".
% This assumes an ICC profile to reside in the file (AdobeRGB1998.icc),
% unless the user modifies the corresponding line below.
% The color space described by the ICC profile must correspond to the
% ProcessColorModel specified when using this prefix file (GRAY with
% DeviceGray, RGB with DeviceRGB, and CMYK with DeviceCMYK).
% Define entries in the document Info dictionary :
/ICCProfile (... PATH TO ... AdobeRGB1998.icc) % Customize.
def
[ /Title (Title) % Customize.
/DOCINFO pdfmark
% Define an ICC profile :
[/_objdef {icc_PDFA} /type /stream /OBJ pdfmark
[{icc_PDFA} <</N systemdict /ProcessColorModel get /DeviceGray eq {1} {systemdict /ProcessColorModel get /DeviceRGB eq {3} {4} ifelse} ifelse >> /PUT pdfmark
[{icc_PDFA} ICCProfile (r) file /PUT pdfmark
% Define the output intent dictionary :
[/_objdef {OutputIntent_PDFA} /type /dict /OBJ pdfmark
[{OutputIntent_PDFA} <<
/Type /OutputIntent % Must be so (the standard requires).
/S /GTS_PDFA1 % Must be so (the standard requires).
/DestOutputProfile {icc_PDFA} % Must be so (see above).
/OutputConditionIdentifier (AdobeRGB1998) % Customize
>> /PUT pdfmark
[{Catalog} <</OutputIntents [ {OutputIntent_PDFA} ]>> /PUT pdfmark
So, I use AdobeRGB1998.icc which is obviously useable for PDF files with RGB color space. Depending on the -sProcessColorModel
value (DEVICERGB) a correct value is printed out.
The conversion works for all files. But when I validate the created PDF file against PDF/A-1b, I get different results depending whether the input file has RGB color space or not (e.g. CMYK). So, when I have an input PDF file which uses CMYK color space, the file gets converted by the script, but the validator says something like this:
input.pdf", 1, 38, 0x03418614, "A device-specific color space (DeviceCMYK) without an appropriate output intent is used.", 1
"output.pdf", 20, 0, 0x83410612, "The document does not conform to the requested standard.", 1
My question: Is there a way to get the conversion done for arbitrary files (i.e. independent of the used color space in the input file)?
Update
@KenS Thanks for your answer. I've updated my initial post to clarify what I want to achieve.
To make it more explicit, I will use an example. There are two files: input1.pdf
(seems to use RGB) and input2.pdf
(seems to use CMYK). I want to convert both of them to PDF/A-1. Thanks to your hint, I've let go of the above mentioned batch script and instead tested the command directly in the command line. After reading Ps2pdf.htm#PDFA, I have adjusted the (official) PDFA_def.ps so that AdobeRGB1998.icc is used. Then I invoked the following command on both input files (replaced output1.pdf by output2.pdf and input1.pdf by input2.pdf for the second file):
gswin64c.exe -dPDFA=1 -dBATCH -dNOPAUSE -dNOOUTERSAVE \
-sColorConversionStrategy=/RGB \
-sOutputICCProfile=AdobeRGB1998.icc -sDEVICE=pdfwrite \
-sOutputFile=output1.pdf -dPDFACompatibilityPolicy=1 \
"PATH/TO/OFFICIAL/PDFA_def.ps" input1.pdf
The conversion was done without any errors. The output1.pdf seems to be valid, but the output2.pdf is still invalid (tested with 3heights Validator):
"output2.pdf", 1, 40, 0x03418614, "A device-specific color space (DeviceCMYK) without an appropriate output intent is used.", 1
"output2.pdf", 20, 0, 0x83410612, "The document does not conform to the requested standard.", 1
So when I understand your answer correctly, the above command should produce a pdf file which uses the RGB color space - independent of the color space of the input file. If the input file uses CMYK, than the colors have to be translated into RGB with the above command.
When I interpret the first error message correctly, the used color space in the output2.pdf is still CMYK (although the command parameters like ColorConversionStrategy=/RGB). Since I used AdobeRGB1998.icc, the validation error appears.
What am I missing in the above command?
Going back to my original question (which is one step further): Instead of always converting to RGB (or CMYK), I wanted to somehow detect which color space is used in the input file and then dynamically switch to a RGB or CMYK icc file. Is it possible to achieve that?