How do you combine PDFs in ruby?

2019-02-01 23:18发布

问题:

This was asked in 2008. Hopefully there's a better answer now.

How can you combine PDFs in ruby?

I'm using the pdf-stamper gem to fill out a form in a PDF. I'd like to take n PDFs, fill out a form in each of them, and save the result as an n-page document.

Can you do this with a native library like prawn? Can you do this with rjb and iText? pdf-stamper is a wrapper on iText.

I'd like to avoid using two libraries (i.e. pdftk and iText), if possible.

回答1:

As of 2013 you can use Prawn to merge pdfs. Gist: https://gist.github.com/4512859

class PdfMerger

  def merge(pdf_paths, destination)

    first_pdf_path = pdf_paths.delete_at(0)

    Prawn::Document.generate(destination, :template => first_pdf_path) do |pdf|

      pdf_paths.each do |pdf_path|
        pdf.go_to_page(pdf.page_count)

        template_page_count = count_pdf_pages(pdf_path)
        (1..template_page_count).each do |template_page_number|
          pdf.start_new_page(:template => pdf_path, :template_page => template_page_number)
        end
      end

    end

  end

  private

  def count_pdf_pages(pdf_file_path)
    pdf = Prawn::Document.new(:template => pdf_file_path)
    pdf.page_count
  end

end


回答2:

After a long search for a pure Ruby solution, I ended up writing code from scratch to parse and combine/merge PDF files.

(I feel it is such a mess with the current tools - I wanted something native but they all seem to have different issues and dependencies... even Prawn dropped the template support they use to have)

I posted the gem online and you can find it at GitHub as well.

you can install it with:

gem install combine_pdf

It's very easy to use (with or without saving the PDF data to a file).

For example, here is a "one-liner":

(CombinePDF.load("file1.pdf") << CombinePDF.load("file2.pdf") << CombinePDF.load("file3.pdf")).save("out.pdf")

If you find any issues, please let me know and I will work on a fix.



回答3:

Use ghostscript to combine PDFs:

 options = "-q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite"
 system "gs #{options} -sOutputFile=result.pdf file1.pdf file2.pdf"


回答4:

I wrote a ruby gem to do this — PDF::Merger. It uses iText. Here's how you use it:

pdf = PDF::Merger.new
pdf.add_file "foo.pdf"
pdf.add_file "bar.pdf"
pdf.save_as "combined.pdf"


回答5:

Haven't seen great options in Ruby- I got best results shelling out to pdftk:

system "pdftk #{file_1} multistamp #{file_2} output #{file_combined}"


回答6:

We're closer than we were in 2008, but not quite there yet.

The latest dev version of Prawn lets you use an existing PDF as a template, but not use a template over and over as you add more pages.



回答7:

Via iText, this will work... though you should flatten the forms before you merge them to avoid field name conflicts. That or rename the fields one page at a time.

Within PDF, fields with the same name share a value. This is usually not the desired behavior, though it comes in handy from time to time.

Something along the lines of (in java):

PdfCopy mergedPDF = new PdfCopy( new Document(), new FileOutputStream( outPath );

for (String path : paths ) {
  PdfReader reader = new PdfReader( path );
  ByteArrayOutputStream curFormOut = new ByteArrayOutputStream();
  PdfStamper stamper = new PdfStamper( reader, curFormOut );

  stamper.setField( name, value ); // ad nauseum

  stamper.setFlattening(true); // flattening setting only takes effect during close()
  stamper.close();

  byte curFormBytes = curFormOut.toByteArray();
  PdfReader combineMe = new PdfReader( curFormBytes );

  int pages = combineMe .getNumberOfPages();
  for (int i = 1; i <= pages; ++i) { // "1" is the first page
    mergedForms.addPage( mergedForms.getImportedPage( combineMe, i );
  }
}

mergedForms.close();


回答8:

If you want to add any template (created by macOS Pages or Google Docs) using the combine_pdf gem then you can try with this:

final_pdf = CombinePDF.new
company_template = CombinePDF.load(template_file.pdf).pages[0]
pdf = CombinePDF.load (content_file.pdf)
pdf.pages.each {|page| final_pdf << (company_template << page)} 
final_pdf.save "final_document.pdf"