How to handle a file_as_string (generated by Prawn

Posted 2019-04-20 11:54

Question:

I'm using Prawn to generate a PDF from the controller of a Rails app:

...
respond_to do |format|
  format.pdf do
    pdf = GenerateReportPdf.new(@object, view_context)
    send_data pdf.render, filename: "Report", type: "application/pdf", disposition: "inline"
  end
end

This works fine, but I now want to move GenerateReportPdf into a background task, and pass the resulting object to Carrierwave to upload directly to S3.

The worker looks like this

def perform
  pdf           = GenerateReportPdf.new(@object)
  fileString    = ???????
  document      = Document.new(
    object_id: @object.id,
    file: fileString )
    # file is field used by Carrierwave 
end

How do I handle the object returned by Prawn (the ??????? above) so that it is in a format Carrierwave can read?

fileString = pdf.render_file 'filename' writes the object to the root directory of the app. As I'm on Heroku this is not possible.

file = pdf.render returns ArgumentError: string contains null byte

fileString = StringIO.new( pdf.render_file 'filename' ) returns TypeError: no implicit conversion of nil into String

fileString = StringIO.new( pdf.render ) returns ActiveRecord::RecordInvalid: Validation failed: File You are not allowed to upload nil files, allowed types: jpg, jpeg, gif, png, pdf, doc, docx, xls, xlsx

fileString = File.open( pdf.render ) returns ArgumentError: string contains null byte

....and so on.

What am I missing? StringIO.new( pdf.render ) seems like it should work, but I'm unclear why it's generating this error.
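The two null-byte errors look like they share one cause, sketched below in plain Ruby: a String handed to File.open is interpreted as a path, not as file content, and a path cannot contain NUL bytes. Carrierwave appears to treat a bare String the same way. The fake_pdf bytes are a made-up stand-in for pdf.render; nothing here calls Prawn or Carrierwave.

```ruby
require "stringio"

# Stand-in for the binary string returned by pdf.render (assumption: real
# PDF output contains NUL bytes, as the errors in the question suggest).
fake_pdf = "%PDF-1.4\n\x00\x01binary data".b

# A String passed to File.open is treated as a path, and paths cannot
# contain NUL bytes, so Ruby raises instead of opening anything.
path_error = begin
  File.open(fake_pdf)
  nil
rescue ArgumentError => e
  e
end

# Wrapping the same bytes in StringIO gives an IO-like object instead.
io = StringIO.new(fake_pdf)
io_bytes = io.read
```

So the content has to reach the uploader as an IO-like object (StringIO, Tempfile, File), never as the raw string.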

Answer 1:

It turns out StringIO.new( pdf.render ) should indeed work.

The problem I was having was that the filename was being set incorrectly. Despite following the advice on Carrierwave's wiki (linked below), a bug elsewhere in the code meant that the filename was returned as an empty string. I'd overlooked this and assumed that something else was needed.

https://github.com/carrierwaveuploader/carrierwave/wiki/How-to:-Upload-from-a-string-in-Rails-3

My code ended up looking like this:

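def perform
  pdf = GenerateReportPdf.new(@object)
  s = StringIO.new(pdf.render)

  # Carrierwave reads the filename from original_filename, so it must not be empty
  def s.original_filename; "my file name"; end

  document  = Document.new(
    object_id: @object.id
  )

  document.file = s

  document.save!
end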
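As a rough illustration of why the filename matters: a plain StringIO has no original_filename method, so an uploader that derives the extension from it has nothing to compare against its allowed types. The sketch below uses only the standard library; with_filename is a hypothetical helper, and the byte string stands in for pdf.render.

```ruby
require "stringio"

# Hypothetical helper: wraps bytes in a StringIO and attaches the
# original_filename method that the Carrierwave wiki recipe relies on.
def with_filename(bytes, filename)
  io = StringIO.new(bytes)
  io.define_singleton_method(:original_filename) { filename }
  io
end

plain = StringIO.new("%PDF-1.4 stand-in bytes")
named = with_filename("%PDF-1.4 stand-in bytes", "report.pdf")

# A bare StringIO does not know any filename.
plain_has_name = plain.respond_to?(:original_filename)

# The named one exposes a filename with an extension to validate.
named_extension = File.extname(named.original_filename)
```

An empty or extension-less filename would leave the extension blank, which is consistent with the "not allowed to upload nil files" validation error in the question.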


Answer 2:

You want to create a tempfile (which is fine on Heroku as long as you don't expect it to persist across requests).

def perform
  # Create instance of your Carrierwave Uploader
  uploader = MyUploader.new

  # Generate your PDF
  pdf = GenerateReportPdf.new(@object)

  # Create a tempfile
  tmpfile = Tempfile.new("my_filename")

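  # set to binary mode to avoid UTF-8 conversion errors
  tmpfile.binmode

  # Use render to write the file contents, then flush Ruby's write buffer
  # so the bytes are on disk before the uploader reads the file
  tmpfile.write pdf.render
  tmpfile.flush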

  # Upload the tempfile with your Carrierwave uploader
  uploader.store! tmpfile

  # Close the tempfile and delete it
  tmpfile.close
  tmpfile.unlink
end
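A variation on the tempfile approach, sketched with the standard library only: passing [basename, extension] to Tempfile.new keeps a .pdf suffix on the temp path, and an ensure block removes the file even if the upload raises. with_pdf_tempfile is a hypothetical helper and the bytes stand in for pdf.render; the uploader call is left to the caller's block.

```ruby
require "tempfile"

# Hypothetical helper: writes bytes to a binary tempfile, yields it,
# then always closes and deletes it.
def with_pdf_tempfile(bytes, basename = "report")
  tmpfile = Tempfile.new([basename, ".pdf"])
  tmpfile.binmode
  tmpfile.write(bytes)
  tmpfile.flush   # push buffered bytes to disk
  tmpfile.rewind  # let readers start from the beginning
  yield tmpfile
ensure
  if tmpfile
    tmpfile.close
    tmpfile.unlink
  end
end

fake_pdf = "%PDF-1.4\n\x00binary".b
seen_path = nil
seen_bytes = nil
with_pdf_tempfile(fake_pdf) do |file|
  seen_path = file.path
  seen_bytes = File.binread(file.path)
  # In the worker this is where uploader.store!(file) would go.
end
```

Keeping the extension on the temp path may matter if the uploader validates file types by extension, as the error message in the question suggests.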


Answer 3:

Here's a way you can use StringIO like Andy Harvey mentioned, but without adding a method to the StringIO instance's eigenclass.

class VirtualFile < StringIO
  attr_accessor :original_filename

  def initialize(string, original_filename)
    @original_filename = original_filename
    super(string)
  end
end

def perform
  # render returns the PDF as a binary string
  pdf_string    = GenerateReportPdf.new(@object).render
  file          = VirtualFile.new(pdf_string, 'filename.pdf')
  document      = Document.new(object_id: @object.id, file: file)
  document.save!
end


Answer 4:

This one took me a couple of days. The key is to call render_file with a file path you control, so you can keep track of the file. Something like this:

In one of my models (e.g. Policy) I have a list of documents. This is the method that updates the model connected to Carrierwave (e.g. PolicyDocument < ApplicationRecord, with mount_uploader :pdf_file, PdfDocumentUploader):

def upload_pdf_document_file_to_s3_bucket(document_type, file_path)
  policy_document = self.policy_documents.where(policy_document_type: document_type)
                        .where(status: 'processing')
                        .where(pdf_file: nil).last
  policy_document.pdf_file = File.open(file_path, "r")
  policy_document.status = 's3_uploaded'
  policy_document.save(validate: false)
  policy_document
rescue => e
  policy_document.status = 's3_uploaded_failed'
  policy_document.save(validate: false)
  Rails.logger.error "Error uploading policy documents: #{e.inspect}"
end

In one of my Prawn PDF generators (e.g. PolicyPdfDocumentX), note how I render the file and return the file path so the worker can pick it up:

def generate_prawn_pdf_document
  Prawn::Document.new do |pdf|
    pdf.draw_text "Hello World PDF File", size: 8, at: [370, 462]
    pdf.start_new_page
    pdf.image Rails.root.join('app', 'assets', 'images', 'hello-world.png'), width: 550
  end
end

def generate_tmp_file(filename)
  file_path = File.join(Rails.root, "tmp/pdfs", filename)
  self.generate_prawn_pdf_document.render_file(file_path)
  return file_path
end

In the "global" worker that creates files and uploads them to the S3 bucket (e.g. PolicyDocumentGeneratorWorker):

def perform(filename, document_type, policy)
  # Create an instance of the Prawn PDF generator class
  pdf_generator_class = document_type.constantize.new
  # Create the file and get back its file path
  file_path = pdf_generator_class.generate_tmp_file(filename)
  # Update the model with the newly created file
  policy.upload_pdf_document_file_to_s3_bucket(document_type, file_path)
end

Finally, to test it, run rails c and:

the_policy = Policies.where....
PolicyDocumentGeneratorWorker.new.perform('report_x.pdf', 'PolicyPdfDocumentX',the_policy)

NOTE: I'm using metaprogramming in case we have multiple, different file generators. constantize.new just creates a new Prawn PDF generator instance, so it is similar to PolicyPdfDocument.new. That way a single worker class can handle all of your Prawn PDF documents. For instance, if you need a new document you can simply call PolicyDocumentGeneratorWorker.new.perform('report_y.pdf', 'PolicyPdfDocumentY', the_policy)
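The string-to-class dispatch can be tried outside Rails. The sketch below uses Object.const_get as a plain-Ruby stand-in for ActiveSupport's constantize, and adds an allowlist so that only known generator names are instantiated. The allowlist, the generator_for helper, and the two stub classes are assumptions for illustration, not part of the original worker.

```ruby
# Stub generators standing in for the real Prawn document classes.
class PolicyPdfDocumentX
  def generate_tmp_file(filename)
    "tmp/pdfs/#{filename}"
  end
end

class PolicyPdfDocumentY < PolicyPdfDocumentX; end

# Only these names may be turned into classes.
ALLOWED_GENERATORS = %w[PolicyPdfDocumentX PolicyPdfDocumentY].freeze

# Hypothetical helper: looks up a generator class by name and instantiates it.
def generator_for(document_type)
  unless ALLOWED_GENERATORS.include?(document_type)
    raise ArgumentError, "unknown document type: #{document_type}"
  end
  Object.const_get(document_type).new
end

generator = generator_for("PolicyPdfDocumentY")
path = generator.generate_tmp_file("report_y.pdf")
```

Checking the name first is a design choice worth considering whenever the document type string comes from job arguments, since it keeps arbitrary class names from being instantiated.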

:D

Hope this helps someone save some time.