PDFkit hangs when generating a pdf with an image

2020-02-07 03:46发布

问题:

I want to render a webpage as a PDF. It uses a single image, and I've read that you need to supply the absolute URL for PDFkit to be able to use the image, so my code is:

= image_tag image_url(user.avatar)

This works when viewed as HTML, and PDFkit is able to generate a PDF with the image removed. However, when using the image, it just hangs until I kill the server. How can I get this to work?

Here's the full output when I kill the server:

2013-12-04 13:53:36.576 wkhtmltopdf[27410:507] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
2013-12-04 13:53:36.577 wkhtmltopdf[27410:507] CoreText performance note: Set a breakpoint on CTFontLogSuboptimalRequest to debug.
2013-12-04 13:53:36.582 wkhtmltopdf[27410:507] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
2013-12-04 13:53:36.584 wkhtmltopdf[27410:507] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
^C
RuntimeError - command failed: /usr/local/bin/wkhtmltopdf --page-size Legal --print-media-type --quiet - -:
  pdfkit (0.5.4) lib/pdfkit/pdfkit.rb:73:in `to_pdf'
  pdfkit (0.5.4) lib/pdfkit/middleware.rb:21:in `call'
  warden (1.2.3) lib/warden/manager.rb:35:in `block in call'
  warden (1.2.3) lib/warden/manager.rb:34:in `catch'
  warden (1.2.3) lib/warden/manager.rb:34:in `call'
  rack (1.5.2) lib/rack/etag.rb:23:in `call'
  rack (1.5.2) lib/rack/conditionalget.rb:25:in `call'
  rack (1.5.2) lib/rack/head.rb:11:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/params_parser.rb:27:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/flash.rb:241:in `call'
  rack (1.5.2) lib/rack/session/abstract/id.rb:225:in `context'
  rack (1.5.2) lib/rack/session/abstract/id.rb:220:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/cookies.rb:486:in `call'
  activerecord (4.0.0) lib/active_record/query_cache.rb:36:in `call'
  activerecord (4.0.0) lib/active_record/connection_adapters/abstract/connection_pool.rb:626:in `call'
  activerecord (4.0.0) lib/active_record/migration.rb:369:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/callbacks.rb:29:in `block in call'
  activesupport (4.0.0) lib/active_support/callbacks.rb:373:in `_run__4124003592524659480__call__callbacks'
  activesupport (4.0.0) lib/active_support/callbacks.rb:80:in `run_callbacks'
  actionpack (4.0.0) lib/action_dispatch/middleware/callbacks.rb:27:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/reloader.rb:64:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/remote_ip.rb:76:in `call'
  better_errors (1.0.1) lib/better_errors/middleware.rb:84:in `protected_app_call'
  better_errors (1.0.1) lib/better_errors/middleware.rb:79:in `better_errors_call'
  better_errors (1.0.1) lib/better_errors/middleware.rb:56:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/debug_exceptions.rb:17:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/show_exceptions.rb:30:in `call'
  railties (4.0.0) lib/rails/rack/logger.rb:38:in `call_app'
  railties (4.0.0) lib/rails/rack/logger.rb:21:in `block in call'
  activesupport (4.0.0) lib/active_support/tagged_logging.rb:67:in `block in tagged'
  activesupport (4.0.0) lib/active_support/tagged_logging.rb:25:in `tagged'
  activesupport (4.0.0) lib/active_support/tagged_logging.rb:67:in `tagged'
  railties (4.0.0) lib/rails/rack/logger.rb:21:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/request_id.rb:21:in `call'
  rack (1.5.2) lib/rack/methodoverride.rb:21:in `call'
  rack (1.5.2) lib/rack/runtime.rb:17:in `call'
  activesupport (4.0.0) lib/active_support/cache/strategy/local_cache.rb:83:in `call'
  rack (1.5.2) lib/rack/lock.rb:17:in `call'
  actionpack (4.0.0) lib/action_dispatch/middleware/static.rb:64:in `call'
  railties (4.0.0) lib/rails/engine.rb:511:in `call'
  railties (4.0.0) lib/rails/application.rb:97:in `call'
  rack (1.5.2) lib/rack/content_length.rb:14:in `call'
  thin (1.6.0) lib/thin/connection.rb:82:in `block in pre_process'
  thin (1.6.0) lib/thin/connection.rb:80:in `catch'
  thin (1.6.0) lib/thin/connection.rb:80:in `pre_process'
  thin (1.6.0) lib/thin/connection.rb:55:in `process'
  thin (1.6.0) lib/thin/connection.rb:41:in `receive_data'
  eventmachine (1.0.3) lib/eventmachine.rb:187:in `run_machine'
  eventmachine (1.0.3) lib/eventmachine.rb:187:in `run'
  thin (1.6.0) lib/thin/backends/base.rb:73:in `start'
  thin (1.6.0) lib/thin/server.rb:162:in `start'
  rack (1.5.2) lib/rack/handler/thin.rb:16:in `run'
  rack (1.5.2) lib/rack/server.rb:264:in `start'
  railties (4.0.0) lib/rails/commands/server.rb:84:in `start'
  railties (4.0.0) lib/rails/commands.rb:78:in `block in <top (required)>'
  railties (4.0.0) lib/rails/commands.rb:73:in `tap'
  railties (4.0.0) lib/rails/commands.rb:73:in `<top (required)>'
  bin/rails:4:in `require'
  bin/rails:4:in `<main>'

回答1:

This is a notorious issue, and you're running into it because your probably have relatively linked assets in your HTML (i.e. images, CSS, JS, fonts, etc), and your web server is only capable of handling one request/thread at a time (like WEBrick).

So what happens? The server begins generating the PDF when you request its URL. PDFkit finds a linked asset, so it tries to load this asset from the server, which happens to be the same server that PDFkit is running on. However, the server's single thread is already busy running PDFkit, so it cannot "free up" to serve the requested asset. In conclusion, it's a deadlock -- PDFkit is awaiting on an asset on the same server that is waiting for PDFkit to finish up processing, so that it can serve the asset to PDFkit...

Solution: either Base64-embed your assets in the HTML so that PDFkit doesn't need to make any additional requests (my personally preferred solution), or temporarily offload the assets to another server (e.g. a temporary AWS bucket). You can also try using the unicorn or Thin webserver with multi-threading enabled, or adding config.threadsafe! in in application.rb, but there is no guarantee that these methods will work.

Of course, these hacks (embedding assets or hosting elsewhere) should only be used in the dev environment -- you shouldn't be running into these kinds of issues in production, as the live server should (hopefully) be able to handle multiple GET requests.



回答2:

You can use unicorn in development which serve multiple request at a time. And is easy to setup. refer http://dave.is/unicorn.html and https://devcenter.heroku.com/articles/rails-unicorn

It solved my problem in development.



回答3:

Here's a workround helper

Arman's answer helped me a lot. Here's an implementation that I came up with.

module PdfHelper

  def pdf_image_tag(image_name)
    if Rails.env.development? || Rails.env.test?
      # Unless running a web server that can process 2 requests at the same
      # time, trying to insert an image in a PDF creates a deadlock: the server
      # can't finish processing the PDF request until it gets the image, but it
      # can't start processing the image request until it has finished
      # processing the PDF request.
      # This will not be a problem in production, but in dev, a workaround is
      # to base64 the image and insert its contents into the HTML
      image_data = Rails.application.assets[image_name].to_s
      image_tag "data:image/png;base64,#{::Base64.encode64(image_data)}"
    else
      image_tag image_name
    end
  end

end


回答4:

You are getting this error because by default WEBrick is a single threaded server and is capable of fulfilling one request at a time (because of the single worker thread).To get WEBrick fully multi-threaded in Rails 4.0, just add this to config/initializers/multithreaded_webrick.rb:

# Remove Rack::Lock so WEBrick can be fully multi-threaded.
require 'rails/commands/server'

class Rails::Server
  def middleware
    middlewares = []
    middlewares << [Rails::Rack::Debugger] if options[:debugger]
    middlewares << [::Rack::ContentLength]

    Hash.new middlewares
  end
end

add config.middleware.delete 'Rack::Lock' in your application.rb