Pass file to Active Job / background job

Asked 2019-07-14 07:00

I'm receiving a file in the request params through a standard file input:

def create
  file = params[:file]
  upload = Upload.create(file: file, filename: "img.png")
end

However, for large uploads, I'd like to do this in a background job. Popular background job options like Sidekiq and Resque use Redis to store job parameters, so I can't just pass a file object through Redis.

I could use a Tempfile, but on some platforms such as Heroku, local storage is not reliable.

What options do I have to make this reliable on "any" platform?

3 Answers
够拽才男人
#2 · 2019-07-14 07:36

First, save the file to storage (either local or AWS S3). Then pass the file path or a UUID as a parameter to the background job.

I strongly recommend against passing a Tempfile as a job parameter. The temp file may no longer exist, or may be out of date, by the time the job runs, causing stale-data problems.
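
For example, a minimal sketch of this idea, assuming the Upload model from the question persists the file to durable storage, and a hypothetical ProcessUploadJob that re-reads it by id:

# Sketch only: ProcessUploadJob is a hypothetical job name.
def create
  upload = Upload.create(file: params[:file], filename: "img.png")
  ProcessUploadJob.perform_later(upload.id) # pass the id, not the file object
  head :ok
end

class ProcessUploadJob < ApplicationJob
  queue_as :default

  def perform(upload_id)
    upload = Upload.find(upload_id) # reload the record inside the job
    # work with upload.file here (download it from storage if needed)
  end
end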

Juvenile、少年°
#3 · 2019-07-14 07:40

No tempfile

It sounds like you want to either speed up image uploading or push it into the background. Here are my suggestions from another post; maybe they'll help if that's what you're looking for.

The reason I found this question is that I wanted to save a CSV file and have my background job add the info in that file to the database.

I have a solution.

Because the question is a bit unclear and I'm too lazy to post and answer my own question, I'll just post the answer here. lol

Like the other dudes said, save the file to a cloud storage service. For Amazon S3, you need:

# Gemfile
gem 'aws-sdk', '~> 2.0'     # for storing files on AWS S3
gem 'paperclip', '~> 5.0.0' # attachment handling (works for CSVs as well as images)
gem 'httparty'              # used by the import job below to fetch the stored file

You also need this configuration. Use the same code, but with a different bucket name, in production.rb:

# config/environments/development.rb
Rails.application.configure do
  config.paperclip_defaults = {
    storage: :s3,
    s3_host_name: 's3-us-west-2.amazonaws.com',
    s3_credentials: {
      bucket: 'my-bucket-development',
      s3_region: 'us-west-2',
      access_key_id: ENV['AWS_ACCESS_KEY_ID'],
      secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
    }
  }
end

You also need a migration

# db/migrate/20000000000000_create_companies.rb
class CreateCompanies < ActiveRecord::Migration[5.0]
  def change
    create_table :companies do |t|
      t.attachment :import_file # adds the Paperclip attachment columns
    end
  end
end

and a model

class Company < ApplicationRecord
  after_save :start_file_import

  has_attached_file :import_file, default_url: '/missing.png'
  validates_attachment_content_type :import_file, content_type: %r{\Atext\/.*\Z}

  def start_file_import
    return unless import_file_updated_at_changed?
    FileImportJob.perform_later id
  end
end

and a job

require 'csv' # stdlib, used below to parse the fetched file

class FileImportJob < ApplicationJob
  queue_as :default

  def perform(company_id)
    company = Company.find company_id  # the record whose after_save enqueued this job
    filepath = company.import_file.url # URL of the attachment on S3

    # fetch the file over HTTP (HTTParty gem)
    response = HTTParty.get filepath
    # we only need the contents of the response
    csv_text = response.body
    # use the csv gem to create csv table
    csv = CSV.parse csv_text, headers: true
    p "csv class: #{csv.class}" # => "csv class: CSV::Table"
    # loop through each table row and do something with the data
    csv.each_with_index do |row, index|
      if index == 0
        p "row class: #{row.class}" # => "row class: CSV::Row"
        p row.to_hash # hash of all the keys and values from the csv file
      end
    end
  end
end

In your controller

def create
  @company = Company.create(company_params)
end

private

def company_params
  params.require(:company).permit(:import_file)
end
Animai°情兽
#4 · 2019-07-14 07:47

I would suggest uploading directly to a service like Amazon S3 and then processing the file as you see fit in a background job.

When the user uploads the file, you can rest assured it will be safely stored in S3. You can use a private bucket to prevent public access. Then pass the file's S3 URI to your background task and let the worker download the file for processing.
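
A rough sketch of the worker side, assuming the aws-sdk gem already mentioned in this thread; the job name, bucket, and region here are placeholders, not from the original answer:

require 'tempfile'

# Sketch only: ProcessS3UploadJob, the bucket, and the region are assumptions.
class ProcessS3UploadJob < ApplicationJob
  queue_as :default

  def perform(s3_key)
    obj = Aws::S3::Resource.new(region: 'us-west-2')
                           .bucket('my-private-bucket')
                           .object(s3_key)

    Tempfile.create('upload') do |tmp|
      obj.download_file(tmp.path) # stream the S3 object to a local temp file
      # process tmp here (parse, resize, import, etc.)
    end
  end
end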

I don't know what your background worker does with the file, but it goes without saying that downloading it again might not be necessary. It's stored somewhere after all.

I've used the carrierwave-direct gem in the past with success. Since you mention Heroku, they have a detailed guide on uploading files directly to S3.
