I'm receiving a file in the request params through a standard file input:
def create
  file = params[:file]
  upload = Upload.create(file: file, filename: "img.png")
end
However, for large uploads, I'd like to do this in a background job.
Popular background job options like Sidekiq or Resque depend on Redis to store their parameters, so I can't just pass a file object through Redis.
I could use a Tempfile, but on some platforms such as Heroku, local storage is not reliable.
What options do I have to make it reliable on "any" platform?
I would suggest uploading directly to a service like Amazon S3 and then processing the file as you see fit in a background job.
When the user uploads the file, you can rest assured it will be safely stored in S3. You can use a private bucket to prohibit public access. Then, in your background task, you can process the upload by passing the file's S3 URI and letting your background worker download the file.
I don't know what your background worker does with the file, but it goes without saying that downloading it again might not be necessary. It's stored somewhere after all.
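For illustration, here's a minimal sketch of such a worker, assuming the aws-sdk gem; the bucket name, key, and ProcessUploadJob name are hypothetical:

# app/jobs/process_upload_job.rb
require 'tempfile'

class ProcessUploadJob < ApplicationJob
  queue_as :default

  # The job receives only the S3 key (a plain string),
  # so it serializes cleanly through Redis.
  def perform(s3_key)
    s3 = Aws::S3::Client.new(region: 'us-west-2')

    # Download the object into a local tempfile just for the
    # duration of the processing.
    Tempfile.create('upload') do |tmp|
      s3.get_object(bucket: 'my-private-bucket', key: s3_key, response_target: tmp.path)
      # ... process tmp here (resize, parse, import, etc.) ...
    end
  end
end

Enqueue it with just the key: ProcessUploadJob.perform_later('uploads/img.png').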
I've used the carrierwave-direct gem in the past with success. Since you mention Heroku, they have a detailed guide for uploading files directly to S3.
No tempfile involved.
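If it helps, the core of the setup is just an uploader class; this is a sketch based on the gem's README, and ImageUploader is a placeholder name:

# app/uploaders/image_uploader.rb
class ImageUploader < CarrierWave::Uploader::Base
  # Mixes in the helpers that generate signed forms which POST
  # the file from the browser straight to S3, bypassing your dynos.
  include CarrierWaveDirect::Uploader
end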
It sounds like you want to either speed up image uploading or push it into the background. Here are my suggestions from another post. Maybe they'll help you if that's what you're looking for.
The reason I found this question is that I wanted to save a CSV file and have a background job add its contents to the database.
I have a solution. Since the question is a bit unclear, rather than posting and answering my own question, I'll just post the answer here.
As the other answers say, save the file on a cloud storage service. For Amazon S3, you need:
# Gemfile
gem 'aws-sdk', '~> 2.0' # for storing images on AWS S3
gem 'paperclip', '~> 5.0.0' # attachment management (with image processing if you need it)
You also need the following configuration. Use the same code in production.rb, but with a different bucket name (see the sketch after the block):
# config/environments/development.rb
Rails.application.configure do
  config.paperclip_defaults = {
    storage: :s3,
    s3_host_name: 's3-us-west-2.amazonaws.com',
    s3_credentials: {
      bucket: 'my-bucket-development',
      s3_region: 'us-west-2',
      access_key_id: ENV['AWS_ACCESS_KEY_ID'],
      secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
    }
  }
end
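For example, the production variant might look like this (the bucket name is a placeholder):

# config/environments/production.rb
Rails.application.configure do
  config.paperclip_defaults = {
    storage: :s3,
    s3_host_name: 's3-us-west-2.amazonaws.com',
    s3_credentials: {
      bucket: 'my-bucket-production', # only the bucket name differs
      s3_region: 'us-west-2',
      access_key_id: ENV['AWS_ACCESS_KEY_ID'],
      secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
    }
  }
end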
You also need a migration. Avoid naming the model File, since that clashes with Ruby's built-in File class; here the table matches the Company model below:
# db/migrate/20000000000000_create_companies.rb
class CreateCompanies < ActiveRecord::Migration[5.0]
  def change
    create_table :companies do |t|
      # Paperclip helper: adds import_file_file_name, _content_type,
      # _file_size, and _updated_at columns
      t.attachment :import_file
    end
  end
end
and a model
class Company < ApplicationRecord
  after_save :start_file_import

  has_attached_file :import_file, default_url: '/missing.png'
  validates_attachment_content_type :import_file, content_type: %r{\Atext\/.*\Z}

  def start_file_import
    # only enqueue the job when the attachment actually changed
    return unless import_file_updated_at_changed?
    FileImportJob.perform_later id
  end
end
and a job
# app/jobs/file_import_job.rb
require 'csv'

class FileImportJob < ApplicationJob
  queue_as :default

  def perform(company_id)
    company = Company.find company_id
    # for a private bucket, use import_file.expiring_url instead
    filepath = company.import_file.url

    # fetch the file (the httparty gem must be in your Gemfile)
    response = HTTParty.get filepath
    # we only need the contents of the response
    csv_text = response.body

    # use the csv gem to create a CSV table
    csv = CSV.parse csv_text, headers: true
    p "csv class: #{csv.class}" # => "csv class: CSV::Table"

    # loop through each table row and do something with the data
    csv.each_with_index do |row, index|
      if index == 0
        p "row class: #{row.class}" # => "row class: CSV::Row"
        p row.to_hash # hash of all the keys and values from the csv file
      end
    end
  end
end
In your controller
def create
  @company = Company.create(company_params)
end

private

def company_params
  params.require(:company).permit(:import_file)
end
First, save the file to storage (either local disk or AWS S3).
Then pass the file path or a UUID as a parameter to the background job, as sketched below.
I strongly recommend against passing a Tempfile as a job parameter. The serialized parameter points at a local file that may be cleaned up before the worker runs, leading to stale or missing data.
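A minimal sketch of that pattern on the controller side, assuming the aws-sdk gem; the bucket name and ProcessFileJob are hypothetical:

def create
  file = params[:file]
  # Build a unique key so concurrent uploads never collide.
  key = "uploads/#{SecureRandom.uuid}/#{file.original_filename}"

  # Persist the upload to durable storage first...
  s3 = Aws::S3::Resource.new(region: 'us-west-2')
  s3.bucket('my-private-bucket').object(key).upload_file(file.tempfile.path)

  # ...then enqueue only the key, a plain string that
  # serializes safely through Redis.
  ProcessFileJob.perform_later(key)
end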