Read a CSV file from a stream using Roo in Rails 4

Posted 2019-08-08 18:36

Question:

I asked a related question, "Open a CSV file from S3 using Roo on Heroku", but I'm not getting any bites, so here is a reword:

I have a CSV file in an S3 bucket. I want to read it using Roo in a Heroku-based app (i.e. no local file access). How do I open the CSV file from a stream?

Or is there a better tool for doing this?

I am using Rails 4, Ruby 2. Note that I can successfully open the CSV for reading if I post it from a form. How can I adapt this to grab the file from an S3 bucket?

Answer 1:

Short answer - don't use Roo.

I ended up using the standard CSV library. When working with small CSV files, you can simply read the file contents into memory using something like this:

require "csv"

body = file.read  # file is any object that responds to read
CSV.parse(body, col_sep: ",", headers: true) do |row|
    row_hash = row.to_hash
    field = row_hash["FieldName"]
    # ... use field here ...
end

When reading a file passed in from a form, just reference the params:

file = params[:file]
body = file.read

To read from S3 you can use the AWS gem:

s3 = AWS::S3.new(access_key_id: ENV['AWS_ACCESS_KEY_ID'], secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'])
bucket = s3.buckets['BUCKET_NAME']
# check each object in the bucket
bucket.objects.each do |obj|
    import_file = obj.key
    body = obj.read
    # call the same style import code as above...
end
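
Since both the form upload and the S3 object just need to hand over their contents, the import code can be pulled into one routine that takes anything responding to read. The following is only a minimal sketch of that idea: import_rows is a hypothetical helper name, and a StringIO with made-up data stands in for the uploaded file so the example runs without Rails or S3.

```ruby
require "csv"
require "stringio"

# Hypothetical helper (not from the original answer): takes any object that
# responds to #read and returns the CSV rows as an array of hashes.
def import_rows(source)
  body = source.read
  rows = []
  CSV.parse(body, col_sep: ",", headers: true) do |row|
    rows << row.to_hash
  end
  rows
end

# StringIO stands in for params[:file]; the data is invented for illustration.
fake_upload = StringIO.new("FieldName,Other\nalpha,1\nbeta,2\n")
rows = import_rows(fake_upload)
field_values = rows.map { |row_hash| row_hash["FieldName"] }
```

Because this reads the whole body into memory, it suits the small files mentioned above; larger files would probably need a different approach.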


Answer 2:

I put some code together based on this:

Make Remote Files Local With Ruby Tempfile

and Roo seems to work OK when handed a temp file. I couldn't get it to work with S3 directly. I don't particularly like the copy approach, but my processing runs in Delayed Job, and I want to keep the Roo features a little more than I dislike the file copy. Plain CSV files worked without fishing out the encoding info, but XLS files did not.
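
For reference, the temp file copy described here might look roughly like the sketch below. It is not the code from the linked article: with_local_copy is a hypothetical helper, a StringIO stands in for the S3 object, and the standard CSV library stands in for Roo so the example runs without extra gems. The file is written in binary mode on the assumption that this avoids encoding trouble with binary formats such as XLS.

```ruby
require "csv"
require "stringio"
require "tempfile"

# Hypothetical helper: copies any readable stream into a Tempfile, yields the
# local path, and removes the file afterwards.
def with_local_copy(stream, extension)
  tmp = Tempfile.new(["import", extension]) # keep the extension for format detection
  tmp.binmode                               # write raw bytes, no transcoding
  tmp.write(stream.read)
  tmp.flush
  yield tmp.path
ensure
  tmp.close! if tmp                         # close and delete the temp file
end

# StringIO stands in for the remote object; the data is invented.
remote = StringIO.new("FieldName\nalpha\nbeta\n")
names = with_local_copy(remote, ".csv") do |path|
  # With Roo, this is where the local path would be opened instead
  # (assumption: something like Roo::Spreadsheet.open(path)).
  CSV.read(path, headers: true).map { |row| row["FieldName"] }
end
```

The block form keeps the cleanup in one place, so the copy does not outlive the import even if parsing raises.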