Using aws.s3::get_bucket_df() returns errors when

Posted 2019-08-30 00:17

Question:

I have a repository in S3:

my_bucket:
    folder1
      subfolder11
        subfolder111
    folder2
      subfolder21
        subfolder221

I am trying to connect and load all files in all relevant folders in my bucket. Here is how I am trying to do this:

library(aws.s3)
bl <- bucketlist()

# Build a data frame of the files in the bucket
dfBucket <- get_bucket_df(bucket = "my_bucket", prefix = "folder1/", max = Inf)

I am getting the following error:

Error in z[["Owner"]][["ID"]] : subscript out of bounds

Please advise.

UPDATE: I can actually run this command on other buckets, so the issue seems to be related to the very long file names stored in this particular S3 bucket.

Please advise how to solve this given the new information.

Answer 1:

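Solved by using rbindlist(dfBucket), where dfBucket is the list returned by get_bucket() and rbindlist() comes from the data.table package.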

In my case get_bucket() did work and returned a list of bucket contents.

get_bucket_df() returned an error:

Error in z[["Owner"]][["ID"]] : subscript out of bounds

After looking for a workaround, I passed the list from get_bucket() to rbindlist, which combined it into a single table and solved my issue.
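The rbindlist step can be sketched without S3 access as follows. This is only an illustration under stated assumptions: the `objects` list is a hand-built stand-in for what get_bucket() returns, the field names (Key, Size, Owner) and their values are hypothetical, and treating a missing Owner field as the difference between entries is an assumption, not a confirmed cause of the error.

```r
library(data.table)

# Hypothetical stand-in for the list returned by get_bucket().
# With real credentials it would come from something like:
#   objects <- get_bucket(bucket = "my_bucket", prefix = "folder1/", max = Inf)
objects <- list(
  list(Key = "folder1/a.csv", Size = 120, Owner = "owner-1"),
  list(Key = "folder1/subfolder11/b.csv", Size = 340)  # assumed: no Owner field
)

# Bind the per-object lists into one table, one row per object.
# fill = TRUE pads fields that are missing from an entry with NA
# instead of stopping with an error.
listing <- rbindlist(objects, use.names = TRUE, fill = TRUE)

print(listing)
```

Here `fill = TRUE` is the part that tolerates entries with different sets of fields. Whether the real bucket listing needs it depends on what get_bucket() actually returns for that bucket.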

To those who commented on and criticized my answer: I completely disagree. If you know how to solve this, please share your answer. It is not professional to criticize without providing a solution!