s3cmd failed too many times-第2页回答

I used to be a happy s3cmd user. However recently when I try to transfer a large zip file (~7Gig) to Amazon S3, I am getting this error:

$> s3cmd put thefile.tgz s3://thebucket/thefile.tgz

....
  20480 of 7563176329     0% in    1s    14.97 kB/s  failed
WARNING: Upload failed: /thefile.tgz ([Errno 32] Broken pipe)
WARNING: Retrying on lower speed (throttle=1.25)
WARNING: Waiting 15 sec...
thefile.tgz -> s3://thebucket/thefile.tgz  [1 of 1]
       8192 of 7563176329     0% in    1s     5.57 kB/s  failed
ERROR: Upload of 'thefile.tgz' failed too many times. Skipping that file.

I am using the latest s3cmd on Ubuntu.

Why is it so? and how can I solve it? If it is unresolvable, what alternative tool can I use?

标签： file-upload ubuntu amazon-s3 backup

15条回答

狗以群分

2楼-- · 2019-01-13 01:59

Search for .s3cfg file, generally in your Home Folder.

If you have it, you got the villain. Changing the following two parameters should help you.

socket_timeout = 1000
multipart_chunk_size_mb = 15

0人赞添加讨论(0) 举报

三岁会撩人

3楼-- · 2019-01-13 02:00

For me, the following worked:

In .s3cfg, I changed the host_bucket

host_bucket = %(bucket)s.s3-external-3.amazonaws.com

0人赞添加讨论(0) 举报

闹够了就滚

4楼-- · 2019-01-13 02:03

This error occurs when Amazon returns an error: they seem to then disconnect the socket to keep you from uploading gigabytes of request to get back "no, that failed" in response. This is why for some people are getting it due to clock skew, some people are getting it due to policy errors, and others are running into size limitations requiring the use of the multi-part upload API. It isn't that everyone is wrong, or are even looking at different problems: these are all different symptoms of the same underlying behavior in s3cmd.

As most error conditions are going to be deterministic, s3cmd's behavior of throwing away the error message and retrying slower is kind of crazy unfortunate :(. Itthen To get the actual error message, you can go into /usr/share/s3cmd/S3/S3.py (remembering to delete the corresponding .pyc so the changes are used) and add a print e in the send_file function's except Exception, e: block.

In my case, I was trying to set the Content-Type of the uploaded file to "application/x-debian-package". Apparently, s3cmd's S3.object_put 1) does not honor a Content-Type passed via --add-header and yet 2) fails to overwrite the Content-Type added via --add-header as it stores headers in a dictionary with case-sensitive keys. The result is that it does a signature calculation using its value of "content-type" and then ends up (at least with many requests; this might be based on some kind of hash ordering somewhere) sending "Content-Type" to Amazon, leading to the signature error.

In my specific case today, it seems like -M would cause s3cmd to guess the right Content-Type, but it seems to do that based on filename alone... I would have hoped that it would use the mimemagic database based on the contents of the file. Honestly, though: s3cmd doesn't even manage to return a failed shell exit status when it fails to upload the file, so combined with all of these other issues it is probably better to just write your own one-off tool to do the one thing you need... it is almost certain that in the end it will save you time when you get bitten by some corner-case of this tool :(.

0人赞添加讨论(0) 举报

The star\"

5楼-- · 2019-01-13 02:04

I encountered the same broken pipe error as the security group policy was set wrongly.. I blame S3 documentation.

I wrote about how to set the policy correctly in my blog, which is:

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::example_bucket",
      "Condition": {}
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion",
        "s3:GetObject",
        "s3:GetObjectAcl",
        "s3:GetObjectVersion",
        "s3:GetObjectVersionAcl",
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:PutObjectAclVersion"
      ],
      "Resource": "arn:aws:s3:::example_bucket/*",
      "Condition": {}
    }
  ]
}

0人赞添加讨论(0) 举报

不美不萌又怎样

6楼-- · 2019-01-13 02:07

s3cmd version 1.1.0-beta3 or better will automatically use multipart uploads to allow sending up arbitrarily large files (source). You can control the chunk size it uses, too. e.g.

s3cmd --multipart-chunk-size-mb=1000 put hugefile.tar.gz s3://mybucket/dir/

This will do the upload in 1 GB chunks.

0人赞添加讨论(0) 举报

forever°为你锁心

7楼-- · 2019-01-13 02:08

I've just come across this problem myself. I've got a 24GB .tar.gz file to put into S3.

Uploading smaller pieces will help.

There is also ~5GB file size limit, and so I'm splitting the file into pieces, that can be re-assembled when the pieces are downloaded later.

split -b100m ../input-24GB-file.tar.gz input-24GB-file.tar.gz-

The last part of that line is a 'prefix'. Split will append 'aa', 'ab', 'ac', etc to it. The -b100m means 100MB chunks. A 24GB file will end up with about 240 100mb parts, called 'input-24GB-file.tar.gz-aa' to 'input-24GB-file.tar.gz-jf'.

To combine them later, download them all into a directory and:

cat input-24GB-file.tar.gz-* > input-24GB-file.tar.gz

Taking md5sums of the original and split files and storing that in the S3 bucket, or better, if its not so big, using a system like parchive to be able to check, even fix some download problems could also be valuable.

0人赞添加讨论(0) 举报

s3cmd failed too many times

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间