Why are S3 and Google Storage bucket names a globa

Posted 2019-01-14 20:47

Question:

This has me puzzled. I can obviously understand why account IDs are global, but why bucket names?

Wouldn't it make more sense to have something like: https://accountID.storageservice.com/bucketName

Which would namespace buckets under accountID.

What am I missing, why did these obviously elite architects choose to handle bucket names this way?

Answer 1:

“The bucket namespace is global - just like domain names”

— http://aws.amazon.com/articles/1109#02

That's more than coincidental.

The reason seems simple enough: buckets and their objects can be accessed through a custom hostname that's the same as the bucket name... and a bucket can optionally host an entire static web site -- with S3 automatically mapping requests from the incoming Host: header onto the bucket of the same name.

In S3, these variant URLs reference the same object "foo.txt" in the bucket "bucket.example.com". The first one requires DNS configuration: either static website hosting enabled plus a DNS CNAME (or an Alias record in Route 53) pointing to the bucket's website endpoint, or a DNS CNAME pointing to the regional REST endpoint. The others require no configuration:

http://bucket.example.com/foo.txt
http://bucket.example.com.s3.amazonaws.com/foo.txt
http://bucket.example.com.s3[-region].amazonaws.com/foo.txt
http://s3[-region].amazonaws.com/bucket.example.com/foo.txt   

If an object store service needs a simple mechanism to resolve the Host: header of an incoming HTTP request into a bucket name, the bucket namespace also needs to be global. Anything else, it seems, would complicate the implementation significantly.

For hostnames to be mappable to bucket names, something has to be globally unique, since obviously no two buckets could respond to the same hostname. The restriction being applied to the bucket name itself leaves no room for ambiguity.
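To make the Host: header argument concrete, here is a minimal sketch of how a front end might turn the four URL forms above into a (bucket, key) pair. This is not S3's actual implementation; the function name `resolve_bucket`, the constant `ENDPOINT_SUFFIX`, and the parsing rules are assumptions made for illustration, based only on the URL shapes listed in this answer.

```python
from typing import Optional, Tuple

# Assumption: generic endpoints look like "s3.amazonaws.com" or
# "s3-<region>.amazonaws.com", as in the example URLs above.
ENDPOINT_SUFFIX = ".amazonaws.com"


def resolve_bucket(host: str, path: str) -> Tuple[Optional[str], str]:
    """Map a Host header and request path to (bucket, key).

    Hypothetical helper: a sketch of the lookup, not a real S3 API.
    """
    host = host.lower().split(":")[0]  # ignore any port
    key_path = path.lstrip("/")

    if host.endswith(ENDPOINT_SUFFIX):
        labels = host[: -len(ENDPOINT_SUFFIX)].split(".")
        last = labels[-1]
        if last == "s3" or last.startswith("s3-"):
            if len(labels) > 1:
                # Virtual-hosted style: everything before the "s3" label
                # is the bucket name.
                return ".".join(labels[:-1]), key_path
            # Path style: the first path segment is the bucket name.
            bucket, _, key = key_path.partition("/")
            return (bucket or None), key

    # Custom hostname: the hostname itself is the bucket name. This only
    # works without extra lookups because bucket names are globally unique.
    return host, key_path
```

Note that the last branch needs no per-account table at all: the hostname is the bucket name. With account-scoped bucket names, that branch would need an additional mapping from hostname to (account, bucket).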

It also seems likely that many potential clients wouldn't like to have their account identified in bucket names.

Of course, you could always add your account id, or any random string, to your desired bucket name, e.g. jozxyqk-payroll, jozxyqk-personnel, if the bucket name you wanted wasn't available.
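As a small illustration of that workaround, the sketch below builds a candidate name from a prefix plus a random string. The helper `candidate_bucket_name` is hypothetical, and it only checks the length limit (S3 bucket names are 3 to 63 characters); whether the name is actually free can only be learned by trying to create the bucket.

```python
import uuid


def candidate_bucket_name(prefix: str, purpose: str) -> str:
    """Build a likely-unique bucket name such as 'jozxyqk-payroll-1a2b3c4d'.

    Hypothetical helper for illustration; it does not contact any service.
    """
    suffix = uuid.uuid4().hex[:8]  # 8 random hex characters
    name = f"{prefix}-{purpose}-{suffix}".lower()
    if not 3 <= len(name) <= 63:
        raise ValueError(f"bucket name must be 3-63 characters: {name!r}")
    return name
```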



Answer 2:

The more I drink, the more sense the concept below makes, so I've elevated it from a comment on the accepted answer to its own entity:

An additional thought that popped into my head randomly tonight:

Given the ability to use the generic host names that the various object store services provide, one could easily obscure one's corporate (or other) identity as the owner of any given data resource.

So, let's say Black Hat Corp hosts a data resource at http://s3.amazonaws.com/obscure-bucket-name/something-to-be-dissassociated.txt.

It would be very difficult for any non-governmental entity to determine who the owner of that resource is without cooperation from the object store provider.

Not nefarious by design, just objective pragmatism.

And possibly a stroke of brilliance by the architects of this paradigm.