Are there are production quality nosql stores that I can use on a production system. I have looked at cassandra, tokyodb, couchdb etc but none of them seem to be ready for deployments on production like environments. I am talking thousands of requests per minute and lots of reads/writes/updates. My only concern is speed and service times. Does anybody know of production systems that use nosql stores effectively ? Does anybody know of a nosql store that is backed by a big enterprise like Google/Yahoo/ IBM ?
问题:
回答1:
Cassandra handles thousands of requests (including write-mostly workloads) per second, per machine, and its scaling-by-adding-machines has been there since day 1.
Here is a thread about Cassandra use in production and in-production-soon at dozens of companies: http://n2.nabble.com/Cassandra-users-survey-td4040068.html#a4040068
We're also adding more docs all the time, like http://wiki.apache.org/cassandra/Operations.
回答2:
I think the NoSQL systems are an excellent choice if I you 'only' care about speed and service time (and not or less about stuff like consistency and transactions). Facebook uses Cassandra.
"Cassandra is used in Facebook as an email search system containing 25TB and over 100m mailboxes." http://highscalability.com/product-facebooks-cassandra-massive-distributed-store
I think CouchDb isn't really speedy, maybe you can use MongoDB: http://www.mongodb.org/display/DOCS/Production+Deployments
回答3:
Also worth consideration is using a traditional RDBMS like MySQL to store schema-less. This method gives you the stability of a proven database server like MySQL with the flexibility a NoSQL solution.
Check out this blog posting on how FriendFeed does this.
回答4:
BerkeleyDB is backed by Oracle
Using the native C interface one can reach close to 1 million read requests per second.
By the way, when you say thousands requests per minute, any 'normal' DB should be able to handle that easily too.
回答5:
Redis is worth giving a try as Github uses redis to manage a heavy queue of background jobs.
回答6:
My first instinct would be BerkeleyDB, with each application node on a SAMBA network to facilitate ACID conformance & network use. It also sports a SQLite interface. Other poster cites MemcacheDB also having BDB inside.
Another unique option would be OrientDB
, also has a SQL interface, lots of network & cluster features.