What scalability problems have you encountered usi

Posted 2019-01-12 12:57

NoSQL refers to non-relational data stores that break with the history of relational databases and ACID guarantees. Popular open source NoSQL data stores include:

  • Cassandra (tabular, written in Java, used by Cisco, WebEx, Digg, Facebook, IBM, Mahalo, Rackspace, Reddit and Twitter)
  • CouchDB (document, written in Erlang, used by BBC and Engine Yard)
  • Dynomite (key-value, written in Erlang, used by Powerset)
  • HBase (key-value, written in Java, used by Bing)
  • Hypertable (tabular, written in C++, used by Baidu)
  • Kai (key-value, written in Erlang)
  • MemcacheDB (key-value, written in C, used by Reddit)
  • MongoDB (document, written in C++, used by Electronic Arts, Github, NY Times and Sourceforge)
  • Neo4j (graph, written in Java, used by some Swedish universities)
  • Project Voldemort (key-value, written in Java, used by LinkedIn)
  • Redis (key-value, written in C, used by Craigslist, Engine Yard and Github)
  • Riak (key-value, written in Erlang, used by Comcast and Mochi Media)
  • Ringo (key-value, written in Erlang, used by Nokia)
  • Scalaris (key-value, written in Erlang, used by OnScale)
  • Terrastore (document, written in Java)
  • ThruDB (document, written in C++, used by JunkDepot.com)
  • Tokyo Cabinet/Tokyo Tyrant (key-value, written in C, used by Mixi.jp (Japanese social networking site))

I'd like to know about specific problems you - the SO reader - have solved using data stores and what NoSQL data store you used.

Questions:

  • What scalability problems have you used NoSQL data stores to solve?
  • What NoSQL data store did you use?
  • What database did you use prior to switching to a NoSQL data store?

I'm looking for first-hand experiences, so please do not answer unless you have that.

14 answers
地球回转人心会变
#2 · 2019-01-12 13:32

I apologize for going against your bold text, since I don't have any first-hand experience, but this set of blog posts is a good example of solving a problem with CouchDB.

CouchDB: A Case Study

Essentially, the textme application used CouchDB to deal with their exploding data problem. They found that SQL was too slow to deal with large amounts of archival data, and moved that data over to CouchDB. It's an excellent read, and the author discusses the entire process of figuring out which problems CouchDB could solve and how they ended up solving them.

虎瘦雄心在
#3 · 2019-01-12 13:41

We replaced a PostgreSQL database with a CouchDB document database because not having a fixed schema was a strong advantage for us. Each document has a variable number of indexes used to access that document.
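One way to picture "a variable number of indexes" per document: in CouchDB, a view's map function may emit zero or more keys for each document. The snippet below is only a plain-Python sketch of that idea, not CouchDB's API and not necessarily how the setup above was built; the `lookup_keys` field and the `build_index` helper are hypothetical names made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each emitted key to the ids of the documents that emitted it."""
    index = defaultdict(list)
    for doc in docs:
        # "lookup_keys" is a hypothetical field: each document may list
        # any number of keys (including none) under which it can be found.
        for key in doc.get("lookup_keys", []):
            index[key].append(doc["_id"])
    return dict(index)


# Example documents with differing numbers of lookup keys (made-up data).
docs = [
    {"_id": "a", "lookup_keys": ["email:x@example.com", "phone:555"]},
    {"_id": "b", "lookup_keys": ["email:y@example.com"]},
    {"_id": "c"},  # no keys: this document appears in no index entry
]

index = build_index(docs)
```

Because there is no fixed schema, adding a new access path for one document means adding one more key to that document rather than altering a table.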

我只想做你的唯一
#4 · 2019-01-12 13:41

I would encourage anyone reading this to try Couchbase once more now that 3.0 is out the door. There are over 200 new features, for starters. The performance, availability, scalability and easy management features of Couchbase Server make for an extremely flexible, highly available database. The management UI is built in, and the APIs automatically discover the cluster nodes, so there is no need for a load balancer between the application and the DB. While we don't have a managed service at this time, you can run Couchbase on things like AWS, RedHat Gears, Cloudera, Rackspace, Docker containers like CloudSoft, and much more.

Regarding rebalancing, it depends on what specifically you're referring to. By design, Couchbase doesn't automatically rebalance after a node failure, but an administrator can set up auto-failover for the first node failure. Using our APIs you can also gain access to the replica vbuckets for reading before they are made active, or a monitoring tool can force a failover through the REST API. This is a special case, but it can be done.

We tend not to rebalance in pretty much any mode unless the node is completely offline and never coming back or a new node is ready to be balanced in automatically. Here are a couple of guides to help anyone interested in seeing what one of the most highly performing NoSQL databases is all about.

  1. Couchbase Server 3.0
  2. Administration Guide
  3. REST API
  4. Developer Guides

Lastly, I would also encourage you to check out N1QL for distributed querying:

  1. N1QL Tutorial
  2. N1QL Guide

Thanks for reading and let me or others know if you need more help!

Austin

兄弟一词,经得起流年.
#5 · 2019-01-12 13:45

Todd Hoff's highscalability.com has a lot of great coverage of NoSQL, including some case studies.

The commercial Vertica columnar DBMS might suit your purposes (even though it supports SQL): it's very fast compared with traditional relational DBMSs for analytics queries. See Stonebraker, et al.'s recent CACM paper contrasting Vertica with map-reduce.

Update: Twitter has selected Cassandra over several others, including HBase, Voldemort, MongoDB, MemcacheDB, Redis, and HyperTable.

Update 2: Rick Cattell has just published a comparison of several NoSQL systems in High Performance Data Stores. And highscalability.com's take on Rick's paper is here.

迷人小祖宗
#6 · 2019-01-12 13:45

I used Redis to store logging messages across machines. It was very easy to implement, and very useful. Redis really rocks.
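For anyone wondering what that can look like, here is a minimal sketch, assuming each machine appends JSON records to one shared Redis list. It is not necessarily the setup described above. The key name `app:logs` and the helper names are hypothetical; `client` is assumed to be any object exposing `rpush` and `lrange` the way a redis-py `redis.Redis` instance does.

```python
import json
import socket
import time

LOG_KEY = "app:logs"  # hypothetical key name for the shared log list


def log_message(client, level, message, key=LOG_KEY):
    """Append one JSON-encoded log record to a shared Redis list."""
    record = {
        "host": socket.gethostname(),  # which machine wrote the record
        "ts": time.time(),
        "level": level,
        "message": message,
    }
    # RPUSH appends to the tail, so records stay in arrival order.
    # It returns the new length of the list.
    return client.rpush(key, json.dumps(record))


def read_messages(client, key=LOG_KEY, start=0, end=-1):
    """Return decoded log records; LRANGE 0 -1 fetches the whole list."""
    return [json.loads(raw) for raw in client.lrange(key, start, end)]
```

With redis-py this would be called as `log_message(redis.Redis(host=...), "INFO", "started")` from every machine, all pointing at the same server. A list is the simplest choice here; it grows without bound unless something trims or drains it.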

霸刀☆藐视天下
#7 · 2019-01-12 13:46

I have no first-hand experience, but I found this blog entry quite interesting.
