limits of number of collections in databases

Posted 2019-01-21 09:40

Question:

Can anyone say whether there are any practical limits on the number of collections in MongoDB? The documentation at https://docs.mongodb.com/manual/core/data-model-operations/#large-number-of-collections says:

Generally, having a large number of collections has no significant performance penalty, and results in very good performance.

But for some reason MongoDB sets a limit of 24000 on the number of namespaces in a database. It looks like this can be increased, but I wonder why there is a limit in the default configuration at all if having many collections in a database doesn't cause any performance penalty.

Does this mean it is viable to have a practically unlimited number of collections in one database? For example, a multitenant application could keep each account's data in its own collection, giving hundreds of thousands of collections in the database. If a very large number of collections (one per tenant) is viable, what are its benefits compared with, for example, keeping every tenant's documents in one collection? Thank you very much for your answers.

Answer 1:

This answer is late; however, the other answers seem a bit...weak in terms of reliability and factual information, so I will attempt to remedy that a little.

But for some reason mongodb set limit 24000 for the number of namespaces in the database,

That is merely the default setting. Yes, there is a default setting.

It does say on the limits page that 24000 is the limit ( http://docs.mongodb.org/manual/reference/limits/#Number%20of%20Namespaces ), as though there is no way to expand that but there is.

However, there is a maximum limit on how big a namespace file can be ( http://docs.mongodb.org/manual/reference/limits/#Size%20of%20Namespace%20File ), which is 2GB. That gives you roughly 3 million namespaces to play with in most cases, which is quite impressive, and I doubt many people will hit that limit quickly.

You can modify the default value to go higher than 16MB by using the nssize parameter either within the configuration ( http://docs.mongodb.org/manual/reference/configuration-options/#nssize ) or at runtime by manipulating the command used to run MongoDB ( http://docs.mongodb.org/manual/reference/mongod/#cmdoption-mongod--nssize ).

There is no real reason why MongoDB uses 16MB by default for its nssize, as far as I know. I have never heard of a motto of "not bothering the user with every single detail", so I don't buy that one.

In my opinion, the main reason why MongoDB hides this is that, even though the documentation states:

Distinct collections are very important for high-throughput batch processing.

Using multiple collections as a means to scale vertically, rather than horizontally through a cluster as MongoDB is designed to, is quite often considered bad practice for large-scale websites. As such, 12K collections is normally considered something that people will never, and should never, reach.



Answer 2:

No More Limits!

As other answers have stated, this is determined by the size of the namespace file. This was previously an issue because the file had a default size of 16MB and a maximum of 2GB. However, with the release of MongoDB 3.0 and the WiredTiger storage engine, it looks like this limit has been removed. WiredTiger seems to be better in almost every way, so I see little reason for anyone to use the old engine, except for legacy support reasons. From the site:

For the MMAPv1 storage engine, namespace files can be no larger than 2047 megabytes.

By default namespace files are 16 megabytes. You can configure the size using the nsSize option.

The WiredTiger storage engine is not subject to this limitation.

http://docs.mongodb.org/manual/reference/limits/



Answer 3:

A little background:

Every time mongo creates a database, it creates a namespace file (db.ns) for it. The namespace (or collections, as you might want to call it) file holds the metadata about the collections. By default the namespace file is 16MB in size, though you can increase the size manually. The metadata for each collection is 648 bytes plus some overhead bytes. Divide 16MB by that and you get approximately 24000 namespaces per database. You can start mongo with a larger namespace file size, and that will let you create more collections per database.
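
The division described here can be sketched in plain JavaScript. The 648-byte entry size is the figure quoted in this answer; the function name is hypothetical, and the per-entry overhead is ignored, so the results are upper bounds rather than exact limits.

```javascript
// Rough upper bound on the number of entries in an MMAPv1 namespace (.ns) file.
// Assumption: 648 bytes per entry (figure quoted in this answer); overhead is ignored.
const BYTES_PER_ENTRY = 648;

function estimateNamespaceSlots(nsFileMegabytes, bytesPerEntry = BYTES_PER_ENTRY) {
  const fileBytes = nsFileMegabytes * 1024 * 1024;
  return Math.floor(fileBytes / bytesPerEntry);
}

console.log(estimateNamespaceSlots(16));   // default 16MB file: 25890
console.log(estimateNamespaceSlots(2047)); // maximum 2047MB file: 3312399
```

Under these assumptions the default file gives an upper bound of about 25,900 entries, which is consistent with the roughly 24000 figure once overhead is subtracted, and the 2047MB maximum gives about 3.3 million.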

The idea behind any default configuration is to not bother the user with every single detail (and configurable knob) and choose one that generally works for most people. Also, viability does go hand in hand with best/good design practices. As Chris said, consider the shape of your data and decide accordingly.



Answer 4:

As others mention, the default namespace size is 16MB and you can get about 24000 namespace entries. Actually my 64 bit instance in Ubuntu topped out at 23684 using the default 16MB namespace file.

One important thing that isn't mentioned in the FAQ is that indexes also use namespace slots.

You can count the namespace entries with:

db.system.namespaces.count()

And it's also interesting to actually take a look at what's in there:

db.system.namespaces.find()

Set your limit higher than what you think you need because once a database is created, the namespace file cannot be extended (as far as I understand - if there is a way, please tell me!!!).
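
A rough sizing sketch in plain JavaScript may help when choosing that value up front. It assumes one namespace slot per collection plus one per index (including the default _id index) and reuses the approximate figure of 24000 slots per 16MB from this thread. The function name and the safety factor of 2 are hypothetical.

```javascript
// Approximate slots per megabyte, from ~24000 slots in a 16MB file (figure from this thread).
const SLOTS_PER_MB = 24000 / 16;
const MAX_NS_MB = 2047; // MMAPv1 upper bound on the namespace file size

// Estimate the nssize (in MB) for a planned number of collections.
// indexesPerCollection counts every index, including the default _id index.
function suggestNsSizeMb(collections, indexesPerCollection = 1, safetyFactor = 2) {
  const slotsNeeded = collections * (1 + indexesPerCollection) * safetyFactor;
  const megabytes = Math.ceil(slotsNeeded / SLOTS_PER_MB);
  if (megabytes > MAX_NS_MB) {
    throw new RangeError(`needs ${megabytes}MB, above the ${MAX_NS_MB}MB maximum`);
  }
  return Math.max(16, megabytes); // design choice of this sketch: never go below the default
}

console.log(suggestNsSizeMb(50000, 2)); // 50000 collections, 2 indexes each: 200
```

The result would be passed as the nssize value before the database is first created, since the file is sized at creation time.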



Answer 5:

Practically, I have never run across a maximum. But I've definitely never gone beyond the 24,000 collection limit. I'm pretty sure I've never hit more than 200, other than when I was performance testing the thing. I have to admit, I think it sounds like an awful lot of chaos to have that many collections in a single database, rather than grouping like data into their own collections.

Consider the shape of your data and business rules. If your data needs to be laid out such that you must have the data separated into different logical groupings for your multi-tenant app, then you probably should consider other data stores. Because while Mongo is great, the fact that they put a limit on the number of collections at all tells me that they know there is some theoretical limit where performance is affected.

Perhaps you should consider a store that would match the data shape? Riak, for example, has an unlimited number of 'buckets' (without theoretical maximum) that you can have in your application. One bucket per account is perfectly doable, but you sacrifice some queryability by going that direction.

Otherwise, you may want to follow a more relational model of grouping like with like. In my view, Mongo feels like a half-way point between relational databases and key-value stores. That means that it's easier to conceptualize it coming from a relational database world.
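
For the multi-tenant case in the question, grouping like with like would mean one shared collection per document type, with a tenant identifier on each document. Here is a minimal sketch in plain JavaScript of how every query could be scoped to one tenant. The field name tenantId and the helper name are hypothetical, not part of any MongoDB API.

```javascript
// Build a query filter that always restricts results to a single tenant.
// Assumption: every document carries a `tenantId` field (hypothetical name).
function scopedFilter(tenantId, filter = {}) {
  if (tenantId === undefined || tenantId === null) {
    throw new Error("tenantId is required");
  }
  // tenantId is applied last so a caller-supplied filter cannot override it.
  return { ...filter, tenantId };
}

const filter = scopedFilter("acct-42", { status: "open" });
console.log(filter); // pass this as the filter argument to find() on the shared collection
```

With this layout the namespace count stays fixed as tenants are added, because tenants add documents rather than collections.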



Answer 6:

There seems to be a massive overhead for maintaining collections. I've just reduced a database which had around 1.5 million documents in 11000 collections to one with the same number of documents in around 300 collections; this has reduced the size of the database from 8GB to 1GB. I'm not familiar with the inner workings of MongoDB, so this may be obvious, but I thought it might be worth noting in this context.
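
As a back-of-the-envelope check on those figures, the saving per removed collection can be computed in plain JavaScript. This sketch assumes the whole size difference is attributable to per-collection overhead, which the answer does not verify; the function name is hypothetical.

```javascript
// Storage saved per collection removed, in MB, using the figures reported in this answer.
// Assumption: the entire size difference comes from per-collection overhead.
function savedMbPerCollection(beforeGb, afterGb, collectionsBefore, collectionsAfter) {
  const savedMb = (beforeGb - afterGb) * 1024;
  return savedMb / (collectionsBefore - collectionsAfter);
}

const perCollection = savedMbPerCollection(8, 1, 11000, 300);
console.log(perCollection.toFixed(2)); // "0.67", i.e. roughly two thirds of a megabyte each
```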