We've got a file-based program we want to convert to use a document database, specifically MongoDB. Problem is, MongoDB is limited to 2GB on 32-bit machines (according to http://www.mongodb.org/display/DOCS/FAQ#FAQ-Whatarethe32bitlimitations%3F), and a lot of our users will have over 2GB of data. Is there a way to have MongoDB use more than one file somehow?
I thought perhaps I could implement sharding on a single machine, meaning I'd run more than one mongod on the same machine and they'd somehow communicate. Could that work?
The best way to think about this is in terms of how MongoDB manages the virtual storage of its documents.
MongoDB's storage limits on different operating systems are tabulated below, as per the MongoDB 3.0 MMAPv1 storage engine limits.
The MMAPv1 storage engine limits each database to no more than 16000 data files. Since each data file can grow to at most 2 GB, a single MMAPv1 database has a maximum size of 32 TB. Setting the storage.mmapv1.smallFiles option caps data files at 512 MB each, which reduces the limit to 8 TB.
Using the MMAPv1 storage engine, a single mongod instance cannot manage a data set that exceeds the maximum virtual memory address space provided by the underlying operating system:

Operating System                           Journaled    Not Journaled
Linux                                      64 TB        128 TB
Windows Server 2012 R2 and Windows 8.1     64 TB        128 TB
Windows (otherwise)                        4 TB         8 TB
Reference: MongoDB Database Limits.
Note: The WiredTiger storage engine is not subject to this limitation.
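For example, here is a minimal sketch (the data path and port are hypothetical) of starting a mongod with the MMAPv1 engine and the smallFiles option from Python, then confirming which engine is in use:

```python
# Minimal sketch: start mongod with MMAPv1 and smallFiles, then check the
# storage engine via serverStatus. Paths and ports are assumptions for illustration.
import subprocess
import time
from pymongo import MongoClient

mongod = subprocess.Popen([
    "mongod",
    "--dbpath", "/data/mmapv1",      # hypothetical data directory (must exist)
    "--port", "27017",
    "--storageEngine", "mmapv1",     # in 3.0 you could pass "wiredTiger" here instead
    "--smallfiles",                  # same effect as storage.mmapv1.smallFiles: true
])
time.sleep(3)                        # crude wait; poll the port in real code

client = MongoClient("localhost", 27017)
print(client.admin.command("serverStatus")["storageEngine"]["name"])  # -> "mmapv1"
```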
Hope this helps.
The only way to store more than 2 GB on a single 32-bit node is to run multiple mongod processes. So sharding is one option (as you said), or you could do some manual partitioning across processes.
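As a rough sketch of the manual-partitioning approach (the ports, database, and field names are hypothetical), you can hash a key client-side and route each document to one of several mongod processes:

```python
# Rough sketch: route documents to one of several mongod processes by a stable
# hash of the partition key, so reads land on the same process as writes.
import zlib
from pymongo import MongoClient

shards = [
    MongoClient("localhost", 27018),   # first mongod, e.g. started with --dbpath /data/part0
    MongoClient("localhost", 27019),   # second mongod, e.g. started with --dbpath /data/part1
]

def collection_for(key):
    # zlib.crc32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(key.encode("utf-8")) % len(shards)
    return shards[index].mydb.users

def save_user(user):
    collection_for(user["user_id"]).insert_one(user)

def find_user(user_id):
    return collection_for(user_id).find_one({"user_id": user_id})

save_user({"user_id": "alice", "plan": "free"})
print(find_user("alice"))
```

The obvious drawback is that queries spanning partitions, and any later rebalancing, become your application's problem, which is exactly what mongos automates in a real sharded cluster.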
You could configure sharding, because the 2 GB limit only applies to individual mongod processes. Please refer to the sharded-clusters documentation; I also found a Python script that sets up a sharded environment on a single machine.
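For reference, a minimal sketch of such a script (all ports and data paths are hypothetical, the directories must already exist, and it assumes a 3.0-era deployment where a single non-replicated config server is still allowed):

```python
# Minimal single-machine sharded setup: one config server, two shard servers,
# and a mongos router, then sharding enabled through the router.
import subprocess
import time
from pymongo import MongoClient

procs = [
    subprocess.Popen(["mongod", "--configsvr", "--dbpath", "/data/config", "--port", "27019"]),
    subprocess.Popen(["mongod", "--shardsvr", "--dbpath", "/data/shard0", "--port", "27020"]),
    subprocess.Popen(["mongod", "--shardsvr", "--dbpath", "/data/shard1", "--port", "27021"]),
]
time.sleep(5)  # crude wait; a real script should poll until each port accepts connections

# mongos holds no data itself; it routes operations using the config server's metadata.
procs.append(subprocess.Popen(["mongos", "--configdb", "localhost:27019", "--port", "27017"]))
time.sleep(5)

router = MongoClient("localhost", 27017)
router.admin.command("addShard", "localhost:27020")
router.admin.command("addShard", "localhost:27021")
router.admin.command("enableSharding", "mydb")
router.admin.command("shardCollection", "mydb.users", key={"user_id": 1})
```

Each shard then stays under the per-process limit, while your application talks to the single mongos endpoint as if it were one database.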