Mixing PostgreSQL and MongoDB (as Django backends)

Posted 2019-01-31 14:59

Question:

I'm thinking about shifting my site's backend to Mongo from Postgres for performance reasons, but key parts of the site rely on the GeoDjango models to calculate distances between objects in the real world (and so on).

Would it be feasible to have most of the site running on Mongo while those key areas keep using Postgres for storage? Is this painful or error-prone? Is there an all-Mongo solution I'm missing?

Any light you can shed on these matters for me would be much appreciated.

Answer 1:

Since Django 1.2, you can define multiple database connections in your settings.py. You can then use database routers to tell Django which database each model should use, transparently to the rest of your application.

Disclaimer: this is how I think it should work. I have never used MongoDB with Django, nor have I tested that this code actually works. :)

settings.py

DATABASES = {
    'default': {
        'ENGINE': 'django_mongodb_engine',
        'NAME': 'mydata',
        # ...
    },
    'geodata': {
        # For GeoDjango models, use 'django.contrib.gis.db.backends.postgis' instead.
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'geodata',
        # ...
    },
}

DATABASE_ROUTERS = ['path.to.ModelMetaRouter']

Models

Then add a custom Meta attribute to your geo models to override their database. Don't add this attribute to models that are supposed to go to the default database.

class SomeGeoModel(models.Model):
    ...
    class Meta:
        using = 'geodata'

Database router

Then write a database router that directs every model with the using Meta attribute set to the appropriate connection:

class ModelMetaRouter(object):
    def db_for_read(self, model, **hints):
        return getattr(model._meta, 'using', None)

    def db_for_write(self, model, **hints):
        return getattr(model._meta, 'using', None)

    def allow_relation(self, obj1, obj2, **hints):
        # only allow relations within a single database
        if getattr(obj1._meta, 'using', None) == getattr(obj2._meta, 'using', None):
            return True
        return None

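    def allow_syncdb(self, db, model):
        # Return False (not None) on a mismatch: None means "no opinion",
        # which would let the table be created in every database.
        return db == getattr(model._meta, 'using', 'default')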
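
For context, a router method that returns None is telling Django it has no opinion, so the next router (or a built-in fallback) decides. The following is a simplified, Django-free sketch of that lookup order, not Django's actual code; resolve_db, resolve_allow, NoOpinionRouter and GeoOnlyRouter are hypothetical names used only for illustration.

```python
DEFAULT_DB_ALIAS = 'default'


def resolve_db(routers, model, **hints):
    """Return the first non-empty alias suggested by a router, else the default."""
    for router in routers:
        method = getattr(router, 'db_for_read', None)
        if method is None:
            continue  # a router may omit methods it does not care about
        chosen = method(model, **hints)
        if chosen:
            return chosen
    return DEFAULT_DB_ALIAS


def resolve_allow(routers, db, model):
    """Return the first definite True/False answer; allow if nobody objects."""
    for router in routers:
        method = getattr(router, 'allow_syncdb', None)
        if method is None:
            continue
        allow = method(db, model)
        if allow is not None:
            return allow
    return True


class NoOpinionRouter(object):
    """Hypothetical router that always abstains."""
    def db_for_read(self, model, **hints):
        return None

    def allow_syncdb(self, db, model):
        return None


class GeoOnlyRouter(object):
    """Hypothetical router that pins everything to 'geodata'."""
    def db_for_read(self, model, **hints):
        return 'geodata'

    def allow_syncdb(self, db, model):
        return db == 'geodata'
```

In this sketch, a chain containing only NoOpinionRouter resolves reads to 'default' and allows syncing everywhere, which is why a router that wants to keep a table out of a database has to answer False rather than None.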


Answer 2:

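You can't put 'using' in the Meta class: Django rejects Meta attributes it doesn't recognize.

Here is a working solution.

Add this to models.py, before any model that uses the new attribute is defined: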

 import django.db.models.options as options
 options.DEFAULT_NAMES = options.DEFAULT_NAMES + ('in_db',)
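
As far as I understand it, this works because Django compares the attributes of a model's inner Meta class against options.DEFAULT_NAMES and raises a TypeError for anything left over. Below is a simplified, Django-free sketch of that kind of check; read_meta is a hypothetical helper and the name list is deliberately abbreviated, so treat it as an illustration rather than Django's real implementation.

```python
# Abbreviated stand-in for django.db.models.options.DEFAULT_NAMES (assumption).
DEFAULT_NAMES = ('verbose_name', 'db_table', 'ordering', 'abstract')


def read_meta(meta_cls, allowed_names):
    """Collect recognised Meta attributes and reject any unknown ones."""
    # Ignore dunder/private attributes such as __module__ and __doc__.
    attrs = {k: v for k, v in vars(meta_cls).items() if not k.startswith('_')}
    found = {}
    for name in allowed_names:
        if name in attrs:
            found[name] = attrs.pop(name)
    if attrs:
        raise TypeError("'class Meta' got invalid attribute(s): %s"
                        % ','.join(sorted(attrs)))
    return found


class Meta:
    db_table = 'places'
    in_db = 'geodata'   # custom attribute, unknown to the default list


try:
    read_meta(Meta, DEFAULT_NAMES)
except TypeError as exc:
    print('rejected:', exc)

# Extending the allowed names, as the snippet above does, makes it acceptable.
print(read_meta(Meta, DEFAULT_NAMES + ('in_db',)))
```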

Create a router.py in your app's folder:

myapp folder content:
   models.py
   router.py
   ...

Content of router.py:

class ModelMetaRouter(object):
    """Route each model to the database named by its 'in_db' Meta option."""

    def db_for_read(self, model, **hints):
        # Models that don't declare 'in_db' use the default database.
        return getattr(model._meta, 'in_db', None) or 'default'

    def db_for_write(self, model, **hints):
        return getattr(model._meta, 'in_db', None) or 'default'

    def allow_relation(self, obj1, obj2, **hints):
        # Only allow relations within a single database.
        if getattr(obj1._meta, 'in_db', None) == getattr(obj2._meta, 'in_db', None):
            return True
        return None

    def allow_syncdb(self, db, model):
        # Django 1.7+ renamed this hook to allow_migrate.
        # Return False (not None) on a mismatch so the table is not created
        # in the wrong database; None would mean "no opinion".
        return db == getattr(model._meta, 'in_db', 'default')

Reference the router in your settings:

   DATABASE_ROUTERS = ['myapp.router.ModelMetaRouter']
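
To sanity-check the routing without a database, the router can be exercised against stand-in model classes. This is only a sketch: the router's read, write and relation methods are repeated in condensed form so the snippet is self-contained, and make_model, Place and Article are hypothetical names rather than real Django models.

```python
from types import SimpleNamespace


class ModelMetaRouter(object):
    """Condensed copy of the routing methods from router.py."""

    def db_for_read(self, model, **hints):
        return getattr(model._meta, 'in_db', None) or 'default'

    def db_for_write(self, model, **hints):
        return getattr(model._meta, 'in_db', None) or 'default'

    def allow_relation(self, obj1, obj2, **hints):
        if getattr(obj1._meta, 'in_db', None) == getattr(obj2._meta, 'in_db', None):
            return True
        return None


def make_model(name, **meta):
    """Build a stand-in class carrying only the _meta attribute the router reads."""
    return type(name, (object,), {'_meta': SimpleNamespace(**meta)})


Place = make_model('Place', in_db='geodata')   # plays the role of a geo model
Article = make_model('Article')                # no in_db, so it stays on default

router = ModelMetaRouter()
print(router.db_for_read(Place))                   # geodata
print(router.db_for_write(Article))                # default
print(router.allow_relation(Place(), Article()))   # None (no opinion)
print(router.allow_relation(Place(), Place()))     # True
```

Note that a cross-database pair gets None rather than False here, so the final decision on such a relation is left to Django's own fallback.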


Answer 3:

I would take a look at the Disqus talk from DjangoCon 2010 about their scaling architecture. They run quite possibly the largest Django website, on top of Postgres. They present simple code snippets showing how to start both vertical and horizontal scaling using features built into Django.

My understanding is that they do use MongoDB for some of their analytics, though I don't think it's discussed in that talk.