Mapping database structure from SQL Server to Dyna

2019-08-27 16:05发布

问题:

I am thinking about using a NoSQL database to scale database reads. Please see the relational database structure below:

CREATE TABLE Person(
      ID uniqueidentifier not null, 
      Name varchar(100), 
      DateOfBirth datetime)

CREATE TABLE Sport (
      ID uniqueidentifier not null, 
      Description varchar(50)) -- e.g. Football; Tennis; Badminton etc

CREATE TABLE PersonPlaysSport (
      PersonID uniqueidentifier FOREIGN KEY REFERENCES Person(ID), 
      SportID uniqueidentifier FOREIGN KEY REFERENCE Sport (ID), 
      primary key (PersonID, SportID)

In the example above a Person Plays many Sports. In my real application; I have many-to-many relationships like this that do not perform well.

How would these be stored in a NoSQL document database (DynamoDB)?

回答1:

Disclaimer - I'm not familiar with DynamoDb, but have used several other NoSql databases

The common approach is to choose the most important subject entity as the root of the document (in your case, I would say this is Person)

A document is then created for each person, and will include the "person centric" view of all associated entities (i.e. linked sports):

Joe (Person, Keyed on a natural, or surrogate id).
+ Fields of Joe (Date of Birth, etc)
+ SportsPlayed: (Collection)
--> Golf (Sport)
--> Tennis (Sport)

If it becomes important to view the relationship from a Sport centric approach (e.g. you need to know which persons are 'subscribed' to which Sport):

  • You could attempt a secondary index on Person.Sport, if the NoSql database allows this. This would allow for queries like "Who plays Golf?", although this approach is often frowned upon in NoSql terms.

  • Alternatively, and preferably, create a second collection of documents, this time keyed by Sport:

Golf (Sport)
- Joe
- Jim
...

etc. Obviously there's extra work to be done in keeping both sets of documents up to date when a change is made to a Person, a Sport, or the relationship between them, however the benefit is high performance on the read side - only a single document needs to be retrieved to pull the entire entity graph - In SQL terms, this would have required a Query joining 3 distinct tables.