Hi I am newbie to mongodb.I am using java.
I have 4 tables Tenant,system,authorization in my relational table.
Something like this.
Table Fields
Tenant Tenant_ID(PK), Tenant_INFO
System System_ID(PK), System_Info
Authorization System_ID, Autho_Info.
System_prop System_ID, Prop_Info, Tenant_ID
In System_prop table, Tenant_ID refers the Tenant Table Tenant_ID (PK), System_ID refers the System Table System_ID.
In Authorization table, System_ID refers System tabel System_ID
I am switching my database from relational to mongodb. First thing I need to do is Schema design.
Query I need to do is:
SELECT A.Prop_Info, A.System_ID From System_prop A, SYSTEM B, TENANT C Where A.System_ID = B.System_ID AND A.Tenant_ID = C.Tenant_ID
SELECT A.System_ID, A.Prop_Info FROM Authoization A, SYSTEM B WHERE A.System_ID = B.System_ID
Can anyone help me how to design these tables as collections in mongodb?
Do i need to embed r use dbref? Help me to design the schema for this.
Your challenge comes from the fact that Prop_Info
must be retrieved by both queries. This makes it difficult to figure out which Mongo collection it should live in.
In MongoDB, you create your document schema with the ideal goal for a single document to have all the information you need given your query patterns. In the case where you need to have the same data D
(such as Prop_Info
in your case) returned by two separate queries against two separate collections A
and B
, you have to choose between the following three strategies:
Duplicate D
in the documents of both A
and B
, and enforce consistency with your code. This is typically the design choice of high-performance systems that want to eliminate the need for a second query even if that comes at the cost of additional code complexity on the insert/update side and with some potential consistency problems since Mongo is not ACID.
Put D
in A
and store a reference (DBRef or some other combination of identifying fields) in B
so that you can get to it with a second query. This is typically the design choice when the number of queries to A
exceeds the number of queries to B
. It keeps D
local to the more frequently queried collection. In this schema design pattern you only need to make a second query when you query B
.
Put D
in a new collection C
and make a second query to it from both A
and B
. This is typically the design choice in the face of very uncertain future requirements where it is not clear what the trade-offs would be if you go with (1) or (2) above. It is the most "relational-like" schema and the one that will force you to make a second query when you query both A
and B
.
Which strategy you choose depends on your domain, the query patterns, the support you get from your object-relational mapping (ORM) framework (if you use one), and last but not least, your preference.
In the situations I've encountered, I've never chosen (3). I've used (1) in high-performance situations (analytics systems). I've used (2) everywhere else since the query access patterns have made it obvious where the "shared" data should live.
Once you pick your strategy, if you still need assistance, post another SO question that specifically focuses on the schema design problem given the chosen strategy.
Three final tips:
If the shared data D
has a relationship multiplicity greater than 1 use an array. You can index entire arrays and you can query precisely inside arrays using $elemMatch
.
To update D
in strategy (1) or (2) use MongoDB's atomic modifier operations, many of which are designed to operate on arrays.
This SO question covers the DBRef two query pattern in @Stennie's answer. (@Stennie works for 10gen, the markers of MongoDB.)
Good luck!
You may need just one collection with all the documents, of course you will end up by having too many repeated field but this is the trick to scale well.
For the Relations the type one to many and one to one you will just remove the identifier and put the rest of the attributes because MongoDB will take care of the primary key. ('I love MongoDB for that').
For the many to many relationship that you have between Tenant and System you will have to change it to an array in MongoDB data structure.
coll{
Tenant : 'value',
tenant_info : 'value',
Sys_info: 'value' ,
auth_info: 'value' ,
Prop_info : array [ 'value','value',''value....]
}
You are still thinking in relational databases. MongoDB, however, is a document-oriented database.
- artificial ID numbers are usually not needed, because every document automatically has a _id field, which is a GUID (as good as guaranteed to be globally unique).
- relation tables should not be used in MongoDB. n-type relations are made with arrays fields instead. So when 1 system has N authorizations it uses, your system document should have a field "authorization" which is an array of the object IDs of the authorizations it has. Yes, that would be a horrible violation of the normalization rules of relational databases. But you don't have a relational database here. In MongoDB it is practical to represent N-relations with arrays, because arrays are transparent to the query language.