I want to design an application for result computation.
First, I need to know how to store record in a MySQL database in such a way that students can have as many courses attached to them, e.g. student A can have 6 subjects attached to him, while student B can have 12 subject attached to him.
In this case, I need to know how I could possibly design a database structure that will allow a field to store as many subject as possible in form of an array.
Any suggestion or a better way to handle this will highly be appreciated.
For completeness sake, not in a matter that this is general recommended solution:
MySQL provides the the JSON datatype, which allows to store and retrieve objects and arrays in the JSON format.
This way, you can store entire objects and arrays into a field, as an array would just look like:
Especially beginners don't know this, and they reinvent the wheel by yet another comma-separated string implementation or using language-dependent serialization/deserialization approaches.
At least JSON is very commonly used and easily parsed as a data exchange format.
There are valid use cases for using storing arrays and objects inside a MySQL field, e.g. for speed optimization or when you have unknown or dynamic properties that you still want to save in a DB.
Yet as a rule of thumb, if you rely on storing objects and array into MySQL, then your database design is most likely broken.
Please read up on Data Normalization, General Indexing concepts, and Foreign Key constraints to keep data clean with referential integrity. This will get you going.
Storing data in arrays may seem natural to you on paper, but to the db engine the performance with mostly be without index use. Moreover, you will find on Day 2 that getting to and maintaining your data will be a nightmare.
The following should get you going with a good start as you tinker. Joins too.
Create Test Data
Some simple questions.
What course is in what department?
show all, uses table aliases (abbreviations) to make typing less, readability (sometimes) better
Who is taking the World of Chaucer course this term?
(knowing the courseId=5)
The below benefits from one of our composite indexes in SCJunction. A composite is an index on more than one column.
Kim Billings is enrolled in what this term?
Kim is overwhelmed, so drop drop the math class
run that above select statement showing what Kim is taking:
Ah, much easier term. Dad won't be happy though.
Note such things as SCJunction.term. Much can written about that, I will skip over it at the moment mostly, other than to say it should also be in an FK somewhere. You may want your term to look more like SPRING2015 and not an int.
And as far as id's go. This is the way I would do it. It is personal preference. It would require knowing id #'s, looking them up. Others could choose to have a courseId something like HIST101 and not 17. Those are highly more readable (but slower in the index (barely). So do what is best for you.
Note Composite Index
A Composite Index (INDEX means KEY, and vice-versa) is one that combines multiple columns for fast data retrieval. The orders are flipped for the two composites in the SCJunction table so that, depending on the universe of queries that go after your data, the db engine can choose which index to use for fastest retrieval based on the left-most column you are going after.
As for the unique key, #1, the comment next to it stating enforcing no duplicates (meaning junk data) is rather self-explanatory. For instance, student 1 course 1 term 1 cannot exist twice in that table.
A crucial concept to understand is the concept of
left-most
ordering of column names in an index.For queries that go after
studentId
only, then the key that hasstudentId
listed first (left-most
) is used. In queries that go aftercourseId
only, then the key that hascourseId
left-most is used. In queries that go after both studentId and courseId, the db engine can decide which composite key to use.When I say "go after", I mean in the
on clause
orwhere clause
condition.Were one not to have those two composite keys (with the column 1 and 2 in them flipped), then in queries where the column sought is not
left-most
indexed, you would not benefit with key usage, and suffer a slow tablescan for data to return.So, those two indexes combine the following 2 concepts
The Takeaway
The important takeaway is that Junction tables make for quick index retrieval, and sane management of data versus comma-delimited data (array mindset) crammed into a column, and all the misery of using such a construct.