My question is: In order to exist which of the following needs obligatory to have key and attribute?
- entity
- entity-type
- relationship-types
- relations (tuple with relationship-types)
I know my question is a little bit odd, but I am not quite sure what are the rules by building an ER-Diagram. In other words what should mandatory be present in my diagram and what not (e.g Attributes). Thanks in advance.
In the original Entity-Relationship method, we identfy entity types and relationship/association types and their relevant attributes. When drawing an ER diagram we pick one or more groups of attributes that can identify an instance of a type as a key. If there isn't such an identifying group of attributes then we have to introduce an attribute as a surrogate. A type's value set (the set of its instances) will be represented by a relation/table.
From Chen's 1976 The Entity-Relationship Model-Toward a Unified View of Data:
These are the only attributes we must have. (Surrogate attributes arise from this necessity.)
If you are not interested in any non-key attributes of an entity type or relationship type, then you're not going to have any other attributes for it in your design/diagram.
An ER diagram shows entity types and relationship types. It doesn't show entity instances or relationship instances.
An entity exists only if it has properties/attributes. You can not define entities without properties/attributes, even if they have primary key.
Entity types in partitioning relationships (represented by the triangle symbol) exist without primary key in a Conceptual Model. But they exist only if they possess attributes equally. Once decomposed the relationship, in the Physical Model, they will have the primary key of your super entity as a foreign key and also as a primary key.
1-1 and 1-N (or N-1) relationships do not have attributes/properties. On the other hand, N-N relations may or may not have properties/attributes. If you need to set attributes/properties to N-1 (or 1-N) relationships, they should be on the N side.
The Conceptual Model may have N-N relations, but the Logic Model does not. When decomposed, the Logical Model should end up with only 1-1 relationships or/and 1-N (or N-1) relationships.
Any questions, please comment and I will answer.
EDIT
The @reaanb user raised an interesting question. I think it is valid to add content as the answer itself. You (the author of the question) can even remove some doubts about what I said. So let's see...
When you decomposes 1-1 or 1-N relations that has attributes, while maintaining the attributes in the relationship (according to what has been mentioned in the comments), you will have to create another table that need to at least have a foreign key to one of the sides of the relationship.
In terms of functionality, keep attributes on the N side can save you the creation of this table (in case, you don't have to create this table). One less table within the final model is one less join operation, which can be costly if you need to do many times.
In terms of readability, 1-N relationships represent unilateral relations. Something that, in my view, always creates a dependency on a single side, not on both. Because of this, changes to the N side are simply irrelevant to the 1 side. In programming, development of programs and systems, at least, it follows the concept of decreasing the coupling between the parties, Which in this case, is a good thing (even for the development of databases).
In 1-N (or N-1) relationships that have attributes/properties, as I have said, the attributes are put on the N side, because/due to unilateral relationship created they are relevant only to the N side.
The only situation where this can complicate is for 1-1 relations. In this type of occasion, we have to consider two things:
1-1 relations may be confusing as they may (at first sight) represent unilateral relationships. But it is rule that in every way we need to define a foreign key, right? We need to create a relationship after all. For so cases, it is the discretion of the Data Modeler. So we generally put the foreign key on the side that can eventually become N. But this is not mandatory. Again, you must have discretion.
Now, and when 1-1 relations have attributes? That's a lot, if not impossible (I kindly ask an example of context if anyone disagree with a plausible explanation of a real case, and if it exists, be sure that I will take back what I said). If it happens it's good to review your diagram. Ask yourself immediately:
Attributes in relationships are great indications that we have N-N or 1-N (or N-1) relationships happening, or that you are putting attributes in the wrong place. In this case simply just follow the aforementioned rules here.
How about a quick example to show practice (about 1-1 relationships)? Let's imagine the following situation (context):
This would give us the following Conceptual Model:
So far so good, right? Now let's add one more detail:
Where will you add this information? Do not see the answer yet. Stop to think a little. Stop to think, really. Take a look at the context. This information is relevant to whom? To both sides? Let us develop a line of thought together...
Let's suppose we were to add a multi-valued attribute named "hours of work", right? This attribute when decomposed will become a new entity with at least two new attributes/fields: the days and the hours an employee worked in an office.
Note: Ideally, we transform this property into one entity directly, but for the sake of understanding, we will leave it as an attribute.
This information needs to be persisted, so Where should we put this? In the relationship? In Employees entity? In Offices entity? Where?
If we use the relationship, then we are saying that when we do the decomposition of the Conceptual Model, the relationship will become a new table. This table will have at least three fields:
The first two fields are obvious. The last field will represent a foreign key to a new table that arose from the decomposition of a multi-valued attribute.
This relationship can accurately show us who, in which office, worked in which days for how long. But here are the caveats:
Let us dismiss this idea for now to examine another case. Let's say we did not put the attribute in the relationship, and we still have to put it somewhere. What would happen if we put the attribute in Offices entity? Does it makes sense to store this information here? What if in the future an office can have several employees working at it? Is this the best place to store this information?
If this should happen, keep this information here will not help. This attribute keeps the workload of just one employee. With several employees working in an office, you would not have, even with a SQL join, how to say who worked how long in what office. You would have anyway to create a relationship between the three: employees, offices, and working hours, and again we would go back to the relationship made (1-1 with attributes till it becomes a new table).
If in the future, an office can have several employees working on, maybe we should put this attribute/property in Employees entity (the N side), right? Take a look:
See how beautiful it is. When all this get decomposed, we will have a workload relationship for each employee, and we will know which employee works in which office. All this is done with a simple SQL join. In any way we define here an attribute in a relationship (Remember that was supposed during the 1-1 relationship that we would put the attribute on the side that could become N in the future).
There's no one more new table, and there is no longer one more table participating in a SQL join. There is no way you accidentally assign more than one office to an employee. You don't need to create more rules (constraints) for that. This brings us to a very important issue in the development/modeling databases:
To a shapely, well prepared, and well managed database it's virtually impossible for client applications to leave it in a inconsistent state, which could completely compromise the entire system (especially for safety or theft of sensitive data). Client applications can validate data (What is costly in terms of maintenance, if not done properly), but the database is the last line of defense for this (I learned that here in the community).
Anyway, if we never have to change the cardinality between the entities, we need to have discretion. In 1-1 relation created earlier, I/me, as a professional, would have consulted my client and ask how he/she thinks his/her company will grow in the future.
As a final note (as if I had not already talked a lot... lol), I want to say that the notations and the knowledge that I used here about Data Modeling differs somewhat from the way many people perceive in many countries.
When I speak, for example, about 1-N (or N-1) relationships, I am not considering that the 1 side may have cardinality zero. Many people use other notations, such as 0...N, or (0,3), and so on. These notations are not nonsense, they are official and are used to model data in the same way.
I hope I have spent a bit of my experience to someone, and I'm happy with it. We are all professionals, but we are all colleagues and we are here for the same purpose: exchange knowledge. If anyone has any questions, please comment and I will answer. Thanks for listening.
EDIT 2
More interesting questions were raised in the comments. I have to agree that in my first post, the way I put things may seem to some people senseless. But they are not. What I put are rules that I learned during my years of course.
The way I learned here I my country is that within the ER-Diagram we have 3 stages of development for modeling data. These stages are made official by its original author, Peter Chen. Still, it was been seen over the years the need for the existence of a fourth model, known by the name Descriptive Model. This model reveals the problem/context by saying what the situation/case of the real world that needs to be addressed. In the case, what your customer/client needs.
It's need for existence appeared as often more of a professional can work or start working on a project, and he/she must know what he/she is working, not being enough, often, to only look at the final diagram. In summary, this template serves as a starting point, both for the development, as to the understanding of a problem.
Once you have this model in hand, which describes all the customer's needs, you remain faithful to it, and proceeds to the next stage of development, thus starting the Conceptual Model.
For the conceptual model, many professionals (as I have seen) use different techniques and devices for their development. During my studies, I learned from my teachers that the transformation from one situation/problem/context to the Conceptual Model we must follow rules. These are rules which are mentioned in my original post.
For we do not begin to guess, or to draw nonsense.
An entity is identified when we have something in the real world that has attributes/properties, and we have to persist these attributes. Simple entities are generally seen by nouns. But not in every case we have entities with noun names.
Because we can often identify entities this way that do not have attributes, but when we decompose this, we end up with tables without columns, without data being persisted, which often have only foreign primary key.
Note that, often this can be used to create subtypes of things, for example, individuals, legal entities, vehicles, cars, motorcycles, etc. But that, depending on the situation, can be harmful, because the addition of new types implies a change in the model, in the database, the rules of the bank, in the systems that use it, and so on.
This was the reason I mentioned about partitioning entities, as this is a technique for creating types and subtypes of things.
Yes. The model is yours, and it is in your perspective if you should or should not end up with another table.
Imagine that we have a car rental company. Our customers are registered in the system, along with the vehicles. Every client/customer has name, contact, address, etc. Every vehicle has a model, card/plaque/plate (I really don't know the translation of this to you here), year, etc.
We can easily find two entities here:
Imagine that we have a supermarket. Similarly, we have registered customers in this store. However, our supermarket sells to individuals and legal entities.
If you could not identify types here properly, you may end up with problems.
If we have only to persist the type of client we sell, this type end up as an attribute, not an entity. If the types of customers/clients have attributes which some are different, and some are equal, then we have a partitioning. We will have a super "Client" entity, and two dependent entities:
That depends on the context. You need to see:
If the answer is yes to both questions, so be sure that we have a relationship going on here, and we need to represent it in the diagram.
Again, following the example of the car rental company, imagine you have to keep/save/persist/register/remember the rent event made of a customer with a vehicle.
If we have two entities already defined our context now it says that there is an event happening between them. The mentioned event is the rent. Take a look, this is a relationship, and we have to persist in such a way.
This will depend entirely on the context. Instead of the rent event, do you need to persist who licked the car? So this is an event, an relationship (strange, but it is). Note, it is the context who dictates the relationship.
Initially, attributes belong to entities. The attributes represent the characteristics of the entities that need to be persisted. You should not put attributes:
Because attributes can exist in relationships as well, but when they do, they are in N-N relationships.
Again, this is a rule, and this has been explained in my first edition.
Yes.
Yes. All of this is taken into account. Just the fact that you follow these types of rules already makes the database designed to not create and enable/allow structures or failed relationships.
In short, you create only what you need, without allowing for human error, to put something in the wrong way/place.
All this was explained above, in all that has been said. I will not repeat myself here. You just have to read.
In case you need something lighter, ask, show me where you have doubt and I'll leave it clearer.
As I said, I think in my first original response I was not very clear, as commented. This edition should: