Hope you all had a happy new year.
So, my question is: what's the best way to make a log of actions? Let me explain it with an example. Suppose we have these entities:
User
Friend
(User is a friend of another User, many to many relationship)
Message
(A user can message another user)
Group
(A user can be in various groups)
Game
(A game can be played by various players and has some info, like the date of the game. This results in two tables, games and games_users; the latter stores the relationship between a user and a game)
Now, I wanted to make a log, for example:
User A (link to user) made a new friend, User B (link to user)
User A (link to user), B (link to user) and C (link to user) played a game (link to game)
User C (link to user) joined a group D (link to group)
So, I wanted to make a flexible log, that could store as many references as I wanted and references to different entities (user and game for example).
I know two ways of doing this, but both have one or more problems:
1. When logging an action, I directly store the plain text I want (i.e. only one char field, which would store 'User C joined a group'). The problem is that this text needs to be translated to other languages, and I cannot have a field for each language.
2. Having a main table log, in which each row represents a log action and carries a code so I know which action it is (e.g. a user joined a group, x users played a game). I then have another table for each of the foreign key types needed, so I'd have log_user, log_group and log_game. For example, log_user has one field referencing log and another referencing user. This way I can have multiple users for the same log action. Problems: it is rather complex and could result in substantial overhead, since depending on the log action I'd have to query multiple tables. Is this correct? Would it be too CPU-intensive?
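To make option 2 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names (action_code, log_id, user_id, group_id, game_id) and the sample IDs are assumptions for illustration only; the point is that reading one log row means one lookup per link table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE log       (id INTEGER PRIMARY KEY, action_code TEXT NOT NULL);
CREATE TABLE log_user  (log_id INTEGER REFERENCES log(id), user_id  INTEGER);
CREATE TABLE log_group (log_id INTEGER REFERENCES log(id), group_id INTEGER);
CREATE TABLE log_game  (log_id INTEGER REFERENCES log(id), game_id  INTEGER);
""")

# Hypothetical event: users 1, 2 and 3 played game 7.
conn.execute("INSERT INTO log (id, action_code) VALUES (1, 'played_game')")
conn.executemany("INSERT INTO log_user VALUES (1, ?)", [(1,), (2,), (3,)])
conn.execute("INSERT INTO log_game VALUES (1, 7)")

LINK_TABLES = (("log_user", "user_id"), ("log_group", "group_id"), ("log_game", "game_id"))

def references(log_id):
    """Collect every reference of one log row: one query per link table."""
    refs = {}
    for table, column in LINK_TABLES:
        rows = conn.execute(
            f"SELECT {column} FROM {table} WHERE log_id = ? ORDER BY {column}",
            (log_id,),
        )
        refs[table] = [row[0] for row in rows]
    return refs

played = references(1)
```

Each new entity type adds another link table and another lookup, which is the complexity described above.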
So, I'm open to new ideas and brainstorming. What's the best approach for this kind of problem? Thanks in advance, I hope I have explained it in a clear way. If there is any question, please ask.
Edit: I decided to start a bounty as I'm not really happy with the answers I have received. Will make any clarifications if needed. Thanks
I want something very similar to facebook/orkut/social networks "friend updates". This will be displayed to users.
The following is how I would do it. I have some more comments at the bottom after you have seen the schema.
Log
LogID - unique log ID
Time - date/time of event
LogType - String or ID
(Side comment: I would go with an ID here so you can use the message table shown below, but if you want quick n dirty you can just use a unique string for each log type, e.g. "Game Started", "Message Sent", etc.)
LogActor
LogID - external key
LogActorType - String or ID (as above, if ID you will need a lookup table)
LogActorID - This is a unique id to the table for the type eg User, Group, Game
Sequence - this is an ordering of the actors.
LogMessage
LogType - external key
Message - long string (varchar(max)?)
Language - string(5) so you can key off different language eg "US-en"
Example Data (using your 3 examples)
Log
LogActor
LogMessage
User
Game
Group
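As a rough illustration of how these tables could fit together, here is a sketch using Python's built-in sqlite3 module (not SQL Server). Everything beyond the table and column names listed above is an assumption: the sample rows, the timestamps, the Name columns, the `{0}`/`{others}` placeholder convention in Message, and the helper names `join_names` and `render`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User    (UserID  INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Game    (GameID  INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE "Group" (GroupID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Log        (LogID INTEGER PRIMARY KEY, Time TEXT, LogType INTEGER);
CREATE TABLE LogActor   (LogID INTEGER, LogActorType TEXT, LogActorID INTEGER, Sequence INTEGER);
CREATE TABLE LogMessage (LogType INTEGER, Message TEXT, Language TEXT);

-- Hypothetical sample data for the three examples in the question.
INSERT INTO User    VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO Game    VALUES (1, 'Chess');
INSERT INTO "Group" VALUES (1, 'D');
INSERT INTO Log VALUES (1, '2010-01-02 10:00', 1),
                       (2, '2010-01-02 11:00', 2),
                       (3, '2010-01-02 12:00', 3);
INSERT INTO LogActor VALUES
    (1, 'User', 1, 0), (1, 'User', 2, 1),
    (2, 'Game', 1, 0), (2, 'User', 1, 1), (2, 'User', 2, 2), (2, 'User', 3, 3),
    (3, 'User', 3, 0), (3, 'Group', 1, 1);
INSERT INTO LogMessage VALUES
    (1, '{0} made a new friend, {1}',   'US-en'),
    (2, '{others} played a game, {0}',  'US-en'),
    (3, '{0} joined a group, {1}',      'US-en');
""")

# One name lookup per actor type; LogActorType selects which table to read.
NAME_QUERIES = {
    "User":  "SELECT Name FROM User WHERE UserID = ?",
    "Game":  "SELECT Name FROM Game WHERE GameID = ?",
    "Group": 'SELECT Name FROM "Group" WHERE GroupID = ?',
}

def join_names(names):
    """['A', 'B', 'C'] -> 'A, B and C'."""
    if len(names) < 2:
        return "".join(names)
    return ", ".join(names[:-1]) + " and " + names[-1]

def render(log_id, language="US-en"):
    """Fill the message template for one log row with its actors, in Sequence order."""
    (template,) = conn.execute(
        "SELECT m.Message FROM Log l JOIN LogMessage m ON m.LogType = l.LogType "
        "WHERE l.LogID = ? AND m.Language = ?",
        (log_id, language),
    ).fetchone()
    actors = conn.execute(
        "SELECT LogActorType, LogActorID FROM LogActor WHERE LogID = ? ORDER BY Sequence",
        (log_id,),
    ).fetchall()
    names = [conn.execute(NAME_QUERIES[kind], (ident,)).fetchone()[0] for kind, ident in actors]
    # {0}, {1}, ... address actors by Sequence; {others} is every actor after the first.
    return template.format(*names, others=join_names(names[1:]))

messages = [render(log_id) for log_id in (1, 2, 3)]
```

Supporting another language then only means adding rows to LogMessage; the Log and LogActor rows stay untouched.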
So here are the nice things about this design.
It is very easy to extend
It handles multi-language issues independent of the actors
It is self documenting, the LogMessage table explains exactly what the data you are storing should say.
Some bad things about it.
You have to do some complicated processing to read the messages.
You can't just look at the DB and see what has happened.
In my experience the good parts of this kind of a design outweigh the bad bits. What I have done to allow me to do a quick n dirty look at the log is make a view (which I don't use for the application code) that I can look at when I need to see what is going on via the back end.
Let me know if you have questions.
Update - Some example queries
All of my examples are for SQL Server 2005+; let me know if there is a different version you want me to target.
View the LogActor table (There are a number of ways to do this, the best depends on many things including data distribution, use cases, etc) Here are two:
a)
b)
In general I think a) is better than b). For example, if you are missing an actor type, a) will include it (with a null name). However, b) is easier to maintain, because the UNION ALL statements make it more modular. There are other ways to do this (e.g. CTEs, views, etc.). I'm inclined to do it like b), and from what I've seen that seems to be at least standard practice, if not best practice.
So, the last 10 items in the log would look something like this:
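The trade-off between the two styles can be sketched as follows. This is an assumption about their general shape, not the original queries: a) is taken to be a chain of LEFT JOINs, b) a UNION ALL of one inner join per actor type. It runs on SQLite through Python's sqlite3 module, and the 'Team' actor type is a made-up example of a type with no lookup table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (UserID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Game (GameID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE LogActor (LogID INTEGER, LogActorType TEXT, LogActorID INTEGER, Sequence INTEGER);

INSERT INTO User VALUES (1, 'A'), (2, 'B');
INSERT INTO Game VALUES (1, 'Chess');
INSERT INTO LogActor VALUES
    (1, 'User', 1, 0), (1, 'User', 2, 1),
    (2, 'Game', 1, 0),
    (2, 'Team', 9, 1);   -- hypothetical actor type that has no lookup table yet
""")

# a) One LEFT JOIN per actor type: unknown types survive with a NULL name.
QUERY_A = """
SELECT la.LogID, la.Sequence, la.LogActorType, COALESCE(u.Name, g.Name) AS Name
FROM LogActor la
LEFT JOIN User u ON la.LogActorType = 'User' AND la.LogActorID = u.UserID
LEFT JOIN Game g ON la.LogActorType = 'Game' AND la.LogActorID = g.GameID
ORDER BY la.LogID, la.Sequence
"""

# b) One self-contained SELECT per actor type: unknown types silently drop out.
QUERY_B = """
SELECT la.LogID, la.Sequence, la.LogActorType, u.Name
FROM LogActor la JOIN User u ON la.LogActorID = u.UserID
WHERE la.LogActorType = 'User'
UNION ALL
SELECT la.LogID, la.Sequence, la.LogActorType, g.Name
FROM LogActor la JOIN Game g ON la.LogActorID = g.GameID
WHERE la.LogActorType = 'Game'
ORDER BY 1, 2
"""

rows_a = conn.execute(QUERY_A).fetchall()
rows_b = conn.execute(QUERY_B).fetchall()
```

Adding a new actor type to b) means appending one more SELECT block, while a) needs both a new join and a change to the COALESCE list.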
NB - As you can see, it is easier to select all log items from a date than the last X, because we need a (probably very fast) sub-query for this.
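A sketch of that difference, again on SQLite through Python's sqlite3 module rather than SQL Server (where TOP would take the place of LIMIT). The 15 sample log rows and their dates are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Log      (LogID INTEGER PRIMARY KEY, Time TEXT, LogType INTEGER);
CREATE TABLE LogActor (LogID INTEGER, LogActorType TEXT, LogActorID INTEGER, Sequence INTEGER);
""")

# Hypothetical data: 15 log rows, one per day, each with two user actors.
for log_id in range(1, 16):
    conn.execute("INSERT INTO Log VALUES (?, ?, 1)", (log_id, f"2010-01-{log_id:02d}"))
    conn.executemany(
        "INSERT INTO LogActor VALUES (?, 'User', ?, ?)",
        [(log_id, 1, 0), (log_id, 2, 1)],
    )

# Last 10 items: the sub-query picks the log rows first, so the limit counts
# log items rather than joined actor rows.
last_ten = conn.execute("""
SELECT l.LogID, l.Time, la.LogActorType, la.LogActorID
FROM (SELECT LogID, Time FROM Log ORDER BY Time DESC, LogID DESC LIMIT 10) AS l
JOIN LogActor la ON la.LogID = l.LogID
ORDER BY l.Time DESC, l.LogID DESC, la.Sequence
""").fetchall()

# Everything since a date: a plain filter on the join, no sub-query needed.
since = conn.execute("""
SELECT l.LogID, l.Time, la.LogActorType, la.LogActorID
FROM Log l
JOIN LogActor la ON la.LogID = l.LogID
WHERE l.Time >= ?
ORDER BY l.Time DESC, l.LogID DESC, la.Sequence
""", ("2010-01-14",)).fetchall()
```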
Do you need this for logging/tracking purposes, or for display to users and admins? If you use it for logging/tracking (i.e. it must be computer readable), you should probably separate your logging into multiple tables as you specified.
However, if you want this for your users or display on screen, why not just store it in basic html? This way you can easily display it on screen and view.
For example, "User A (link to user), B (link to user) and C (link to user) played a game (link to game)" would be
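A minimal sketch of building such a stored entry, assuming hypothetical URL patterns `/user/<id>` and `/game/<id>` and made-up helper names `link` and `game_entry`:

```python
from html import escape

def link(kind, ident, label):
    """Render one entity as an anchor; the /kind/id URL scheme is an assumption."""
    return f'<a href="/{kind}/{ident}">{escape(label)}</a>'

def game_entry(players, game_id):
    """Build the HTML for 'User A, B and C played a game' from (id, name) pairs."""
    names = [link("user", uid, name) for uid, name in players]
    joined = ", ".join(names[:-1]) + " and " + names[-1] if len(names) > 1 else names[0]
    return f"User {joined} played a {link('game', game_id, 'game')}"

entry = game_entry([(1, "A"), (2, "B"), (3, "C")], 43)
```

The resulting string can be stored and displayed as-is, but the wording is fixed in one language at write time, which is the translation problem raised in the question.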
Suggestion:
(In the above table, "Message", "Sender", "John", "game43", etc. would not be text but foreign keys into the action, role, or entity table. I've written the keys for "Action" but not for "Role" or "Entity"; they would be keys as well.)
Now, instead of text action, Role, Entity you might have keys in there, and store them in a separate table. This can be used for output, e.g.
Note that in the entity table, two or more entries might have the same text representation, as there could be more than one user named Sam. If you want to represent different, orthogonal information about each entity, you can include the EntityKey in the corresponding table, e.g.
Basically we are mapping a predicate (in the first-order predicate logic sense, a set of tuples) to binary relation form (by creating an artificial entity, the action key). Thus, the Entity table basically contains various columns/arguments of the relation, and so it can be practically anything which could be an argument of the relation. The advantage of this representation is it is infinitely extendible to new relations that you may wish to log without changing the schema. Everything in the ActionTable should be a key or foreign key, I just haven't put that in there because it may be harder to read.
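One way to picture this layout is the sketch below, using Python's sqlite3 module. Only ActionTable, EntityKey, ActionKey and the sample values "Message", "Sender", "John" and "game43" come from the text above; the lookup table names (ActionType, Role, Entity), the Label column, the remaining rows and the `describe` helper are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ActionType (ActionTypeKey INTEGER PRIMARY KEY, Label TEXT);
CREATE TABLE Role       (RoleKey       INTEGER PRIMARY KEY, Label TEXT);
CREATE TABLE Entity     (EntityKey     INTEGER PRIMARY KEY, Label TEXT);
CREATE TABLE ActionTable (
    ActionKey     INTEGER,  -- artificial key: all rows of one logged action share it
    ActionTypeKey INTEGER REFERENCES ActionType(ActionTypeKey),
    RoleKey       INTEGER REFERENCES Role(RoleKey),
    EntityKey     INTEGER REFERENCES Entity(EntityKey)
);

INSERT INTO ActionType VALUES (1, 'Message'), (2, 'PlayedGame');
INSERT INTO Role       VALUES (1, 'Sender'), (2, 'Receiver'), (3, 'Player'), (4, 'Game');
-- Two different users may share the label 'Sam'; their keys keep them apart.
INSERT INTO Entity     VALUES (1, 'John'), (2, 'Sam'), (3, 'Sam'), (4, 'game43');

INSERT INTO ActionTable VALUES
    (1, 1, 1, 1), (1, 1, 2, 2),                  -- John sent a message to Sam (key 2)
    (2, 2, 3, 1), (2, 2, 3, 3), (2, 2, 4, 4);    -- John and Sam (key 3) played game43
""")

def describe(action_key):
    """Return (action, role, entity) labels for every argument of one logged action."""
    return conn.execute("""
        SELECT a.Label, r.Label, e.Label
        FROM ActionTable t
        JOIN ActionType a ON a.ActionTypeKey = t.ActionTypeKey
        JOIN Role r       ON r.RoleKey       = t.RoleKey
        JOIN Entity e     ON e.EntityKey     = t.EntityKey
        WHERE t.ActionKey = ?
        ORDER BY t.RoleKey, t.EntityKey
    """, (action_key,)).fetchall()

message_action = describe(1)
game_action = describe(2)
```

Logging a new kind of action only requires new rows in ActionType and Role, not a schema change.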
My answer to "What's a better strategy for storing log data in a database?":
Edit: To implement it cleanly with referential integrity and keep all the flexibility, I suggest having a duplicate audit trail table covering all CRUD operations for each table, even if it's "heavy." The business rules are more volatile than the data structures anyway, so by keeping the log logic in code/queries you retain the flexibility. For example, suppose you decided not to track when users leave a group. Later, the clients tell you it's very important to track that information. All you have to do now is change the query so that the deletion of a user_group record is part of the result.
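A small sketch of that idea, using SQLite triggers through Python's sqlite3 module. The audit table name user_group_audit, its columns, the 'I'/'D' operation codes and the `feed` helper are all assumptions; only user_group comes from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_group (user_id INTEGER, group_id INTEGER);

-- Hypothetical audit trail: one row per insert or delete on user_group.
CREATE TABLE user_group_audit (
    audit_id INTEGER PRIMARY KEY AUTOINCREMENT,
    op       TEXT,      -- 'I' = joined, 'D' = left
    user_id  INTEGER,
    group_id INTEGER
);

CREATE TRIGGER user_group_ins AFTER INSERT ON user_group
BEGIN
    INSERT INTO user_group_audit (op, user_id, group_id)
    VALUES ('I', NEW.user_id, NEW.group_id);
END;

CREATE TRIGGER user_group_del AFTER DELETE ON user_group
BEGIN
    INSERT INTO user_group_audit (op, user_id, group_id)
    VALUES ('D', OLD.user_id, OLD.group_id);
END;
""")

conn.execute("INSERT INTO user_group VALUES (1, 10)")
conn.execute("INSERT INTO user_group VALUES (2, 10)")
conn.execute("DELETE FROM user_group WHERE user_id = 1 AND group_id = 10")

def feed(include_leaves):
    """The business rule lives in the query: which operations appear in the log."""
    ops = ("I", "D") if include_leaves else ("I",)
    placeholders = ", ".join("?" * len(ops))
    return conn.execute(
        f"SELECT op, user_id, group_id FROM user_group_audit "
        f"WHERE op IN ({placeholders}) ORDER BY audit_id",
        ops,
    ).fetchall()

joins_only = feed(include_leaves=False)
joins_and_leaves = feed(include_leaves=True)
```

Because the deletions were recorded all along, switching from `joins_only` to `joins_and_leaves` also surfaces events that happened before the rule changed.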