I'm concern about performance, engineering and readability. Let's say I have a blog, and every post has its status: published (4), pending review (2), draft (1). What is the recommended to store these information in the status
column?
status <======= storing status as string
========
pending
published
draft
status <======= storing status as integer
========
2
4
1
Also, if we should store integer, should we refrain from storing running integer: 1, 2, 3, 4, 5
, as opposed to storing a ^2 integer: 2, 4, 8, 16, 32
?
Many thanks.
I think the option you choose should depend on how well the tools/frameworks you use work with each feature.
Many database/ORMs deal poorly with enums, requiring custom code (don't understand the concept of "enumerated type").
That said... probably I'd use strings.
Strings:
Strings are also the choice of some well known CMSs (e.g. Drupal 7).
Of course this is a late answer but it could be useful to other readers.
Storing data in the integer form is always more reliable than the character or string.
Create two tables such as blog_status and blog_details
In the blog_status maintain the master status of blog like you said draft, pending and publish Table structure of blog_status
And then create another table where you want to use the blog_status in this way, you can always improve reuse able and performance of your application
There is no point of use the x^2 expression or formula. I hope, I have clear your doubt . If you find the answer helpful please mark it as your answer else let me know...
Storing as a string:
Ideally, you should use an enum type for this sort of thing, if your database supports it.
The database theorist in me thinks that you shouldn't use lookup tables for single column attributes because it leads to unnecessary splitting of your data; in other words, you don't need to have a table with two columns (and ID value and an attribute name). However, the DBA in me thinks that for performance reasons, splitting your data is a very valid technique. Indexing, disk footprints, and updates become very easy when using lookups.
I'd probably split it.
I think your best bet for faster performance, less storage space, and readability is to use CHAR(1)--(p)ublished, pending (r)eview, and (d)raft. You can validate that data with either a CHECK constraint or a foreign key reference.
CHAR(1) takes substantially less space than an integer. It's directly readable by humans, so it doesn't need a join to understand it. Since it's both smaller and immediately readable, you'll get faster retrieval than a join on an integer even on a table of tens of millions of rows.