Difference between storing integer or string in da

2020-07-10 05:27发布

I'm concern about performance, engineering and readability. Let's say I have a blog, and every post has its status: published (4), pending review (2), draft (1). What is the recommended to store these information in the status column?

status        <======= storing status as string
========
pending
published
draft

status        <======= storing status as integer
========
2
4
1

Also, if we should store integer, should we refrain from storing running integer: 1, 2, 3, 4, 5, as opposed to storing a ^2 integer: 2, 4, 8, 16, 32?

Many thanks.

5条回答
Deceive 欺骗
2楼-- · 2020-07-10 05:56

I think the option you choose should depend on how well the tools/frameworks you use work with each feature.

Many database/ORMs deal poorly with enums, requiring custom code (don't understand the concept of "enumerated type").

That said... probably I'd use strings.

Strings:

  • use more space but in your case names are short and you can easily read a data-dump without the enum-table legend. Nowadays, for a blog / CMS, storage is hardly a issue
  • performance differences are usually small
  • you cannot easily rearrange the members of enum-tables (you've to force the "original" integer values).

Strings are also the choice of some well known CMSs (e.g. Drupal 7).


Of course this is a late answer but it could be useful to other readers.

查看更多
淡お忘
3楼-- · 2020-07-10 06:03

Storing data in the integer form is always more reliable than the character or string.

Create two tables such as blog_status and blog_details

In the blog_status maintain the master status of blog like you said draft, pending and publish Table structure of blog_status

Create table blog_status
(
blogstatus_id int,
blogstatus_desc varchar(10),
primary key(blogstatus_id)
)

And then create another table where you want to use the blog_status in this way, you can always improve reuse able and performance of your application

Create table blog_details
(
  blog_id int,
  blog_title varchar(10),
  blog_postingdate datetime,
  blog_postbox varchar(max),
  blog_status int, ---------------------> This should be your blogstatus_id value
  primary key(blog_id)
)

There is no point of use the x^2 expression or formula. I hope, I have clear your doubt . If you find the answer helpful please mark it as your answer else let me know...

查看更多
Luminary・发光体
4楼-- · 2020-07-10 06:06

Storing as a string:

  • wastes space
  • takes longer to read/write
  • is more difficult to index/search
  • makes it more difficult to guarantee validity (there's nothing to prevent someone inserting arbitrary strings)

Ideally, you should use an enum type for this sort of thing, if your database supports it.

查看更多
趁早两清
5楼-- · 2020-07-10 06:10

The database theorist in me thinks that you shouldn't use lookup tables for single column attributes because it leads to unnecessary splitting of your data; in other words, you don't need to have a table with two columns (and ID value and an attribute name). However, the DBA in me thinks that for performance reasons, splitting your data is a very valid technique. Indexing, disk footprints, and updates become very easy when using lookups.

I'd probably split it.

查看更多
Deceive 欺骗
6楼-- · 2020-07-10 06:11

I think your best bet for faster performance, less storage space, and readability is to use CHAR(1)--(p)ublished, pending (r)eview, and (d)raft. You can validate that data with either a CHECK constraint or a foreign key reference.

CHAR(1) takes substantially less space than an integer. It's directly readable by humans, so it doesn't need a join to understand it. Since it's both smaller and immediately readable, you'll get faster retrieval than a join on an integer even on a table of tens of millions of rows.

查看更多
登录 后发表回答