Drawbacks of storing an integer as a string in a d

Posted 2020-05-19 08:48

I have ID values for products that I need to store. Right now they are all integers, but I'm not sure whether the data provider will introduce letters or symbols into the mix in the future, so I'm debating whether to store them as integers or strings.

Are there performance or other disadvantages to saving the values as strings?

10 answers
神经病院院长
#2 · 2020-05-19 08:53

An integer takes up much less space than the equivalent string. For example, 2^32-1 = 4,294,967,295 takes at least 10 bytes to store as a string (one per digit), whereas the integer takes 4 bytes. For a single entry that is not much space, but it adds up once you get into the millions of rows. As many other posts suggest, there are several other issues to consider, but this is one drawback of the string representation.
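To make the size comparison concrete, here is a minimal Python sketch. It assumes a fixed-width unsigned 32-bit integer and an ASCII-encoded string; real databases add their own per-value overhead (for example a length prefix on variable-length strings), so treat the string figure as a lower bound.

```python
import struct

value = 2**32 - 1  # 4,294,967,295, the largest unsigned 32-bit integer

# Fixed-width binary encoding: always 4 bytes, whatever the value.
as_integer = struct.pack("<I", value)

# Decimal text encoding: one byte per digit in ASCII.
as_string = str(value).encode("ascii")

print(len(as_integer))  # 4
print(len(as_string))   # 10
```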

forever°为你锁心
#3 · 2020-05-19 08:58

Better to use an independent system ID and add a string ID alongside it if necessary: if there's a business identifier you need to store, why make it the system ID?

Main drawbacks:

  1. Integer operations and indexes generally perform better than string ones on large data sets (say, more than 1k rows in a table, not to mention joined tables).

  2. You'll need additional checks to restrict the column to numeric-only values: for instance a regex, on either the client or the database side. Either way, you have to guarantee somehow that the value really is an integer.

  3. You create an extra layer of context that developers have to know about, and someone will always mess it up anyway :)
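As a rough illustration of the database-side check from point 2, here is a sketch using Python's built-in `sqlite3` module. The table name `product`, the column name `product_id`, and the helper `try_insert` are hypothetical, and the `GLOB` pattern is SQLite-specific; other databases would need their own regex or pattern syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: a string ID column that must be non-empty
# and must not contain any character outside 0-9.
conn.execute(
    """
    CREATE TABLE product (
        product_id TEXT PRIMARY KEY
            CHECK (product_id <> '' AND product_id NOT GLOB '*[^0-9]*')
    )
    """
)


def try_insert(product_id: str) -> bool:
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO product (product_id) VALUES (?)", (product_id,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("12345"))  # True: digits only
print(try_insert("12a45"))  # False: contains a letter
```

The same constraint would have to be dropped or rewritten the day the provider starts sending letters, which is part of the maintenance cost the list describes.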

走好不送
#4 · 2020-05-19 09:03

As answered in Integer vs String in database

In my country, post-codes are also always 4 digits. But the first digit can be zero.

If you store "0700" as an integer, you can run into a lot of problems:

  1. It may be read as an octal value.
  2. If it is read correctly as a decimal value, it gets turned into "700".
  3. When you get the value "700", you must remember to add the zero back.
  4. If you don't add the zero, how will you know later on whether "700" is "0700", or whether someone mistyped "7100"?

Technically, our post codes are actually strings, even if they are always 4 digits.

You can store them as integers, to save space. But remember this is a simple DB-trick, and be careful about leading zeroes.
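The leading-zero pitfalls above can be reproduced in a few lines of Python. This is only a sketch: the octal case is forced explicitly with `base=8` to mimic languages such as C, where an integer literal starting with `0` is octal, and the fixed width of 4 is an assumption taken from the post-code example.

```python
post_code = "0700"

# Parsed as decimal, the leading zero is silently dropped.
as_decimal = int(post_code)
print(as_decimal)  # 700

# Parsed as octal, the value changes entirely.
as_octal = int(post_code, 8)
print(as_octal)  # 448

# The zero can only be restored if the fixed width is known in advance.
restored = str(as_decimal).zfill(4)
print(restored)  # 0700
```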

But what about for storing how many files are in a torrent? Integer or string?

That's clearly an integer.

If the ID will never start with a zero, store it as an integer; if it ever might, store it as a string.

虎瘦雄心在
#5 · 2020-05-19 09:04

Unless you really need the features of an integer (that is, the ability to do arithmetic), then it is probably better for you to store the product IDs as strings. You will never need to do anything like add two product IDs together, or compute the average of a group of product IDs, so there is no need for an actual numeric type.

It is unlikely that storing product IDs as strings will cause a measurable difference in performance. While there will be a slight increase in storage size, the size of a product ID string is likely to be much smaller than the data in the rest of your database row anyway.

Storing product IDs as strings today will save you much pain in the future if the data provider decides to start using alphabetic or symbol characters. There is no real downside.

倾城 Initia
#6 · 2020-05-19 09:05

I've just spent the last year dealing with a database that has almost all IDs as strings, some with digits only, and others mixed. These are the problems:

  1. Grossly restricted ID space. A 4 char (digit-only) ID has capacity for 10,000 unique values. A 4 byte numeric has capacity for over 4 billion.
  2. Unpredictable ID space coverage. Once IDs start including non-digits it becomes hard to predict where you can create new IDs without collisions.
  3. Conversion and display problems in certain circumstances, when scripting or on export for instance. If the ID gets interpreted as a number and there is a leading zero, the ID gets altered.
  4. Sorting problems. You can't rely on the natural order being helpful.
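Points 1 and 4 are easy to check in Python. The sample IDs below are made up for illustration.

```python
ids = ["9", "10", "100", "2"]  # made-up digit-only string IDs

# Strings sort character by character, so "10" comes before "2".
lexicographic = sorted(ids)
print(lexicographic)  # ['10', '100', '2', '9']

# Converting to int for the comparison gives the order people expect.
numeric = sorted(ids, key=int)
print(numeric)  # ['2', '9', '10', '100']

# ID space: 4 decimal characters versus 4 bytes of binary integer.
print(10**4)  # 10000
print(2**32)  # 4294967296
```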

Of course, if you run out of IDs, or don't know how to create new IDs, your app is dead. I suggest that if you can't control the format of your incoming IDs then you need to create your own (numeric) IDs and relate the user provided ID to that. You can then ensure that your own ID is reliable and unique (and numeric) but provide a user-viewable ID that can have whatever format your users want, and doesn't even have to be unique across the whole app. This is more work, but if you'd been through what I have you'd know which way to go.

Anil G
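One possible shape for the "own numeric ID plus provider ID" approach, sketched with Python's built-in `sqlite3`. The table name `product`, the columns `id`, `provider_id`, and `name`, and both helper functions are hypothetical. The `UNIQUE` constraint on `provider_id` is an assumption that suits a single data provider; drop it if the external ID does not have to be unique across the whole app.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        id          INTEGER PRIMARY KEY,   -- internal numeric key, assigned by the database
        provider_id TEXT NOT NULL UNIQUE,  -- ID exactly as the provider sent it
        name        TEXT NOT NULL
    )
    """
)


def add_product(provider_id: str, name: str) -> int:
    """Store a product and return the internal numeric ID assigned to it."""
    with conn:
        cur = conn.execute(
            "INSERT INTO product (provider_id, name) VALUES (?, ?)",
            (provider_id, name),
        )
    return cur.lastrowid


def find_internal_id(provider_id: str):
    """Map a provider ID to the internal ID, or None if it is unknown."""
    row = conn.execute(
        "SELECT id FROM product WHERE provider_id = ?", (provider_id,)
    ).fetchone()
    return row[0] if row else None


first = add_product("0700", "Widget")     # leading zero is preserved as text
second = add_product("AB-12", "Gadget")   # letters and symbols are fine too
print(find_internal_id("AB-12") == second)  # True
```

Other tables would then reference `product.id`, so joins stay on integers while the provider's ID can change format without touching any foreign keys.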

狗以群分
#7 · 2020-05-19 09:08

It really depends on what kind of ID you are talking about. If it's a code like a phone number, it would actually be better to store it as a varchar and have your own serial ID for the database to use as the primary key. In cases where the integer has no numerical meaning, varchars are generally preferred.
