I have id values for products that I need store. Right now they are all integers, but I'm not sure if the data provider in the future will introduce letters or symbols into that mix, so I'm debating whether to store it now as integer or string.
Are there performance or other disadvantages to saving the values as strings?
The space an integer would take up would me much less than a string. For example 2^32-1 = 4,294,967,295. This would take 10 bytes to store, where as the integer would take 4 bytes to store. For a single entry this is not very much space, but when you start in the millions... As many other posts suggest there are several other issues to consider, but this is one drawback of the string representation.
Better use independent ID and add string ID if necessary: if there's a business indicator you need to include, why make it system ID?
Main drawbacks:
Integer operations and indexing always show better performance on large scales of data (more than 1k rows in a table, not to speak of connected tables)
You'll have to make additional checks to restrict numeric-only values in a column: these can be regex whether on client or database side. Anyway, you'll have to guarantee somehow that there's actually integer.
And you will create additional context layer for developers to know, and anyway someone will always mess this up :)
As answered in Integer vs String in database
In my country, post-codes are also always 4 digits. But the first digit can be zero.
If the ID would ever start with zero, store it as in interger.
Unless you really need the features of an integer (that is, the ability to do arithmetic), then it is probably better for you to store the product IDs as strings. You will never need to do anything like add two product IDs together, or compute the average of a group of product IDs, so there is no need for an actual numeric type.
It is unlikely that storing product IDs as strings will cause a measurable difference in performance. While there will be a slight increase in storage size, the size of a product ID string is likely to be much smaller than the data in the rest of your database row anyway.
Storing product IDs as strings today will save you much pain in the future if the data provider decides to start using alphabetic or symbol characters. There is no real downside.
I've just spent the last year dealing with a database that has almost all IDs as strings, some with digits only, and others mixed. These are the problems:
Of course, if you run out of IDs, or don't know how to create new IDs, your app is dead. I suggest that if you can't control the format of your incoming IDs then you need to create your own (numeric) IDs and relate the user provided ID to that. You can then ensure that your own ID is reliable and unique (and numeric) but provide a user-viewable ID that can have whatever format your users want, and doesn't even have to be unique across the whole app. This is more work, but if you'd been through what I have you'd know which way to go.
Anil G
It really depends on what kind of id you are talking about. If it's a code like a phone number it would actually be better to use a varchar for the id and then have your own id to be a serial for the db and use for primary key. In a case where the integer have no numerical value, varchars are generally prefered.