I have read that a BIGINT like 23423423423423423637 is better for a primary unique key than a VARCHAR like 961637593864109_412954765521130, but how big is the difference when there are, let's say, 1 million rows and I will never sort, only select/update one row at a time? It would be much more comfortable for me to use VARCHAR, and I will stay with it if the performance difference is under 30% or so. I can't find any benchmark for that.
This would really have to be measured. We can make some "guesses" based on what we know and what we assume, but those are just guesses.
You don't mention whether this table is InnoDB, or MyISAM with dynamic rows, or MyISAM with fixed length rows. That's going to make some difference.
But for values like the one you posted,
'961637593864109_412954765521130'
(31 characters), assuming you're using a single-byte character set (e.g. latin1), or a character set that encodes those particular characters into a single byte (e.g. utf8)... For InnoDB and MyISAM dynamic format, that's 31+1-8=24 extra bytes for that row. (BIGINT fits in 8 bytes; a VARCHAR(31) value of 31 characters will use 32 bytes: 31 for the characters plus 1 length byte.)
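The arithmetic above, spelled out (the 24-byte figure assumes a single-byte encoding and the InnoDB / MyISAM dynamic row formats):

```python
# Per-row key storage difference, assuming a single-byte character set.
key = "961637593864109_412954765521130"

varchar_bytes = len(key) + 1   # 31 characters + 1 length byte = 32
bigint_bytes = 8               # BIGINT is always 8 bytes

overhead = varchar_bytes - bigint_bytes
print(overhead)  # 24 extra bytes per row
```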
For a MyISAM table with fixed-length rows, the difference would be 23 bytes per row. (Space is reserved for all 31 characters, and the length doesn't have to be stored.)
That primary key value will also be repeated in every index, so there's also increased space with each index.
Assuming that your table rows are 120 bytes using BIGINT, and the rows are 144 bytes with VARCHAR, that's a 20% increase. The larger your rows, the smaller the percentage increase, and vice versa.
For 1,000,000 rows (I so want to say "one meelyun rows" in the same way that Dr. Evil puts his pinky finger to the corner of his mouth and says "one million dollars") that extra 24 bytes per row totals around 24MB.
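A back-of-envelope check on those two numbers (the 120-byte row size is the assumption from above, not something measured):

```python
overhead = 24            # extra bytes per row, from the key-size arithmetic

# Relative row growth for the assumed 120-byte row:
pct = overhead / 120 * 100
print(pct)               # 20.0 percent increase (144 vs 120 bytes)

# Total extra space across one million rows:
total_bytes = overhead * 1_000_000
print(total_bytes / 1e6) # 24.0 MB
```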
But it's not really that easy. In terms of InnoDB space, it's a matter of how many rows "fit" into a block. The larger the average row size, the larger the amount of free space will be in a block.
If you don't do anything with the rows except store them on disk, then it's really just an increase in disk space, and extra time and space for backups.
If the same number of "144 byte" rows fit in a block as "120 byte" rows, then you aren't going to see any difference in space. But if fewer rows fit in a block, that's more blocks, more space in the InnoDB buffer pool, more i/o, etc.
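To see how the rows-per-block effect can play out, here's a toy calculation against InnoDB's default 16KB page size. It ignores page headers, per-row overhead, and the fill factor, so treat it as illustrative only:

```python
import math

PAGE = 16 * 1024           # InnoDB default page size (16KB), headers ignored

rows_120 = PAGE // 120     # rows per page with the BIGINT-sized row
rows_144 = PAGE // 144     # rows per page with the VARCHAR-sized row
print(rows_120, rows_144)  # 136 vs 113 rows per page

# Fewer rows per page means more pages for the same million rows:
pages_120 = math.ceil(1_000_000 / rows_120)
pages_144 = math.ceil(1_000_000 / rows_144)
print(pages_120, pages_144)  # more pages => more buffer pool, more i/o
```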
For queries of a single row, either by primary key value, or by some other unique index lookup, the difference is going to be negligible.
If you are dealing with larger resultsets, then that's extra memory for preparing the resultset, and extra bytes to transfer to the client, etc.
If the VARCHAR key is designed in such a way that "groups" of rows that are accessed together have the same leading portion of the key value, then with InnoDB, there may actually be some performance improvement. That's because the primary key is the cluster key... there's a much better chance that the rows needed to satisfy a query are in the same block, rather than being spread out over a bunch of blocks.
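The "grouping" idea can be illustrated without MySQL at all: a clustered index stores rows in key order, so keys that share a leading portion end up physically adjacent. The group/sequence key scheme below is made up for illustration:

```python
# Hypothetical keys: the leading portion identifies a group of rows
# that are typically queried together; the trailing portion is a sequence.
keys = ["4129_003", "9616_001", "4129_001", "9616_002", "4129_002"]

# A clustered index keeps rows in key order, so once sorted,
# each group's rows are contiguous -- likely landing in the same block(s).
clustered_order = sorted(keys)
print(clustered_order)
```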
The converse is that if inserts and deletes are performed, there will be more free space in some blocks. (With deletes, the space for a deleted row remains in the block; to get it reused, you'd need to insert a row with the same key value, or at least a key value close enough that it lands in the same block.) And with random inserts, we're going to get block splits.
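Ultimately, as said at the top, this has to be measured. Here's a minimal harness you could adapt; it uses Python's built-in sqlite3 as a stand-in (with a MySQL connector the shape would be the same), and the table names, row count, and payload are made up for illustration:

```python
import sqlite3
import time

# Stand-in micro-benchmark: single-row lookups by integer vs string key.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t_int (id INTEGER PRIMARY KEY, payload TEXT)")
cur.execute("CREATE TABLE t_str (id TEXT    PRIMARY KEY, payload TEXT)")

rows = 100_000  # scale toward 1,000,000 for a more realistic test
cur.executemany("INSERT INTO t_int VALUES (?, ?)",
                ((i, "x" * 100) for i in range(rows)))
cur.executemany("INSERT INTO t_str VALUES (?, ?)",
                ((f"{i:015d}_{i:015d}", "x" * 100) for i in range(rows)))
conn.commit()

def time_lookups(table, keys):
    """Time point lookups by primary key; returns elapsed seconds."""
    start = time.perf_counter()
    for k in keys:
        cur.execute(f"SELECT payload FROM {table} WHERE id = ?", (k,)).fetchone()
    return time.perf_counter() - start

int_t = time_lookups("t_int", range(0, rows, 7))
str_t = time_lookups("t_str", (f"{i:015d}_{i:015d}" for i in range(0, rows, 7)))
print(f"integer key: {int_t:.3f}s  string key: {str_t:.3f}s")
```

Run it a few times and at realistic row counts before trusting the numbers; cache effects and the storage engine matter far more than any single run.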