Why is my table size more than 4x larger than expe

Posted 2019-02-22 07:31

Question:

I'm looking at a simple table in MySQL that has 4 columns with the following sizes:

unsigned bigint (8 bytes)
unsigned bigint (8 bytes)
unsigned smallint (2 bytes)
unsigned tinyint (1 byte)

So I would expect 19 bytes/row.

There are 1,654,150 rows in this table so the size of the data should be 31,428,850 bytes (or about 30 megabytes).

But I can see via phpMyAdmin that the data is taking up 136.3 MiB (not including the size of the Index on bigint 1, smallint, tinyint which is 79 MiB).

Storage Engine is InnoDB and Primary Key is bigint 1, bigint 2 (a user ID and a unique item id).


Edit: As requested in the comments, here is the result of SHOW CREATE TABLE storage:

CREATE TABLE `storage` (
 `fbid` bigint(20) unsigned NOT NULL,
 `unique_id` bigint(20) unsigned NOT NULL,
 `collection_id` smallint(5) unsigned NOT NULL,
 `egg_id` tinyint(3) unsigned NOT NULL,
 PRIMARY KEY (`fbid`,`unique_id`),
 KEY `fbid` (`fbid`,`collection_id`,`egg_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Answer 1:

If the table sees frequent inserts, deletes, or updates, you may want to run OPTIMIZE TABLE to see how much the table shrinks. There may be fragmentation and unused space in the data file.

The data size that phpMyAdmin shows won't match what you expect here. When you first create the table, the data usage it shows is not 0; it will be 16KB or 32KB or so. The size also does not grow with each record you insert, because InnoDB allocates space in larger chunks. That's just how InnoDB manages the table file as efficiently as it sees fit.

Check SHOW TABLE STATUS FROM {db_name} and look at the Avg_row_length reported for the table. It won't be 19 bytes either.
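
To put a number on that without touching the database, here is a minimal Python sketch that uses only the figures from the question (1,654,150 rows, 136.3 MiB of data). It mirrors the Data_length / Rows division; the function name is made up for illustration, and the result is only as precise as phpMyAdmin's rounded size.

```python
# Figures reported in the question (phpMyAdmin rounds the size to 0.1 MiB).
ROWS = 1_654_150
RAW_ROW_BYTES = 8 + 8 + 2 + 1           # two BIGINTs, one SMALLINT, one TINYINT
DATA_LENGTH_BYTES = 136.3 * 1024 ** 2   # 136.3 MiB expressed in bytes


def implied_row_length(data_length: float, rows: int) -> float:
    """Bytes per row implied by a data size and a row count."""
    return data_length / rows


avg_row = implied_row_length(DATA_LENGTH_BYTES, ROWS)
blowup = avg_row / RAW_ROW_BYTES

print(f"implied bytes per row: {avg_row:.1f}")
print(f"ratio to the raw {RAW_ROW_BYTES} bytes: {blowup:.2f}x")
```

With the question's numbers this comes out at roughly 86 bytes per row, about 4.5 times the 19 bytes of raw column data.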



Answer 2:

Your indexes have their own tables on disk (though you can't directly 'see' them). The total size of your db is the size of your table plus the size of its index tables.

Run

show create table <tablename>;

That lists every index defined on the table. Imagine adding the size of your table to the size of a table consisting of the two columns in your primary key. Added together, those give the size you're seeing.



Answer 3:

The data size for InnoDB on disk is typically 2-3 times as big as you would compute. This is due to:

  • Overhead per column (length, offset into record)
  • Overhead per row (tx id, etc)
  • Overhead per block (16KB) (link to next block -- B+Tree)
  • BTree averages 69% full
  • MVCC -- Multiple Version Concurrency Control. That means that there can be old and new copies of any row coexisting simultaneously during a transaction
  • Etc.
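
As a rough illustration of how these overheads stack up for the table in the question, here is a back-of-the-envelope sketch in Python. The per-record sizes are assumptions taken from InnoDB's COMPACT/DYNAMIC row formats (5-byte record header, 6-byte transaction id, 7-byte roll pointer); the 69% fill figure is the one from the list above. Page headers, non-leaf pages, and old row versions are ignored, so the real size will be higher.

```python
ROWS = 1_654_150
COLUMN_BYTES = 8 + 8 + 2 + 1   # fbid, unique_id, collection_id, egg_id
RECORD_HEADER = 5              # assumed: COMPACT/DYNAMIC record header
TRX_ID = 6                     # assumed: hidden DB_TRX_ID field
ROLL_PTR = 7                   # assumed: hidden DB_ROLL_PTR field
FILL_FACTOR = 0.69             # average B-tree fill quoted in the list above


def estimate_clustered_bytes(rows: int, fill: float = FILL_FACTOR) -> float:
    """Leaf-level size estimate for the data (clustered index) in bytes."""
    per_row = COLUMN_BYTES + RECORD_HEADER + TRX_ID + ROLL_PTR
    return rows * per_row / fill


per_row_bytes = COLUMN_BYTES + RECORD_HEADER + TRX_ID + ROLL_PTR
estimate_mib = estimate_clustered_bytes(ROWS) / 1024 ** 2

print(f"bytes per row before fill factor: {per_row_bytes}")
print(f"estimated data size: {estimate_mib:.1f} MiB")
```

Even this conservative estimate is well over twice the naive 30 MB; the remaining gap to the observed 136.3 MiB would have to come from the overheads the sketch leaves out.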

One thing that would help: Almost no application needs BIGINT (8 bytes) for ids. Consider INT UNSIGNED (4 bytes, 4B limit) or MEDIUMINT UNSIGNED (3 bytes, 16M limit), etc. You have 2 Bigints, but 4 copies of them -- the secondary key implicitly includes the PK columns.

The PRIMARY KEY is stored with the data, so it incurs very little overhead. The secondary key, which is effectively 4 columns, is a BTree with a similar set of overheads.
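
The same kind of estimate can be sketched for the secondary key `fbid` (`fbid`, `collection_id`, `egg_id`). The sketch assumes that InnoDB appends only those PRIMARY KEY columns that are not already part of the index (here just `unique_id`), and it reuses the assumed 5-byte record header and the 69% fill figure; the helper name is hypothetical.

```python
ROWS = 1_654_150
INDEX_COLUMNS = {"fbid": 8, "collection_id": 2, "egg_id": 1}
PRIMARY_KEY_COLUMNS = {"fbid": 8, "unique_id": 8}
RECORD_HEADER = 5      # assumed: COMPACT/DYNAMIC record header
FILL_FACTOR = 0.69     # average B-tree fill, as quoted in this answer


def secondary_entry_bytes(index_cols: dict, pk_cols: dict) -> int:
    """Bytes per secondary index entry: index columns plus missing PK columns."""
    merged = dict(index_cols)
    for name, size in pk_cols.items():
        merged.setdefault(name, size)   # a PK column already present is not repeated
    return sum(merged.values()) + RECORD_HEADER


entry_bytes = secondary_entry_bytes(INDEX_COLUMNS, PRIMARY_KEY_COLUMNS)
index_mib = ROWS * entry_bytes / FILL_FACTOR / 1024 ** 2

print(f"bytes per index entry: {entry_bytes}")
print(f"estimated index size: {index_mib:.1f} MiB")
```

The estimate lands in the same ballpark as the 79 MiB index size reported in the question, which is consistent with the secondary key carrying four columns rather than three.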

Even in MyISAM, there is overhead:

  • At least 1 byte per row. (1 in your case)
  • 1 byte per 8 NULLable columns (none in your case)
  • Some amount of lost space after rows are DELETEd or UPDATEd. (Updates won't be a problem in your case, due to FIXED record size.)
  • The PRIMARY KEY is just like any other index
  • All keys have the 69% issue; blocks are 1KB

(Since you have no VARCHAR or TEXT, I don't need to discuss the CHARACTER SET issues.)

In InnoDB, SHOW TABLE STATUS is often off by a factor of 2 in the estimate of the number of rows. The Avg_row_length is computed as Data_length / Rows, so it is usually off.

I do not recommend OPTIMIZE TABLE for InnoDB tables; it is almost always not worth the effort.

When doing ALTER TABLE .. ADD INDEX .., older versions of MySQL would rebuild the entire table and indexes. In doing so, you get the effect of OPTIMIZE. (It is unlikely, but not impossible, for the Data size to increase.) Newer versions only add on the new index. What version are you running?

Each INDEX is a separate BTree (except for the PK in InnoDB) (and except for FULLTEXT and SPATIAL).