Large Number of columns, Select all takes forever

2019-08-23 18:51发布

问题:

There are 210 columns in my table with around 10000 rows. Each row is unique and there is a primary key on the table. The thing is we always had to do select all query on the table to get data of all the sites.

Currently, the problem is it takes too much time and the data returned is around 10mb and it will be large in the future.

The table has varchar, text and date types in it.

Is there any way I can modify the structure or something to make my retrieval faster. More indexing or breaking down the table. (Although I think denormalized data is good for retrieval)

Update: "why do wider tables slow down the query performance?"

Thanks..!

回答1:

why do wider tables slow down the query performance?

InnoDB stores "wide" tables in a different way. Instead of having all the columns together in a single string (plus overhead, such as lengths, etc), it does the following:

  • If the total of all the columns for a given row exceeds about 8KB, it will move some of the data to another ("off-record") storage area.
  • Which columns are moved off-record depends on the sizes of the columns, etc.
  • The details depend on the ROW_FORMAT chosen.
  • "Off-record" is another 16KB block (or blocks).
  • Later, when doing SELECT * (or at least fetching the off-record column(s)), it must do another disk fetch.

What to do?

  • Rethink having so many columns.
  • Consider "vertical partitioning", wherein you have another table(s) that contains selected TEXT columns. Suggest picking groups of columns based on access patterns in your app.
  • For columns that are usually quite long, consider compressing them in the client and storing into a BLOB instead of a TEXT. Most "text" shrinks 3:1. Blobs are sent off-record the same as Texts, however, these compressed blobs would be smaller, hence less likely to spill.
  • Do more processing in SQL -- to avoid returning all the rows, or to avoid returning the full text, etc. When blindly shoveling lots of text to a client, the network and client become a sighificant factor in the elapsed time, not just the SELECT, itself.