SQL - multiple tables vs one big table

2019-09-13 11:36发布

问题:

I want to move multiple SQLite files to PostgreSQL.

Data contained in these files are monthly time-series (one month in a single *.sqlite file). Each has about 300,000 rows. There are more than 20 of these files.

My dilemma is how to organize the data in the new database:

a) Keep it in multiple tables
or
b) Merge it to one huge table with new column describing the time period (e.g. 04.2016, 05.2016, ...)

The database will be used only to pull data out of it (with the exception of adding data for new month).

My concern is that selecting data from multiple tables (join) would not perform very well and the queries can get quite complicated.

Which structure should I go for - one huge table or multiple smaller tables?

回答1:

Think I would definitely go for one table - just make sure you use sensible indexes.



回答2:

If you have the space and the resource 1 table, as other users have appropriately pointed out databases can handle millions of rows no problem.....Well depends on the data that is in them. The row size can make a big difference... Such as storing VARCHAR(MAX), VARBINARY(MAX) and several per row......

there is no doubt writing queries, ETL (extract transform load) is significantly easier on a single table! And maintenance of that is easier too from a archival perspective.

But if you never access the data and you need the performance in the primary table some sort of archive might make since.

There are some BI related reasons to maintain multiple tables but it doesn't sound like that is your issue here.

There is no perfect answer and will depend on your situation.



回答3:

PostgreSQL is easily able to handle millions of rows in a table.

Go for option b) but..

with new column describing the time period (e.g. 04.2016, 05/2016, ...)

Please don't. Querying the different periods will become a pain, an unnecessary one. Just put the date in one column, put a index on the column and you can, probably, execute fast queries on it.

My concern is that selecting data from multiple tables (join) would not perform very well and the queries can get quite complicated.

Complicated for you to write or for the database to execute? An Example would be nice for us to get an image of your actual requirements.