Use timestamp(or datetime) as part of primary key

2019-02-14 11:06发布

问题:

I use following query frequently:

SELECT * FROM table WHERE Timestamp > [SomeTime] AND Timestamp < [SomeOtherTime] and publish = 1 and type = 2 order by Timestamp

I would like to optimize this query, and I am thinking about put timestamp as part of primary key for clustered index, I think if timestamp is part of primary key , data inserted in table has write to disk sequentially by timestamp field.Also I think this improve my query a lot, but am not sure if this would help.

table has 3-4 million+ rows.
timestamp field never changed.
I use mysql 5.6.11

Anothet point is : if this is improve my query , it is better to use timestamp(4 byte in mysql 5.6) or datetime(5 byte in mysql 5.6)?

回答1:

Four million rows isn't huge.

A one-byte difference between the data types datetime and timestamp is the last thing you should consider in choosing between those two data types. Review their specs.

Making a timestamp part of your primary key is a bad, bad idea. Think about reviewing what primary key means in a SQL database.

Put an index on your timestamp column. Get an execution plan, and paste that into your question. Determine your median query performance, and paste that into your question, too.

Returning a single day's rows from an indexed, 4 million row table on my desktop computer takes 2ms. (It returns around 8000 rows.)



回答2:

1) If values of timestamp are unique you can make it primary key. If not, anyway create index on timestamp column as you frequently use it in "where".

2) using BETWEEN clause looks more natural here. I suggest you use TREE index (default index type) not HASH.

3) when timestamp column is indexed, you don't need call order by - it already sorted. (of course, if your index is TREE not HASH).

4) integer unix_timestamp is better than datetime both from memory usage side and performance side - comparing dates is more complex operation than comparing integer numbers.

Searching data on indexed field takes O(log(rows)) tree lookups. Comparison of integers is O(1) and comparison of dates is O(date_string_length). So, difference is (number of tree lookups) * (difference_comparison) = O(date_string_length)/O(1))* O(log(rows)) = O(date_string_length)* O(log(rows))