Can you help me with index my tables?
Problem is that i indexed my tables, but i still have "full table scan" in my explain
this is my (working) query, but on big tables it could be slow, and i dont know how to change this
EXPLAIN select * from stats_clicked s
join visitor v on s.visitor_id=v.id
ps. index3 - I dont wan't many times values (1,5) when visitor=1 refresh page with id=5
CREATE TABLE `visitor` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`visited_time` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
CREATE TABLE `stats_clicked` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`visitor_id` int(11) NOT NULL,
`page_clicked_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index3` (`visitor_id`,`page_clicked_id`),
KEY `index1` (`visitor_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
insert into visitor (`visited_time`) values
(1467122944),(1467122944),(1467122944),
(1467122944),(1467122944),(1467122944),
(1467122944),(1467122944),(1467122944),
(1467122944),(1467122944),(1467122944),
(1467122944),(1467122944),(1467122944);
insert into `stats_clicked` ( `visitor_id`,`page_clicked_id`) values
(1,47),(2,24),(3,83),(3,8),(3,85),(3,88),(4,57),
(5,2),(6,1),(7,28),(8,83),(9,11),(9,16),(9,1),(10,17),
(11,70),(12,73),(13,97),(14,57),(15,30),(15,2),(15,22);
If I perform what you did above, I get
EXPLAIN select * from stats_clicked s
join visitor v on s.visitor_id=v.id
+----+-------------+-------+------+---------------+--------+---------+--------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+--------+---------+--------------------+------+-------------+
| 1 | SIMPLE | v | ALL | PRIMARY | NULL | NULL | NULL | 15 | NULL |
| 1 | SIMPLE | s | ref | index3,index1 | index3 | 4 | so_gibberish2.v.id | 1 | Using index |
+----+-------------+-------+------+---------------+--------+---------+--------------------+------+-------------+
However if I truncate then do the following load of a lot of data (ending up with over 100K rows):
truncate table visitor;
insert into visitor (`visited_time`) values
(1467122944),(1467122944),(1467122944),
(1467122944),(1467122944),(1467122944),
(1467122944),(1467122944),(1467122944),
(1467122944),(1467122944),(1467122944),
(1467122944),(1467122944),(1467122944);
insert into visitor (`visited_time`) values
(1467122945),(1467122945),(1467122945),
(1467122945),(1467122945),(1467122945),
(1467122945),(1467122945),(1467122945),
(1467122945),(1467122945),(1467122945),
(1467122945),(1467122945),(1467122945),
insert into visitor (`visited_time`) values
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946),
(1467122946),(1467122946),(1467122946);
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
insert visitor(`visited_time`) select `visited_time` from visitor;
select count(*) from visitor;
-- 104448 rows
This results in NOT a table scan:
EXPLAIN select * from stats_clicked s
join visitor v on s.visitor_id=v.id;
+----+-------------+-------+--------+---------------+---------+---------+----------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------------------------+------+-------------+
| 1 | SIMPLE | s | index | index3,index1 | index3 | 9 | NULL | 22 | Using index |
| 1 | SIMPLE | v | eq_ref | PRIMARY | PRIMARY | 4 | so_gibberish2.s.visitor_id | 1 | NULL |
+----+-------------+-------+--------+---------------+---------+---------+----------------------------+------+-------------+
The reason is listed in the Manual Page How MySQL Uses Indexes:
Indexes are less
important for queries on small tables, or big tables where report
queries process most or all of the rows. When a query needs to access
most of the rows, reading sequentially is faster than working through
an index. Sequential reads minimize disk seeks, even if not all the
rows are needed for the query.
The reason being listed above. In your question's example, you had too few rows for making the index use worth it. So the db engine chose its allegedly (and probably) faster way of not using the index on your small table.
stats_clicked
does not need id
. In fact, its existence slows down all likely queries with that table. More details.