I'm not an DB expert, so I've been around for a while, reading as much as I can and thanks to the community answers I could make several changes to my query and tables structure. Even after reading a lot of stuff I got stuck, so I came to make my first question.
I have a website where the users post their own stories. Each story can have genres, warnings, multiple authors, multiple characters assigned, etc.
We're running MySQL 5.x, tables are InnoDB, website written in PHP. Using GROUP_CONCAT to return a single row per story on the result. Tried previously using GROUP BY story id, but every query was taking around 16 secs to complete, was very slow. With this new one, they take 0.175, but for example, if the genre in WHERE doesn't exists, the query takes 23 secs! For the tests, every table has 1 million records, authors table with 1.5 million. I've tried to place some extra indexes to see which one MySQL will use.
I've tried to get things normalized with one to many relationship. Here I'll present only a few tables as the solution will probably cover the entire problem. Any help is greatly appreciated, thanks for your time!
Tables
CREATE TABLE `fanfiction_authors` (
`uid` int(11) NOT NULL AUTO_INCREMENT,
`penname` varchar(100) NOT NULL,
`penname_url` varchar(100) NOT NULL,
PRIMARY KEY (`uid`),
KEY `penname_url` (`penname_url`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1000000 ;
-- --------------------------------------------------------
CREATE TABLE `fanfiction_stories` (
`sid` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(200) NOT NULL,
`sinopse` text NOT NULL,
PRIMARY KEY (`sid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1000000 ;
-- --------------------------------------------------------
CREATE TABLE `fanfiction_stories_authors` (
`sid` int(11) NOT NULL,
`uid` int(11) NOT NULL,
KEY `sid_uid` (`sid`,`uid`),
KEY `sid` (`sid`),
KEY `uid` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- --------------------------------------------------------
CREATE TABLE `fanfiction_stories_genres` (
`key_id` int(11) NOT NULL AUTO_INCREMENT,
`sid` int(11) NOT NULL,
`genre_id` int(11) NOT NULL,
PRIMARY KEY (`key_id`),
KEY `sid` (`sid`),
KEY `genre_id` (`genre_id`),
KEY `sid_genreid` (`sid`,`genre_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1000000 ;
-- --------------------------------------------------------
CREATE TABLE `fanfiction_stories_stats` (
`sid` int(11) NOT NULL,
`reviews` int(11) NOT NULL,
`recomendacoes` int(11) NOT NULL,
PRIMARY KEY (`sid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- --------------------------------------------------------
CREATE TABLE `fanfiction_stories_warnings` (
`key_id` int(11) NOT NULL AUTO_INCREMENT,
`sid` int(11) NOT NULL,
`warning_id` int(11) NOT NULL,
PRIMARY KEY (`key_id`),
KEY `sid` (`sid`),
KEY `warning_id` (`warning_id`),
KEY `warningid_sid` (`sid`,`warning_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1000000 ;
----
Query
SELECT
st.sid, st.title, st.sinopse,
(SELECT GROUP_CONCAT(CAST(genre_id AS CHAR)) FROM fanfiction_stories_genres WHERE sid = st.sid) as genres,
stats.reviews, stats.recomendacoes,
(SELECT GROUP_CONCAT(CAST(warning_id AS CHAR)) FROM fanfiction_stories_warnings WHERE sid = st.sid) as warnings_ids
FROM
fanfiction_stories AS st
LEFT JOIN fanfiction_stories_stats AS stats ON st.sid = stats.sid
WHERE
st.sid IN (SELECT sid FROM fanfiction_stories_warnings WHERE warning_id = 5) AND
st.sid IN (SELECT sid FROM fanfiction_stories_genres WHERE genre_id = 300)
ORDER BY
st.sid ASC
LIMIT 20
I couldn't make my explain readable here, so I uploaded the printscreen to Dropbox. Couldn't embed the image because I'm a noob here, sorry.
This is the explain extended when we have a valid genre (You will find stories with genre number 300).
http://dl.dropbox.com/u/14508898/Printscreen/stackoverflow_explain_print_001.PNG
This is the explain extended when we have a invalid genre (You will NOT find stories with genre number 900).
http://dl.dropbox.com/u/14508898/Printscreen/stackoverflow_explain_print_002.PNG
Can you guys help me, please? Is my normalization correct? What am I doing wrong?
Thanks in advance!
You can save 2 of the inner selects by using JOIN instead, which will surely speed things up in both cases (
genre_id = 300
andgenre_id = 900
).