Problems to optimize large query and tables struct

2019-07-25 11:55发布

I'm not an DB expert, so I've been around for a while, reading as much as I can and thanks to the community answers I could make several changes to my query and tables structure. Even after reading a lot of stuff I got stuck, so I came to make my first question.

I have a website where the users post their own stories. Each story can have genres, warnings, multiple authors, multiple characters assigned, etc.

We're running MySQL 5.x, tables are InnoDB, website written in PHP. Using GROUP_CONCAT to return a single row per story on the result. Tried previously using GROUP BY story id, but every query was taking around 16 secs to complete, was very slow. With this new one, they take 0.175, but for example, if the genre in WHERE doesn't exists, the query takes 23 secs! For the tests, every table has 1 million records, authors table with 1.5 million. I've tried to place some extra indexes to see which one MySQL will use.

I've tried to get things normalized with one to many relationship. Here I'll present only a few tables as the solution will probably cover the entire problem. Any help is greatly appreciated, thanks for your time!

Tables

CREATE TABLE `fanfiction_authors` (
  `uid` int(11) NOT NULL AUTO_INCREMENT,
  `penname` varchar(100) NOT NULL,
  `penname_url` varchar(100) NOT NULL,
  PRIMARY KEY (`uid`),
  KEY `penname_url` (`penname_url`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1000000 ;

-- --------------------------------------------------------

CREATE TABLE `fanfiction_stories` (
  `sid` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(200) NOT NULL,
  `sinopse` text NOT NULL,
  PRIMARY KEY (`sid`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1000000 ;

-- --------------------------------------------------------

CREATE TABLE `fanfiction_stories_authors` (
  `sid` int(11) NOT NULL,
  `uid` int(11) NOT NULL,
  KEY `sid_uid` (`sid`,`uid`),
  KEY `sid` (`sid`),
  KEY `uid` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- --------------------------------------------------------

CREATE TABLE `fanfiction_stories_genres` (
  `key_id` int(11) NOT NULL AUTO_INCREMENT,
  `sid` int(11) NOT NULL,
  `genre_id` int(11) NOT NULL,
  PRIMARY KEY (`key_id`),
  KEY `sid` (`sid`),
  KEY `genre_id` (`genre_id`),
  KEY `sid_genreid` (`sid`,`genre_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1000000 ;

-- --------------------------------------------------------

CREATE TABLE `fanfiction_stories_stats` (
  `sid` int(11) NOT NULL,
  `reviews` int(11) NOT NULL,
  `recomendacoes` int(11) NOT NULL,
  PRIMARY KEY (`sid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- --------------------------------------------------------

CREATE TABLE `fanfiction_stories_warnings` (
  `key_id` int(11) NOT NULL AUTO_INCREMENT,
  `sid` int(11) NOT NULL,
  `warning_id` int(11) NOT NULL,
  PRIMARY KEY (`key_id`),
  KEY `sid` (`sid`),
  KEY `warning_id` (`warning_id`),
  KEY `warningid_sid` (`sid`,`warning_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1000000 ;

----

Query

SELECT  
    st.sid, st.title, st.sinopse,  
    (SELECT GROUP_CONCAT(CAST(genre_id AS CHAR)) FROM fanfiction_stories_genres WHERE sid = st.sid) as genres,  
    stats.reviews, stats.recomendacoes,  
    (SELECT GROUP_CONCAT(CAST(warning_id AS CHAR)) FROM fanfiction_stories_warnings WHERE sid = st.sid) as warnings_ids  
FROM  
    fanfiction_stories AS st  
    LEFT JOIN fanfiction_stories_stats AS stats ON st.sid = stats.sid  
WHERE  
    st.sid IN (SELECT sid FROM fanfiction_stories_warnings WHERE warning_id = 5) AND  
    st.sid IN (SELECT sid FROM fanfiction_stories_genres WHERE genre_id = 300)  
ORDER BY  
    st.sid ASC  
LIMIT 20  

I couldn't make my explain readable here, so I uploaded the printscreen to Dropbox. Couldn't embed the image because I'm a noob here, sorry.

This is the explain extended when we have a valid genre (You will find stories with genre number 300).

explain http://dl.dropbox.com/u/14508898/Printscreen/stackoverflow_explain_print_001.PNG

This is the explain extended when we have a invalid genre (You will NOT find stories with genre number 900).

invalid genre explain http://dl.dropbox.com/u/14508898/Printscreen/stackoverflow_explain_print_002.PNG

Can you guys help me, please? Is my normalization correct? What am I doing wrong?

Thanks in advance!

1条回答
不美不萌又怎样
2楼-- · 2019-07-25 12:24

You can save 2 of the inner selects by using JOIN instead, which will surely speed things up in both cases (genre_id = 300 and genre_id = 900).

SELECT  
    st.sid, st.title, st.sinopse,  
    (SELECT GROUP_CONCAT(CAST(genre_id AS CHAR)) FROM fanfiction_stories_genres WHERE sid = st.sid) as genres,  
    stats.reviews, stats.recomendacoes,  
    (SELECT GROUP_CONCAT(CAST(warning_id AS CHAR)) FROM fanfiction_stories_warnings WHERE sid = st.sid) as warnings_ids  
FROM  
    fanfiction_stories AS st  
    LEFT JOIN fanfiction_stories_stats AS stats ON st.sid = stats.sid  
    JOIN fanfiction_stories_warnings w ON st.sid = w.sid AND w.warning_id = 5
    JOIN fanfiction_stories_genres g ON st.sid = g.sid AND g.genre_id = 300
GROUP BY st.sid
ORDER BY st.sid ASC  
LIMIT 20  
查看更多
登录 后发表回答