I have a site with roughly 2000 visitors per day, and it is being hammered by various search engine bots. I tried reducing the session expiry time to 20 minutes, but I still get a lot of mysql_slow_queries. I was looking into the article "Google crawler, cron and codeigniter sessions" to keep the bots out of the sessions table entirely, but the approach there is to ignore specific IPs, and when I analyzed the database I saw that the same bot uses different IPs. I noticed that the bots use the same user agent every time, though, so is it safe to ignore them by user agent instead? What steps are needed to avoid the slow queries and ignore the bots?
Some of the slow queries:
INSERT INTO `ci_sessions` (`session_id`, `ip_address`, `user_agent`, `last_activity`, `user_data`) VALUES ('619bfd8ef4171480645feb17a15323ee', '219.92.135.144', 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15', 1384875135, '')
INSERT INTO `ci_sessions` (`session_id`, `ip_address`, `user_agent`, `last_activity`, `user_data`) VALUES ('fa48b5168b8e84d90dc9b87ce65dfc89', '66.249.74.112', 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', 1384875522, '')
If you want to block the bots completely, maybe try using robots.txt? If you only want to stop sessions from being created for them, then checking the user agent for strings like "Googlebot" may be a good solution. But you would need to extend the Session class to do that, I think.

Edit your
user_agents.php in application/config and add the bots you see in your sessions table. Adding them to the robots section should keep their sessions from being logged. You can reduce the number of bots this way, but you won't eliminate them. This user-agent check could be used to create a MY_session.php that skips session creation for agents matching bots.

EDIT:
I went ahead and created this on GitHub and documented it here:
http://blog.biernacki.ca/2014/01/codeigniter-keeping-bots-out-of-your-sessions-table-or-how-i-cleaned-up-my-sessions/
Enjoy
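
For readers who want a starting point without following the link, here is a minimal sketch of the user-agent idea discussed above. It is not the code from the linked post. It assumes CodeIgniter 2.x, where the session library exposes sess_create() and sess_update(). The helper name is_bot_user_agent(), the method is_bot_request(), and the marker list are hypothetical and only illustrative. The class would live in application/libraries, and the class_exists() guard is there only so the file can also be run outside CodeIgniter.

```php
<?php
// Hypothetical helper (not part of CodeIgniter): returns true when the
// user agent string contains one of the given bot markers, ignoring case.
function is_bot_user_agent($user_agent, array $bot_markers = array(
    'googlebot', 'bingbot', 'slurp', 'baiduspider', 'yandexbot',
))
{
    foreach ($bot_markers as $marker) {
        if (stripos($user_agent, $marker) !== false) {
            return true;
        }
    }
    return false;
}

// Define the subclass only when CodeIgniter's session library is loaded,
// so this file does nothing harmful when run on its own.
if (class_exists('CI_Session')) {
    class MY_Session extends CI_Session
    {
        // Skip the INSERT into ci_sessions for requests that look like bots.
        public function sess_create()
        {
            if ($this->is_bot_request()) {
                return;
            }
            parent::sess_create();
        }

        // Skip the periodic session UPDATE for bots as well.
        public function sess_update()
        {
            if ($this->is_bot_request()) {
                return;
            }
            parent::sess_update();
        }

        // Assumption: the raw request header is good enough for matching.
        protected function is_bot_request()
        {
            $user_agent = isset($_SERVER['HTTP_USER_AGENT'])
                ? $_SERVER['HTTP_USER_AGENT']
                : '';
            return is_bot_user_agent($user_agent);
        }
    }
}
```

Matching on the user agent avoids the rotating-IP problem from the question, but the header is trivially spoofed, so this only keeps out well-behaved crawlers. Code that writes session data for every request may also need a guard, since a bot request would then have no session row to update.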