AWStats gives me a table of all the keywords and phrases people used to find my website. I would like to capture this myself, but each search engine formats its URLs differently. When Google is the referrer, I can use the q variable from the query string as the search term (e.g. google.com?q=my+keywords), but another search engine may use a format like searchengine.com?search=my+keywords.
Is there a generic way of identifying search keywords? Or am I going to have to create a regex/filter for each search engine?
One possibility is to just grab the referring URL ($_SERVER['HTTP_REFERER']) and parse out the keywords in it.
For example, check out this Google URL (searching for "stack overflow"):
http://www.google.com/search?hl=en&q=stack+overflow&aq=0&oq=stack+over&aqi=g10
The value of the q GET variable holds the keywords, with + signs standing in for the spaces between them.
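As a rough sketch of that idea, the referrer's query string can be split with PHP's built-in parse_url() and parse_str(). The helper name keywords_from_referrer and its $param argument are made up for illustration, and the default of q only covers engines that use that parameter name:

```php
<?php
// Hypothetical helper: returns the decoded value of one query parameter
// from a referrer URL, or null when the parameter is missing or empty.
function keywords_from_referrer(string $referrer, string $param = 'q'): ?string
{
    // Extract only the query-string part of the URL.
    $query = parse_url($referrer, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string, or the URL could not be parsed
    }

    // parse_str() splits the pairs and decodes "+" and %XX escapes.
    parse_str($query, $params);
    if (!isset($params[$param]) || !is_string($params[$param])) {
        return null;
    }

    $value = trim($params[$param]);
    return $value === '' ? null : $value;
}

$url = 'http://www.google.com/search?hl=en&q=stack+overflow&aq=0&oq=stack+over&aqi=g10';
echo keywords_from_referrer($url), "\n"; // stack overflow
```

In a real request you would pass in $_SERVER['HTTP_REFERER'] ?? '' instead of a hard-coded URL, since the header is optional and may be absent. For an engine that uses a different parameter name, pass it as the second argument, e.g. keywords_from_referrer($url, 'search').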
I have to keep adding to it, but here is a regex that should pull the keywords out of Google, Yahoo, Bing, Ask, and MSN (same as Bing) referrer URLs. It leaves the + between the words, but it should be a good place for you to start:
.*(\?p=|\?q=|&q=|\?s=)([a-zA-Z0-9 +]*)(&toggle=|&ie=utf-8|&FORM=|&aq=|&x=|&gwp).
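For what it's worth, here is one way the pattern above might be applied in PHP with preg_match(). The pattern is copied unchanged; the constant KEYWORD_PATTERN and the helper keywords_by_regex are names invented for this sketch, and turning + into spaces is an extra step the regex itself does not do:

```php
<?php
// The pattern from the answer, copied as-is. "~" is used as the delimiter
// so nothing inside the pattern needs additional escaping.
const KEYWORD_PATTERN = '~.*(\?p=|\?q=|&q=|\?s=)([a-zA-Z0-9 +]*)(&toggle=|&ie=utf-8|&FORM=|&aq=|&x=|&gwp).~';

// Hypothetical helper: returns the captured keywords with "+" replaced by
// spaces, or null when the pattern does not match the referrer.
function keywords_by_regex(string $referrer): ?string
{
    if (preg_match(KEYWORD_PATTERN, $referrer, $m) !== 1) {
        return null;
    }
    // Group 2 holds the keywords, still joined by "+".
    return str_replace('+', ' ', $m[2]);
}

echo keywords_by_regex('http://www.google.com/search?hl=en&q=stack+overflow&aq=0&oq=stack+over&aqi=g10'), "\n"; // stack overflow
var_dump(keywords_by_regex('http://www.bing.com/search?q=foo')); // NULL
```

The second call returns null because the pattern only matches when the keywords are followed by one of the listed markers (&aq=, &FORM=, and so on). Keywords containing percent-encoded characters also fall outside the [a-zA-Z0-9 +] class, so parsing the query string is likely the more forgiving route when that matters.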