How to Monitor Genuine Page Hits

Posted 2019-05-26 18:54

Question:

I am trying to monitor genuine page hits. My site has an article directory where people can post articles. Once an article is posted, its author is paid based on the number of unique users who visit the page, so accurate page hits are important. Here is the problem I am facing.

What I need:

  • I don't want to count page hits from minor search engines or robots.
  • I would like the four major search engines to crawl my site, because I can identify them by IP address and not count their visits as page hits. This cannot be done for spam bots, because they do a good job of passing as a real human or a major search engine.

Problems:

  • There are spam bots on the internet that do not honor the robots.txt file.
  • There are bots that try to pass as a real human user by manipulating the user agent and other header fields.
  • Performance may suffer if every request checks the database for known-good IP addresses.
  • A human being can solve the CAPTCHA once and then let their robot view my pages.
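The database concern above might be eased by loading the trusted crawler ranges into memory once and checking each request against them. The following is only a minimal sketch under that assumption: the ranges shown are documentation-only placeholders (not real search engine addresses), and the names `TRUSTED_CRAWLER_RANGES` and `is_trusted_crawler` are hypothetical.

```python
import ipaddress

# Hypothetical allowlist. These are RFC 5737 documentation ranges used as
# placeholders; real crawler ranges would have to come from each search engine.
TRUSTED_CRAWLER_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]

# Parse the ranges once at startup so each request is a pure in-memory check
# instead of a database query.
_NETWORKS = [ipaddress.ip_network(cidr) for cidr in TRUSTED_CRAWLER_RANGES]


def is_trusted_crawler(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside any allowlisted range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # malformed address: treat as untrusted
    return any(addr in net for net in _NETWORKS)
```

Because the check uses the connection's IP address rather than the user agent header, a bot that only fakes its user agent would not pass it.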

Possible solutions:

  • Require a CAPTCHA on every page. If the visitor passes it, log the IP address as good or set a cookie on the user's machine indicating that they passed.
  • Allowlist the IP addresses of all major search engines, so they are not presented with a CAPTCHA.
  • Purchase bot detection software.
  • Require the viewer to pass a CAPTCHA every 7 days.
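To make the ideas in this list concrete, here is a rough sketch of how the counting rule could be combined: allowlisted crawlers are never counted, a visitor is counted only while their CAPTCHA pass is less than 7 days old, and each visitor is counted at most once per article. Everything here is an assumption for illustration: the class `UniqueHitCounter`, the visitor token (for example, a cookie value), and the in-memory storage are all hypothetical, and a real site would persist the data.

```python
from datetime import datetime, timedelta

# Assumed validity window, taken from the "every 7 days" idea in the list.
CAPTCHA_VALIDITY = timedelta(days=7)


class UniqueHitCounter:
    """Counts at most one hit per visitor per article (in memory only)."""

    def __init__(self):
        # article_id -> set of visitor tokens already counted
        self._seen = {}

    def record(self, article_id, visitor_token, is_crawler,
               captcha_passed_at, now):
        """Return True if this request was counted as a new unique hit."""
        if is_crawler:
            return False  # allowlisted crawler: serve the page, do not count
        if captcha_passed_at is None:
            return False  # visitor never passed the CAPTCHA
        if now - captcha_passed_at > CAPTCHA_VALIDITY:
            return False  # pass has expired: visitor must solve it again
        seen = self._seen.setdefault(article_id, set())
        if visitor_token in seen:
            return False  # repeat visit by the same visitor
        seen.add(visitor_token)
        return True

    def unique_hits(self, article_id):
        return len(self._seen.get(article_id, ()))
```

A request handler would call `record(...)` once per page view, passing the time stored alongside the visitor's cookie as `captcha_passed_at`. Note that this sketch does nothing about the case where a human solves the CAPTCHA and then hands the cookie to a robot.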

Getting accurate human page views is critical for this site to work properly. Do you have any other ideas?

Answer 1:

You could just leave it to Google Analytics. It does a very good job of solving the kind of problem you're trying to solve, and it's free.



Answer 2:

Do you have a reason not to use an existing service or solution?

If you just want to monitor page hits, set up Google Analytics or a similar service on your site, and they'll do a better job of filtering out the noise than a hand-rolled solution possibly could.