Recording user data for heatmap with JavaScript

Posted 2019-01-21 10:52

Question:

I was wondering how sites such as crazyegg.com store user click data during a session. Obviously there is some underlying script which stores each click's data, but how is that data then populated into a database? It seems to me the simple solution would be to send the data via AJAX, but when you consider that it's almost impossible to get a cross-browser page unload function set up, I'm wondering if there is perhaps some other, more advanced way of getting metric data.

I even saw a site which records each mouse movement and I am guessing they are definitely not sending that data to a database on each mouse move event.

So, in a nutshell, what kind of technology would I need in order to monitor user activity on my site and then store this information in order to create metric data? I am not looking to recreate GA, I'm just very interested to know how this sort of thing is done.

Thanks in advance

Answer 1:

The fundamental idea behind many tracking systems is a 1x1px image that is requested with extra GET parameters. Each request is written to the server's log file, and the log files are then processed to generate statistics. So a minimalist click tracking function might look like this:

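document.onclick = function(e){
  e = e || window.event; // older IE exposes the event as window.event
  // Requesting the image sends the coordinates to the server as GET parameters
  var trackImg = new Image();
  trackImg.src = 'http://tracking.server/img.gif?x='+e.clientX+'&y='+e.clientY;
};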

AJAX wouldn't be useful here because it is subject to the same-origin policy: unless the tracking server explicitly allows cross-origin requests (CORS), the browser restricts requests from your pages to a tracking server on another domain. You'd also have to add AJAX code to your tracking script. If you want to send more data (like cursor movements), you'd store the coordinates in a variable and periodically request a new image with the updated data in the GET parameters.
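The buffering idea could be sketched roughly like this. The helper name `buildBeaconUrl`, the `p` parameter, the `x1,y1;x2,y2` packing and the 5-second interval are all assumptions made for illustration; the tracking URL is the placeholder from the snippet above.

```javascript
// Hypothetical helper: pack buffered points into a single GET parameter.
// The "x1,y1;x2,y2" format is an assumption, not a standard.
function buildBeaconUrl(baseUrl, points) {
  var packed = points.map(function (p) { return p.x + ',' + p.y; }).join(';');
  return baseUrl + '?p=' + encodeURIComponent(packed);
}

// Browser-only wiring; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  var buffer = [];
  document.onmousemove = function (e) {
    e = e || window.event;
    buffer.push({ x: e.clientX, y: e.clientY });
  };
  // Flush the buffer periodically by requesting a new tracking image.
  setInterval(function () {
    if (buffer.length === 0) return;
    var img = new Image();
    img.src = buildBeaconUrl('http://tracking.server/img.gif', buffer);
    buffer = [];
  }, 5000);
}
```

URLs have practical length limits, so a real script would probably need to flush more often or sample the movements rather than record every event.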

Now there are many many problems:

  • cross-browser compatibility - to make the above function work in all browsers that matter at the moment you'd probably have to add 20 more lines of code
  • getting useful data
    • many pages are fixed-width and centered, so raw X and Y coordinates won't let you create a visual overlay of clicks on the page
    • some pages have liquid-width elements, or use a combination of min- and max-height
    • users may use different font sizes
    • dynamic elements that appear on the page in response to user's actions
  • etc. etc.
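To illustrate the fixed-width, centered case from the list above: one possible normalization is to record X relative to the content column rather than the viewport. This is only a sketch; `toContentX` and its parameters are hypothetical names, and it assumes a single centered column and ignores scrollbars.

```javascript
// Convert a page X coordinate into an offset from the left edge of a
// centered, fixed-width content column (assumed layout).
function toContentX(pageX, viewportWidth, contentWidth) {
  // On viewports narrower than the column there is no side margin.
  var leftMargin = Math.max(0, (viewportWidth - contentWidth) / 2);
  return pageX - leftMargin;
}

// The same spot in a 960px column, seen on two different screen widths:
var onSmallScreen = toContentX(500, 1280, 960); // margin is 160
var onLargeScreen = toContentX(660, 1600, 960); // margin is 320
```

Both calls yield the same column-relative value, which is what makes clicks from different screens comparable on one overlay.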

When you have the tracking script worked out you only need to create a tool that takes raw server logs and turns them into shiny heatmaps :)



Answer 2:

Heatmap analytics turns out to be WAY more complicated than just capturing the cursor coordinates. Some websites are right-aligned, some are left-aligned, some are 100%-width, some are fixed-width and "centered"... A page element can be positioned absolutely or relatively, floated, etc. Oh, and there are also different screen resolutions and even multi-monitor configurations.

Here's how it works in HeatTest (I'm one of the founders, have to reveal that due to the rules):

  1. JavaScript handles the onClick event: document.onclick = function(e){ } (this will not work with <a> and <input> elements, so you have to hack your way around that)
  2. The script records the XPath address of the clicked element (since coordinates are not reliable, see above) in a form like //body/div[3]/button[@id='search'], along with the coordinates within the element.
  3. Script sends a JSONP request to the server (JSONP is used because of the cross-domain limitations in browsers)
  4. Server records this data into the database.
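Step 2 above could look something like the following. This is not HeatTest's actual code, just a minimal sketch: `elementPath` and `offsetWithin` are hypothetical names, and the path uses positional indexes only (no id predicates) to keep it short.

```javascript
// Build an XPath-like address by walking up the parentNode chain.
// Works on DOM elements, or on plain objects exposing the same fields.
function elementPath(el) {
  var parts = [];
  while (el && el.tagName) {
    var tag = el.tagName.toLowerCase();
    if (tag === 'body') { parts.unshift('body'); break; }
    // 1-based position among preceding siblings with the same tag name
    var index = 1;
    for (var s = el.previousElementSibling; s; s = s.previousElementSibling) {
      if (s.tagName === el.tagName) index++;
    }
    parts.unshift(tag + '[' + index + ']');
    el = el.parentNode;
  }
  return '//' + parts.join('/');
}

// Click position relative to the element's top-left corner.
// rect is what el.getBoundingClientRect() returns in a browser.
function offsetWithin(clientX, clientY, rect) {
  return { x: clientX - rect.left, y: clientY - rect.top };
}

// Browser usage (assumed wiring):
// document.addEventListener('click', function (e) {
//   var rect = e.target.getBoundingClientRect();
//   var record = {
//     path: elementPath(e.target),
//     at: offsetWithin(e.clientX, e.clientY, rect)
//   };
// });
```

Positional paths break when the page structure changes between recording and rendering, which is presumably one reason to include ids in the address where they exist.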

Now, the interesting part - the server.

  1. To calculate the heatmap the server launches a virtual instance of a browser in-memory (we use Chromium and IE9)
  2. Renders the page
  3. Takes a screenshot,
  4. Finds the elements' coordinates and then builds the heatmap.
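Once the recorded clicks have been resolved to page coordinates on the screenshot, the last step amounts to counting clicks per region. A minimal sketch of that accumulation, assuming a simple fixed grid (`buildHeatGrid` and `cellSize` are hypothetical; a production service would likely smooth and colorize the result):

```javascript
// Count clicks per grid cell. width/height are the screenshot size in
// pixels; cellSize is the side length of one square cell (an assumption).
function buildHeatGrid(points, width, height, cellSize) {
  var cols = Math.ceil(width / cellSize);
  var rows = Math.ceil(height / cellSize);
  var grid = [];
  for (var r = 0; r < rows; r++) {
    grid.push(new Array(cols).fill(0));
  }
  points.forEach(function (p) {
    var c = Math.floor(p.x / cellSize);
    var r = Math.floor(p.y / cellSize);
    // Ignore points that fall outside the screenshot.
    if (r >= 0 && r < rows && c >= 0 && c < cols) grid[r][c]++;
  });
  return grid;
}
```

Each cell count can then be mapped to a color and drawn over the screenshot.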

This takes a lot of CPU power and memory. A lot. So most heatmap services, including both us and CrazyEgg, have stacks of virtual machines and cloud servers for this task.



Answer 3:

I don't know the exact implementation details of how crazyegg does it, but the way I would do it is to store mouse events in an array which I'd send periodically over AJAX to the backend, e.g. the captured mouse events are collected and sent to the server every 30 seconds. This reduces the strain of creating a request for every event, and it also ensures that I will lose at most 30 seconds of data. You can also send the data in the unload event, which increases the amount of data you get, but you wouldn't be dependent on it.

Some example on how I'd implement it (using jQuery as my vanilla JS skills are a bit rusty):

$(function() {

    var clicks = [];

    // Capture every click on the page (delegated through document);
    // .on() requires jQuery 1.7+
    $(document).on('click', function(e) {
        clicks.push(e.pageX+','+e.pageY);
    });

    // Function to send clicks to server
    var sendClicks = function() {
        // Skip the request when nothing was clicked since the last send
        if (clicks.length === 0) {
            return;
        }
        // Clicks will be in format 'x1,y1;x2,y2;x3,y3...'
        var clicksToSend = clicks.join(';');
        clicks = [];
        $.ajax({
            url: 'handler.php',
            type: 'POST',
            data: {
                clicks: clicksToSend
            }
        });
    };

    // Send clicks every 30 seconds and on page leave
    setInterval(sendClicks, 30000);
    $(window).on('unload', sendClicks);
});

Note that I haven't tested or tried this in any way but this should give you a general idea.
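On the receiving end, the 'x1,y1;x2,y2' string has to be split back into points. The handler in the answer is PHP; the sketch below stays in JavaScript (as it might look in a Node backend) purely to keep one language throughout. `parseClicks` is a hypothetical name, and silently dropping malformed pairs is an assumed policy.

```javascript
// Matches one "x,y" pair of (optionally negative, optionally decimal) numbers.
var PAIR = /^(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)$/;

// Parse 'x1,y1;x2,y2;...' into an array of {x, y} objects,
// skipping anything that is not a well-formed pair.
function parseClicks(payload) {
  var points = [];
  String(payload || '').split(';').forEach(function (pair) {
    var m = PAIR.exec(pair);
    if (m) points.push({ x: Number(m[1]), y: Number(m[2]) });
  });
  return points;
}
```

Validating on the server matters here because the payload comes straight from the client and can contain anything.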



Answer 4:

I really don't see why you think it is impossible to store all the click points from one user session in the database.

Their motto is "See Where People Click". Once you have gathered enough data, it is fairly easy to make heat maps in batch processes.

People are really underestimating databases, indexing and sharding. The only hard thing here is to gather enough money for underlying architecture :)



Answer 5:

If you're just looking for interaction, you could replace your <input type="button"> with <input type="image">. These are automatically submitted with X,Y coordinates of where the user has clicked.

jQuery also has a good implementation of the mousemove event binding that can track the current mouse position. I don't know your desired end result, but you could call setInterval(submitMousePosition, 1000) to send an AJAX call with the mouse position every second or something like that.
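A rough sketch of that sampling idea: remember only the latest position and report it on a timer. `createPositionSampler` is a hypothetical helper, and the transport is passed in as a function so the logic does not depend on jQuery; the wiring in the comments reuses the 'handler.php' placeholder from the earlier answer.

```javascript
// Keep only the most recent mouse position and hand it to `send`
// at most once per flush.
function createPositionSampler(send) {
  var latest = null;
  return {
    // Call from a mousemove handler.
    record: function (x, y) { latest = { x: x, y: y }; },
    // Call from a timer; sends nothing if the mouse has not moved.
    flush: function () {
      if (latest === null) return false;
      send(latest);
      latest = null;
      return true;
    }
  };
}

// Browser usage with jQuery (assumed wiring):
// var sampler = createPositionSampler(function (pos) {
//   $.post('handler.php', pos);
// });
// $(document).on('mousemove', function (e) { sampler.record(e.pageX, e.pageY); });
// setInterval(sampler.flush, 1000);
```

Sampling once per second loses the path between samples, so the interval is a trade-off between detail and request volume.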