I was wondering how sites such as crazyegg.com store user click data during a session. Obviously there is some underlying script storing each click's data, but how is that data then populated into a database? It seems to me the simple solution would be to send the data via AJAX, but when you consider that it's almost impossible to get a cross-browser page unload handler set up, I'm wondering if there is perhaps some other, more advanced way of getting metric data.
I even saw a site which records each mouse movement and I am guessing they are definitely not sending that data to a database on each mouse move event.
So, in a nutshell, what kind of technology would I need in order to monitor user activity on my site and then store this information in order to create metric data? I am not looking to recreate GA, I'm just very interested to know how this sort of thing is done.
Thanks in advance
Don't know the exact implementation details of how crazyegg does it, but the way I would do it is to store mouse events in an array which I'd send periodically over AJAX to the backend – e.g. the captured mouse events are collected and sent to the server every 30 seconds. This reduces the strain of creating a request for every event, and it also ensures that I lose at most 30 seconds of data. You can also send the data in the unload event, which increases the amount of data you get, but you wouldn't be dependent on it.
An example of how I'd implement it (using jQuery as my vanilla JS skills are a bit rusty):
Note that I haven't tested or tried this in any way but this should give you a general idea.
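A minimal sketch of the batching idea described above, written in plain JavaScript rather than jQuery. The `/track` URL, the 30-second interval and the batch size are assumptions for illustration, and `navigator.sendBeacon` is used in place of a classic AJAX call because it is designed to still deliver data while the page is being closed.

```javascript
// Buffers events in memory and hands them to `send` in batches.
// `send` is a caller-supplied (hypothetical) function that delivers a batch to the backend.
function createEventBuffer(send, maxBatchSize = 500) {
  let events = [];

  function flush() {
    if (events.length === 0) return; // nothing to send
    const batch = events;
    events = [];                     // start a fresh buffer before sending
    send(batch);
  }

  function record(type, x, y, time = Date.now()) {
    events.push({ type, x, y, time });
    if (events.length >= maxBatchSize) flush(); // avoid unbounded growth
  }

  return { record, flush, size: () => events.length };
}

// Browser wiring (skipped when there is no DOM, e.g. under Node).
if (typeof document !== 'undefined') {
  const buffer = createEventBuffer(function (batch) {
    // '/track' is a placeholder endpoint, not a real API.
    navigator.sendBeacon('/track', JSON.stringify(batch));
  });
  document.addEventListener('click', function (e) {
    buffer.record('click', e.pageX, e.pageY);
  });
  setInterval(buffer.flush, 30000);                  // periodic upload
  window.addEventListener('pagehide', buffer.flush); // best-effort final upload
}
```

Because the periodic flush does most of the work, the final flush on page close is only a bonus: if it fails, at most one interval of data is lost.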
If you're just looking for interaction, you could replace your <input type="button"> with <input type="image">. These are automatically submitted with the X,Y coordinates of where the user has clicked.
jQuery also has a good implementation of the mousemove event binding that can track the current mouse position. I don't know your desired end result, but you could use setInterval(submitMousePosition, 1000) to send an AJAX call with the mouse position every second or something like that.
The fundamental idea behind many tracking systems is a 1x1px image which is requested with extra GET parameters. The request is added to the server log file, and the log files are then processed to generate statistics. So a minimalist click tracking function might look like this:
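A minimal sketch of the tracking-pixel approach. The tracker URL and the parameter names (`x`, `y`, `page`, `t`) are made up for illustration; any names work as long as the log-processing side knows them.

```javascript
// Builds the URL of a tracking pixel; the click data travels in the query string.
// `baseUrl` and the parameter names are hypothetical.
function buildPixelUrl(baseUrl, click) {
  const params = [
    'x=' + encodeURIComponent(click.x),
    'y=' + encodeURIComponent(click.y),
    'page=' + encodeURIComponent(click.page),
    't=' + encodeURIComponent(click.time)
  ];
  return baseUrl + '?' + params.join('&');
}

// Browser wiring (skipped when there is no DOM, e.g. under Node).
if (typeof document !== 'undefined') {
  document.addEventListener('click', function (e) {
    // Requesting the image is what sends the data; the response itself is ignored.
    new Image().src = buildPixelUrl('https://tracker.example.com/pixel.gif', {
      x: e.pageX,
      y: e.pageY,
      page: location.pathname,
      time: Date.now()
    });
  });
}
```

Each click then shows up as one line in the tracking server's access log, with the coordinates in the request's query string.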
AJAX wouldn't be useful because it is subject to the same-origin policy (you won't be able to send requests to a tracking server on another domain unless that server explicitly allows them via CORS). And you'd have to add AJAX code to your tracking script. If you want to send more data (like cursor movements), you'd store the coordinates in a variable and periodically request a new image with the updated path in a GET parameter.
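For cursor movements, the stored coordinates have to fit into the image URL. A rough sketch of one way to do that, assuming a made-up `path` parameter, a `x,y;x,y` encoding, and a 2000-character URL limit chosen as a conservative guess rather than taken from any standard:

```javascript
// Encodes sampled cursor positions as "x,y;x,y;..." and splits them over
// several pixel URLs so that no URL grows past maxLength characters.
// (A single point is always emitted, even if its URL alone exceeds the limit.)
function buildPathUrls(baseUrl, points, maxLength = 2000) {
  const toUrl = function (path) {
    return baseUrl + '?path=' + encodeURIComponent(path);
  };
  const urls = [];
  let current = '';
  for (const p of points) {
    const piece = Math.round(p.x) + ',' + Math.round(p.y);
    const candidate = current ? current + ';' + piece : piece;
    if (current && toUrl(candidate).length > maxLength) {
      urls.push(toUrl(current)); // close the current chunk
      current = piece;           // start a new chunk with this point
    } else {
      current = candidate;
    }
  }
  if (current) urls.push(toUrl(current));
  return urls;
}
```

A periodic timer would call this with the points collected since the last run and request each returned URL as an image.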
Now there are many many problems:
When you have the tracking script worked out you only need to create a tool that takes raw server logs and turns them into shiny heatmaps :)
I really don't see why you think it is impossible to store all click points from one user session in the database.
Their motto is "See Where People Click". Once you have gathered enough data, it is fairly easy to make heat maps in batch processes.
People are really underestimating databases, indexing and sharding. The only hard thing here is to gather enough money for underlying architecture :)
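To give an idea of what such a batch step could look like, here is a small sketch that counts stored clicks per grid cell; the resulting counts are what a heat map would colour. The 10-pixel cell size and the `{ x, y }` record shape are assumptions.

```javascript
// Counts clicks per grid cell. Keys are "column:row"; cellSize is in pixels.
function binClicks(clicks, cellSize = 10) {
  const counts = new Map();
  for (const { x, y } of clicks) {
    const key = Math.floor(x / cellSize) + ':' + Math.floor(y / cellSize);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Example input: three stored clicks, two of them in the same cell.
const exampleCounts = binClicks([{ x: 5, y: 5 }, { x: 9, y: 0 }, { x: 10, y: 5 }]);
```

In a real system the same aggregation would more likely run as a grouped query in the database, which is where the indexing and sharding come in.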
Heatmap analytics turns out to be WAY more complicated than just capturing the cursor coordinates. Some websites are right-aligned, some are left-aligned, some are 100%-width, some are fixed-width-"centered"... A page element can be positioned absolutely or relatively, floated etc. Oh, and there are also different screen resolutions and even multi-monitor configurations.
Here's how it works in HeatTest (I'm one of the founders, have to reveal that due to the rules):
document.onclick = function(e){ } (this will not work with <a> and <input> elements, have to hack your way around)
The clicked element's path, e.g. //body/div[3]/button[id=search], and the coordinates within the element.
Now, the interesting part - the server.
It takes a lot of CPU power and memory. A lot. So most heatmap services, including both us and CrazyEgg, have stacks of virtual machines and cloud servers for this task.
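Going back to the client side for a moment, the "element path plus coordinates within the element" step could be sketched roughly like this. It is not HeatTest's actual code: the path format (tag name plus 1-based position among same-tag siblings) and both function names are assumptions for illustration.

```javascript
// Builds a simple XPath-like path by walking up from the element to the root.
// Uses only parentElement, children and tagName, so it also works on plain mock objects.
function elementPath(el) {
  const parts = [];
  for (let node = el; node && node.parentElement; node = node.parentElement) {
    const sameTag = Array.from(node.parentElement.children)
      .filter(function (c) { return c.tagName === node.tagName; });
    const index = sameTag.indexOf(node) + 1; // XPath positions are 1-based
    parts.unshift(node.tagName.toLowerCase() + '[' + index + ']');
  }
  return '//' + parts.join('/');
}

// Converts viewport click coordinates into coordinates inside the element.
// `rect` is what el.getBoundingClientRect() returns in a browser.
function clickWithinElement(event, rect) {
  return { x: event.clientX - rect.left, y: event.clientY - rect.top };
}

// Browser wiring (skipped when there is no DOM, e.g. under Node).
if (typeof document !== 'undefined') {
  document.addEventListener('click', function (e) {
    const record = {
      path: elementPath(e.target),
      offset: clickWithinElement(e, e.target.getBoundingClientRect())
    };
    console.log(record); // a real tracker would queue this for upload instead
  }, true); // capture phase, so handlers that stop propagation do not hide the click
}
```

Recording positions relative to an element rather than the page is what makes clicks comparable across different alignments, widths and screen resolutions.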