This might be a long shot, but I am completely stumped about what might be causing this issue:
I am delivering a client-side JavaScript snippet that parses certain parameters on the page where it is embedded, uses these parameters to construct a URL, and injects an iframe with that URL into the page. For example:
var queryParams = {
param: 'foo'
, other: 'bar'
};
is turned into:
<iframe src="http://example.net/iframes/123?param=foo&other=bar"></iframe>
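A minimal sketch of this step, for illustration only. It uses the built-in URLSearchParams instead of the query-string library so that it runs on its own, and the helper name buildIframeSrc is made up for this example:

```javascript
// Hypothetical helper: turns a base URL and a plain object into an iframe src.
// URLSearchParams stands in for the query-string library here (an assumption).
function buildIframeSrc(baseUrl, queryParams) {
  const query = new URLSearchParams(queryParams).toString();
  return query ? baseUrl + '?' + query : baseUrl;
}

const src = buildIframeSrc('http://example.net/iframes/123', {
  param: 'foo',
  other: 'bar'
});

// In a browser the script would then inject the iframe, roughly like this:
// const iframe = document.createElement('iframe');
// iframe.src = src;
// document.body.appendChild(iframe);
```

Nothing in this step treats values differently from keys, which is why the shuffling is so puzzling.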
This works fine in general; I am serving around 1.5 million requests per day. Yet I recently noticed that in around 3,000 cases per day the values of the query parameters are shuffled, so something like this gets requested:
<iframe src="http://example.net/iframes/123?param=ofo&other=rba"></iframe>
Judging from the logs this is tied to specific users, and the jumbling of characters will happen anew on each request, so I can see sequences like this when a user is browsing the site with multiple pages using the script:
108.161.183.122 - - [14/Sep/2015:15:18:51 +0000] "GET /iframe/ogequl093iwsfr8n?param=3a1bc2 HTTP/1.0" 401 11601 "http://www.example.net/gallery?page=1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"
108.161.183.122 - - [14/Sep/2015:15:19:07 +0000] "GET /iframe/ogequl093iwsfr8n?param=a21b3c HTTP/1.0" 401 11601 "http://www.example.net/gallery?page=2" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"
108.161.183.122 - - [14/Sep/2015:15:19:29 +0000] "GET /iframe/ogequl093iwsfr8n?param=ba132c HTTP/1.0" 401 11601 "http://www.example.net/gallery?page=3" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"
The 401 is returned on purpose, as the server expects param=abc123.
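Since every bad value in the logs above is a permutation of the expected one, such requests can be told apart from ordinary wrong values. A minimal sketch of that check, assuming a hypothetical helper named isShuffled (it is not part of the script or of query-string):

```javascript
// Hypothetical helper: true when `received` contains exactly the same
// characters as `expected`, but in a different order.
function isShuffled(received, expected) {
  if (received === expected || received.length !== expected.length) {
    return false;
  }
  // Sorting the characters gives a canonical form that ignores order.
  const sortChars = (s) => Array.from(s).sort().join('');
  return sortChars(received) === sortChars(expected);
}

// Values taken from the log lines above.
const observed = ['3a1bc2', 'a21b3c', 'ba132c'];
const allShuffled = observed.every((value) => isShuffled(value, 'abc123'));
```

A check like this only classifies the requests; it does not say what is doing the shuffling.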
I also noticed that the majority of errors happen in Firefox and Safari; not a single erroneous URL has been requested by Google Chrome.
The library I am using for turning the object into a query string is: query-string - but looking at the source code I cannot see any potential for a bug of that kind in there, there's nothing that is done to the values which is not done to the keys (which are not messed up).
Has anyone ever encountered anything similar? Is this some weird browser extension? Is this a collision of my script with another library extending prototypes? Is this malware? Is this something I am completely unaware of? I'd be thankful for any hint because I am really clueless and this is really driving me crazy.
EDIT: I just discovered that another of our public-facing services is currently being probed by something called "Burp Suite". Having a look at their website, I see they have a tool called "Payload fuzzing" which seems to do pretty much what is described here: https://portswigger.net/burp/help/intruder_gettingstarted.html or here: https://portswigger.net/burp/help/intruder_using.html#uses_enumerating - The whole tool smells semi-fishy to me, so this might be something worth investigating further. Has anyone else ever heard of this toolset?
As I already mentioned in "Google Analytics Event Permutation", there is a specific version (at least 1.0.37) of the Firefox add-on "Cliqz" that has anti-tracking functionality built in.
There is not much to analyze at this point, and since you're looking for hints, this is more of a long comment than an answer.
Malware on the client's browser or machine, malware on your web server, or an unknown crawler could be causing this, but I consider that unlikely. To me, it looks like your application is being attacked.
Let's see:
- The real example (in the comments) shows that 128-bit hexadecimal access keys are being shuffled (the values of the accessKey param).
- Only values get shuffled and not keys.
- You say, requests are coming from specific users.
- You say, requests are coming from specific browser clients (Firefox and Safari).
What to check/do:
- Check if your logging system works properly. If you're using a third-party, configurable logger, this could mess things up. (example)
- Reproduce: Take the same exact set of parameters; use the same version of browser(s) and see if the results are the same. If so, it could be a browser-version issue, which is highly unlikely.
- Check if there are other Firefox and Safari users (with same versions) that do NOT experience this.
- Since you say it's only a small percentage of the requests, check whether the corresponding requests are made right after one another. (The same kind of request in less than a second?)
- Try tracing the source of the requests. Are they coming from a source you suspect? Can you relate information from different requests to each other? Multiple IPs from the same subnet? The same IP using different accounts? The same account using different IPs in a short period of time?
- There are tools such as apache-scalp, mod_sec, lorg to check/analyze big log files to extract possible attacks.
- You can also use some of the techniques mentioned here to manually spot or block suspicious requests.
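As a rough illustration of the tracing step, failing requests can be grouped by client IP, subnet and user agent. This is only a sketch: it assumes the combined log format and IPv4 addresses shown in the question, and summarizeFailures is a hypothetical name, not part of any of the tools listed above.

```javascript
// Leading fields of a combined-log-format line:
// IP, identity, user, [timestamp], "request", status, bytes, "referer", "user agent"
const LOG_LINE = /^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"/;

// Hypothetical helper: counts requests with the given status per IP,
// per /24 subnet (IPv4 only, an assumption) and per user agent.
function summarizeFailures(lines, status = '401') {
  const byIp = new Map();
  const bySubnet = new Map();
  const byAgent = new Map();
  const bump = (map, key) => map.set(key, (map.get(key) || 0) + 1);

  for (const line of lines) {
    const match = LOG_LINE.exec(line);
    if (!match || match[2] !== status) continue; // skip unparsable or non-matching lines
    const ip = match[1];
    const agent = match[3];
    bump(byIp, ip);
    bump(bySubnet, ip.split('.').slice(0, 3).join('.') + '.0/24');
    bump(byAgent, agent);
  }
  return { byIp, bySubnet, byAgent };
}
```

If a handful of IPs or one subnet accounts for most of the failures, that points to a few clients rather than a general bug in the script.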
I am Tomas and I am a Software Engineer at CLIQZ.
We are a German startup integrating search and innovative privacy features into browsers. This is indeed a result of our Anti Tracking feature. A similar question was also asked on reddit and in another question on stackoverflow. It was already answered in both posts, so I will just quote the same answer here:
CLIQZ Anti Tracking is not designed to block tracking in general, but rather only the tracking of individual users — which we consider a violation of our users’ privacy, and therefore inappropriate. Unlike other anti-tracking systems, ours doesn’t block the signals completely; thus, website owners are able to get data for legitimate uses, such as counting visits.
To prevent the identification of users (e.g. by using JavaScript hashes), CLIQZ Anti Tracking does in fact permute strings. Whenever a new tracker shows up in our data, our system initially treats it as a user-identifying tracker and changes the string to preventively protect our users. Our system uses so called k-anonymity techniques. If it sees the same string for an event with multiple users showing up independently over the course of several days, it puts it on a whitelist of legitimate, non-identifying trackers. Once a tracker is whitelisted, it remains unmodified and website-owners see the original string. In other words, CLIQZ Anti Tracking limits the functionality of legitimate trackers only temporarily. As soon as it becomes clear that a tracker doesn’t violate our user’s privacy, everything works as usual. Privacy is extremely important to us and we believe this technology is necessary to protect our users from snooping.
I hope this helps.
It seems highly unlikely to me that this behaviour has its roots in either your code or the query-string code. Given that query string values can be freely altered by whoever sends the request, I suspect that is what is occurring. Bear in mind that this is 0.2% of your requests.
There are a couple of things I would check. Are you aware of whether these requests are referred from other websites, your own website, or made directly? Are you aware of whether any of the source IPs correspond to known bots or web crawlers? Are the requests from a variety of sources or a small subset of repeated visitors?
It is possible that a bot or web crawler is "lightly probing your site" or testing for duplicate pages or misleading parameters.
Some robot is crawling your site, which is quite normal. If you don't want it to put load on your server, block the requesting IP.