My goal is to "whitelist" certain querystring attributes and their values so that Varnish does not vary the cache between URLs that differ only in those attributes.
Example:
Url 1: http://foo.com/someproduct.html?utm_code=google&type=hello
Url 2: http://foo.com/someproduct.html?utm_code=yahoo&type=hello
Url 3: http://foo.com/someproduct.html?utm_code=yahoo&type=goodbye
In the above example I want to whitelist "utm_code" but not "type". After the first URL is hit, I want Varnish to serve that cached content for the second URL.
However, in the case of the third URL, the value of the "type" attribute is different, so that should be a Varnish cache miss.
I have tried the 2 methods below (found in a Drupal help article I can't locate right now), but neither seemed to work. That might be because I have the regex wrong.
# 1. strip out certain querystring values so varnish does not vary cache.
set req.url = regsuball(req.url, "([\?|&])utm_(campaign|content|medium|source|term)=[^&\s]*&?", "\1");
# get rid of trailing & or ?
set req.url = regsuball(req.url, "[\?|&]+$", "");
# 2. strip out certain querystring values so varnish does not vary cache.
set req.url = regsuball(req.url, "([\?|&])utm_campaign=[^&\s]*&?", "\1");
set req.url = regsuball(req.url, "([\?|&])foo_bar=[^&\s]*&?", "\1");
set req.url = regsuball(req.url, "([\?|&])bar_baz=[^&\s]*&?", "\1");
# get rid of trailing & or ?
set req.url = regsuball(req.url, "[\?|&]+$", "");
You want to strip out utm_code, but it's not covered by either of the regexps you are using. Try this:
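A minimal sketch of such a rule, reusing the pattern from the question with utm_code as the parameter name. The exact regex is an assumption rather than a tested configuration, and the statements are meant to be merged into an existing vcl_recv:

```vcl
sub vcl_recv {
    # Strip utm_code (and a following "&") so it no longer affects the cache key.
    set req.url = regsuball(req.url, "([?&])utm_code=[^&]*&?", "\1");
    # Remove any "?" or "&" left dangling at the end of the URL.
    set req.url = regsub(req.url, "[?&]+$", "");
}
```

With this in place, Url 1 and Url 2 from the question should both be rewritten to /someproduct.html?type=hello and share one cache entry, while Url 3 becomes /someproduct.html?type=goodbye and is cached separately.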
Or if you want to strip all URL parameters that start with utm_, you can go with:
From https://github.com/mattiasgeniar/varnish-4.0-configuration-templates:
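A minimal sketch of that kind of rule, stripping every parameter whose name starts with utm_. It is not copied from the linked repository; the character class used for parameter names and the placement in vcl_recv are assumptions:

```vcl
sub vcl_recv {
    # Only rewrite the URL when a utm_* parameter is actually present.
    if (req.url ~ "(\?|&)utm_[a-z_]+=") {
        # Drop utm_* parameters that follow an ampersand.
        set req.url = regsuball(req.url, "&utm_[a-z_]+=[^&]*", "");
        # Drop a utm_* parameter directly after the question mark, keeping the "?".
        set req.url = regsuball(req.url, "\?utm_[a-z_]+=[^&]*", "?");
        # Tidy up a leftover "?&" or a trailing "?".
        set req.url = regsub(req.url, "\?&", "?");
        set req.url = regsub(req.url, "\?$", "");
    }
}
```

Handling the "&name=value" and "?name=value" cases separately keeps the remaining parameters and their separators intact, at the cost of two extra cleanup steps.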
I figured this out and wanted to share. I found this code that makes a subroutine that does what I need.
This is a copy of runamok's answer, but I got + instead of %20 in my params, so I have added that to my regex.
There's something wrong with the RegEx.
I changed the RegExes used in both regsub calls:
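The first change is the character class "[%._A-z0-9-]": in the original, the dash sat between two characters and therefore acted as a range operator, so I moved it to the end. The dot should be escaped as well (inside a character class it is already literal, so this is optional).
The second change removes not only a trailing question mark from the remaining URL, but also a trailing ampersand, or a question mark followed by an ampersand.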
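To make the two changes concrete, here is a rough sketch. The subroutine name normalize_req_url and the parameter list are hypothetical, chosen only for illustration, and "+" is included in the value class as suggested in the earlier answer:

```vcl
sub normalize_req_url {
    # Hypothetical parameter list; adjust it to the parameters you want ignored.
    if (req.url ~ "(\?|&)(utm_[a-z]+|gclid)=") {
        # The dash is placed last in the class so it is a literal "-", not a range.
        # Note: this pattern is not anchored to "?" or "&", so a parameter whose
        # name merely ends in one of these names would also be affected.
        set req.url = regsuball(req.url, "(utm_[a-z]+|gclid)=[%._A-z0-9+-]*&?", "");
    }
    # Remove a dangling "?", "&" or "?&" left at the end of the URL.
    set req.url = regsub(req.url, "(\?&?|&)$", "");
}

sub vcl_recv {
    call normalize_req_url;
}
```

For example, /someproduct.html?type=hello&utm_code=yahoo would first become /someproduct.html?type=hello& and then lose the trailing ampersand in the final regsub.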
Have you guys given this a try? https://github.com/Dridi/libvmod-querystring
Example
set req.url = querystring.regfilter(req.url, "utm_.*");