This is the full string I'm trying to parse with regular expressions:
example.com/newsite.com.html?var=/newsite.com&var=newsite.com
I would like to be able to match the newsite.com part of the string, but only if it does not appear after the ? symbol.
So far, I've only gotten this far:
/newsite.com/g
This selects all the instances of newsite.com instead of just the first one.
Link to the regexp playground http://regexr.com/3fmre
EDIT:
Here * represents everything I would like to ignore, essentially matching only the first occurrence of newsite.com:
example.com/newsite.com.html?****************************
Here are two ways of doing it.
You could use a RewriteCond and test the REQUEST_URI only. QUERY_STRING is not part of REQUEST_URI, so something like:

This is the solution I came up with. It's not bullet-proof, but it works for the most part:
It ignores anything that starts with = and doesn't have a / at the beginning.
Note that this only works with regex implementations that support "positive lookbehind".
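
To illustrate the lookbehind idea in general (this is not the expression from the answer above, which isn't reproduced here), here is a minimal sketch. It assumes a JavaScript engine with ES2018 lookbehind support, which allows variable-length lookbehinds; many other flavours, PCRE included, do not. The name findBeforeQuery is made up for the example.

```javascript
// Hypothetical helper, not the answer's original expression.
// (?<=^[^?]*) asserts that everything between the start of the string and
// the match contains no "?", so occurrences inside the query string fail.
const beforeQuery = /(?<=^[^?]*)newsite\.com/g;

function findBeforeQuery(url) {
  // matchAll yields one entry per match; keep only each start index.
  return [...url.matchAll(beforeQuery)].map((m) => m.index);
}

const url = "example.com/newsite.com.html?var=/newsite.com&var=newsite.com";
console.log(findBeforeQuery(url)); // [ 12 ]
```

Only the occurrence in the path is reported, even with the g flag, because the lookbehind rejects anything that comes after a ?.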
You can use [^?], which is a negated character class (an "exclusion group"): it matches any single character except the ones listed in it, in this case ?. The expression /^[^?]*/, for instance, will match everything from the start of the string up to the first ? (which won't be part of the match).
If you want the match to begin at newsite.com, you can use /newsite\.com[^?]*/; to match everything after the last ? up to the end, you can use /[^?]*$/. Note that /newsite\.com[^?]*/ on its own is not limited to the part before the ?, so it can also match an occurrence inside the query string.
Since you tagged mod-rewrite, you also have the option to use %{QUERY_STRING} as a condition. The query string is the name for whatever comes after the ? in the full URL.
Using RewriteCond %{QUERY_STRING} newsite.com, for instance, means that the RewriteRule following this condition only applies if newsite.com is found in the query string.
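
As a rough sketch of how the /^[^?]*/ expression separates the two parts, assuming plain JavaScript and the URL from the question: splitAtQuery is a made-up helper name, and the split only mimics the path / query-string distinction that mod_rewrite makes, it is not how Apache does it internally.

```javascript
// Hypothetical helper; only the expression /^[^?]*/ comes from the answer.
function splitAtQuery(url) {
  // ^[^?]* always matches (possibly the empty string), so [0] is safe.
  const path = url.match(/^[^?]*/)[0];
  // Whatever follows the first "?" is the query string; empty if there is none.
  const query = url.slice(path.length + 1);
  return { path, query };
}

const url = "example.com/newsite.com.html?var=/newsite.com&var=newsite.com";
const { path, query } = splitAtQuery(url);

console.log(path);                        // example.com/newsite.com.html
console.log(/newsite\.com/.test(path));   // true
console.log(query);                       // var=/newsite.com&var=newsite.com
console.log(/newsite\.com/.test(query));  // true
```

Testing only path answers "does it appear before the ?", while testing only query corresponds to the %{QUERY_STRING} condition. The dot is escaped here so that it matches a literal dot rather than any character.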