Regexp that ignores everything after a question mark

Posted 2019-09-12 22:41

Question:

This is the full string I'm trying to parse with regular expressions:

example.com/newsite.com.html?var=/newsite.com&var=newsite.com

I would like to match the newsite.com part of the string, but only if it does not appear after the ? symbol.

This is as far as I've gotten:

/newsite.com/g

This selects every instance of newsite.com instead of just the first one.

Link to the regexp playground: http://regexr.com/3fmre

EDIT:

Here * represents everything I would like to ignore, so that only the first occurrence of newsite.com is matched:

example.com/newsite.com.html?****************************

Answer 1:

You could use a RewriteCond and test only the REQUEST_URI. QUERY_STRING is not part of REQUEST_URI, so something like:

RewriteCond %{REQUEST_URI} newsite\.com
# Your RewriteRule goes here; it applies only if the condition above matched
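
The same idea, testing only the part of the URL before the ?, can also be sketched outside Apache. This is a minimal JavaScript illustration, not mod_rewrite behavior itself; the helper name pathContains and the second test URL are made up for the example.

```javascript
// Hypothetical helper: test a pattern against the part of a URL before "?",
// roughly mirroring how REQUEST_URI excludes the query string.
function pathContains(url, pattern) {
  const q = url.indexOf("?");
  // No "?" means there is no query string, so the whole URL is the path.
  const path = q === -1 ? url : url.slice(0, q);
  return pattern.test(path);
}

const url = "example.com/newsite.com.html?var=/newsite.com&var=newsite.com";
const queryOnly = "example.com/index.html?var=newsite.com"; // assumed test input

console.log(pathContains(url, /newsite\.com/));       // true
console.log(pathContains(queryOnly, /newsite\.com/)); // false
```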


Answer 2:

You can use [^?], a negated character class: it matches any character except the ones listed in it, in this case ?. For instance, the expression /^[^?]*/ matches everything from the start of the string up to the first ? (which won't be part of the match).

If you want the match to begin at newsite.com, you can use /newsite\.com[^?]*/; to match what follows the ? up to the end of the string, you can use /[^?]*$/.
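
To see what each of these expressions returns on the string from the question, here is a small JavaScript sketch. The last pattern, anchored at the start, is not from the answer; it is an assumed variant showing one way to stop /newsite\.com/ from matching inside the query string when the path has no occurrence.

```javascript
const url = "example.com/newsite.com.html?var=/newsite.com&var=newsite.com";

// Everything from the start up to (not including) the first "?"
const beforeQuery = url.match(/^[^?]*/)[0];

// From the first newsite.com up to (not including) the next "?"
const fromSite = url.match(/newsite\.com[^?]*/)[0];

// Everything after the last "?" up to the end of the string
const afterQuery = url.match(/[^?]*$/)[0];

console.log(beforeQuery); // example.com/newsite.com.html
console.log(fromSite);    // newsite.com.html
console.log(afterQuery);  // var=/newsite.com&var=newsite.com

// Assumed variant (not from the answer): [^?]* cannot cross a "?",
// so anchoring at ^ restricts the match to the part before the query.
const onlyInPath = /^[^?]*newsite\.com/;
console.log(onlyInPath.test(url));                                      // true
console.log(onlyInPath.test("example.com/index.html?var=newsite.com")); // false
```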

Since you tagged mod-rewrite, you also have the option of using %{QUERY_STRING} as a condition. The query string is simply what we call the part of the full URL after the ?.

For instance, using RewriteCond %{QUERY_STRING} newsite\.com means that the RewriteRule following this condition applies only if newsite.com is found in the query string.



Answer 3:

This is the solution I came up with. It's not bulletproof, but it works for the most part:

(?<=[^=]\/)newsite\.com

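It matches newsite.com only when it is immediately preceded by a / that does not itself follow an =. That skips both =/newsite.com and any occurrence that has no / directly in front of it, such as =newsite.com.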

Note that this only works with regex implementations that support "positive lookbehind".
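
As a quick check of the lookbehind, the JavaScript sketch below lists where the pattern matches in the string from the question. The second input is an assumed example, chosen to show one case where the pattern is not bulletproof; it requires a JavaScript engine with lookbehind support.

```javascript
const url = "example.com/newsite.com.html?var=/newsite.com&var=newsite.com";

// newsite.com only when preceded by "/" whose own predecessor is not "="
const re = /(?<=[^=]\/)newsite\.com/g;

// Collect the start index of every match
const positions = [...url.matchAll(re)].map((m) => m.index);
console.log(positions); // [12], only the occurrence before the "?"

// Assumed counterexample: a "/" inside the query that does not directly follow "="
const tricky = "example.com/a.html?var=x/newsite.com";
const falsePositive = /(?<=[^=]\/)newsite\.com/.test(tricky);
console.log(falsePositive); // true, even though the match is after the "?"
```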



Answer 4:

Here are two ways of doing it:

var str = "example.com/newsite.com.html?var=/newsite.com&var=newsite.com";

// 1. Look for "?" and take the substring from 0 up to that index
//    (if there is no "?", keep the whole string)
var foundAt = str.indexOf('?');
var path = foundAt === -1 ? str : str.slice(0, foundAt);
document.getElementById("substr").innerHTML = path;

// 2. Use a regex to find the index of the first occurrence of newsite.com
//    and ignore the rest (note: this overwrites the output of step 1)
var loc = str.search(/newsite\.com[^?]*/);
document.getElementById("substr").innerHTML = loc;

// HTML element the script writes to:
<p id="substr"></p>