Comparing user-specific URL list with current URL?

Posted 2019-08-25 05:01

I have a whitelist where users can enter specific URLs/URL patterns (only targeting http and https).

I would like to transform and compare these URLs/URL patterns so that the wildcard selector (*) can be used like so:

user enters: example.*/test

I want to transform this to: *//*.example.*/test

so that it matches: http://www.example.com/test, https://example.co.uk/test

Another example:

user enters: http://www.*.com/*

I want to transform this to: http://www.*.com/*

so that it matches: http://www.blah.com/test, http://www.other.com/null.html

and

user enters: www.example.com/*

I want to transform this to: *//www.example.com/*

so that it matches: http://www.example.com/testtwo, https://www.example.com/arfg

The reason I want to insert a leading protocol (if it wasn't included by the user) is because I am using this to compare against the current tab URL.

I get this array of URL strings and would like to compare them with the current url, but am having trouble matching all use cases:

 "isNotWhitelisted" : function(){
      var whitelist = MyObject.userLists.whitelist;
      var currentUrl = document.location.href;
      for(var i=0; i<whitelist.length; i++){
          var regexListItem = new RegExp(whitelist[i].toString().replace(".", "\\.").replace("*", ".+"));
          if(currentUrl.match(regexListItem)) {
              return false;
          }
      }
      return true;
  },

  1. Firstly, the regex conversion handles patterns with the wildcard at the end (e.g. example.com/*) but not ones like example.*/about.

  2. This is part of a Chrome extension. Is there a better/easier way to do this, maybe using built-in methods?

Thanks for any help in advance.

3 answers
Animai°情兽
#2 · 2019-08-25 05:36

Hm, maybe create a RegExp from the whitelist items? If this works as you expect:

new RegExp('example.com/*').test('http://example.com/aaaa')

then just create a RegExp from each item in the whitelist:

whitelist.forEach(function(item) {
  new RegExp(item).test(URL); // RegExp has test(); match() is a String method
});

Rolldiameter
#3 · 2019-08-25 05:43

whitelist.forEach(function(listItem){
     var rgx = new RegExp(listItem.replace(/\./g,'\\.').replace(/\*/g,'.*'));
     if(rgx.test(url)) {
       // current URL matches URL/URL pattern in whitelist array! 
     }  
  })

If you don't escape the dots, the pattern 'www.*.com' also matches 'wwwocom'.
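A small sketch of why both the g flag and the escaping matter. When the first argument to String.prototype.replace is a string, only the first occurrence is replaced, which would explain why the code in the question copes with a single trailing wildcard but not with patterns containing several dots or asterisks. The sample pattern strings are only for illustration.

```javascript
// A string search argument replaces only the FIRST occurrence.
var firstOnly = 'www.example.com/*'.replace('.', '\\.');
// A regex with the g flag replaces EVERY occurrence.
var all = 'www.example.com/*'.replace(/\./g, '\\.');

console.log(firstOnly); // www\.example.com/*
console.log(all);       // www\.example\.com/*

// Unescaped dots match any character, so 'wwwocom' is accepted.
var unescaped = new RegExp('www.*.com').test('wwwocom'); // true

// With dots escaped and * turned into .*, 'wwwocom' is rejected.
var source = 'www.*.com'.replace(/\./g, '\\.').replace(/\*/g, '.*');
var escaped = new RegExp(source).test('wwwocom'); // false
```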

If you want to escape other special characters as well, you can use this:

var rgx = new RegExp(listItem.replace(/(\.|\[|\]|\{|\}|\(|\)|\+|\?|\\|\$|\^)/g,'\\$1').replace(/\*/g,'.*'));
兄弟一词,经得起流年.
#4 · 2019-08-25 05:43

If you want greedy matching, I think you need to ask the user to enter the pattern in this format: *://*/*

You can check it like this:

var special_char_rgx = /(\.|\[|\]|\{|\}|\(|\)|\+|\?|\\|\/|\$|\^|\|)/g; // I think that all...
var asterisk_rgx = /\*/g;
var pattern_rgx = /^([^:\/]+):\/\/([^\/]+)\/(.*)$/g;

function addPatern(pattern, whitelist) {
    var str_pattern = pattern.replace(asterisk_rgx,'\\*');
    var isMatch = pattern_rgx.test(str_pattern);
    if (isMatch) {
        pattern = pattern.replace(special_char_rgx,'\\$1').replace(asterisk_rgx, '.+');
        whitelist.push(new RegExp('^'+pattern + '$'));
    }

    pattern_rgx.lastIndex = 0; // With the g flag, test() resumes from lastIndex, so reset it or later tests can fail

    return isMatch;
}
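
If you would rather not force the user to type the scheme, one option is to normalize the entry when building the regex. The following is only a sketch under assumptions taken from the question's examples: an entry without a scheme should match both http and https, and may be preceded by any subdomain. The names wildcardToSource, whitelistEntryToRegExp and isWhitelisted are made up for this illustration.

```javascript
// Escape every regex metacharacter except *, then turn * into .*
function wildcardToSource(text) {
    return text.replace(/[.+?^${}()|[\]\\\/]/g, '\\$&').replace(/\*/g, '.*');
}

// Hypothetical helper: build an anchored RegExp from one whitelist entry.
function whitelistEntryToRegExp(entry) {
    var hasScheme = /^[^:\/]+:\/\//.test(entry);
    if (hasScheme) {
        return new RegExp('^' + wildcardToSource(entry) + '$');
    }
    // Assumption: no scheme means http or https, with an optional subdomain prefix.
    return new RegExp('^https?:\\/\\/(?:[^\\/]*\\.)?' + wildcardToSource(entry) + '$');
}

// True if any whitelist entry matches the whole URL.
function isWhitelisted(url, whitelist) {
    return whitelist.some(function (entry) {
        return whitelistEntryToRegExp(entry).test(url);
    });
}

var sampleList = ['example.*/test', 'http://www.*.com/*', 'www.example.com/*'];
console.log(isWhitelisted('https://example.co.uk/test', sampleList));     // true
console.log(isWhitelisted('http://www.other.com/null.html', sampleList)); // true
console.log(isWhitelisted('https://www.blah.org/test', sampleList));      // false
```

Note that * becomes .* here, so a wildcard can also span slashes; if that is too permissive for the host part, [^/]* could be used there instead.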

If you want to handle the protocol, domain and path in different ways, you can do it like this:

    if (isMatch) {
        var protocol = RegExp.$1;
        var domain= RegExp.$2;
        var path_query = RegExp.$3;            
        // Your logic...
    }
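
RegExp.$1 and its siblings are legacy static properties, so a variant that reads the capture groups from the match result itself may be safer. This is a sketch rather than a drop-in replacement: patternParts and splitPattern are hypothetical names, and the regex is the same shape as pattern_rgx but without the g flag, so there is no lastIndex to reset.

```javascript
// Same groups as pattern_rgx: protocol, domain, path + query.
var patternParts = /^([^:\/]+):\/\/([^\/]+)\/(.*)$/;

// Hypothetical helper: returns the three parts, or null if the
// pattern is not in the protocol://domain/path format.
function splitPattern(pattern) {
    var m = patternParts.exec(pattern);
    if (m === null) {
        return null;
    }
    return { protocol: m[1], domain: m[2], pathQuery: m[3] };
}

var parts = splitPattern('*://*.example.com/test/*');
console.log(parts.protocol);  // *
console.log(parts.domain);    // *.example.com
console.log(parts.pathQuery); // test/*
console.log(splitPattern('example.com/test')); // null
```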