Get duplicate characters in string

Posted 2020-02-16 02:33

I'm trying to match all repeated substrings in a string. This is what I have so far:

var str = 'abcabc123123';
var REPEATED_CHARS_REGEX = /(.).*\1/gi;

console.log( str.match(REPEATED_CHARS_REGEX) ); // => ['abca', '1231']

As you can see, the match result is ['abca', '1231'], but I expect to get ['abc', '123']. Any ideas on how to accomplish that?

2nd question:

I would also like to be able to set how many times a substring has to occur in the string for it to be matched.

For example, if the string is abcabcabc and the repetition count is set to 2, the result should be ['abcabc']. If it is set to 3, it should be ['abc'].

Update

A non-RegExp solution is perfectly alright!

3 answers
够拽才男人
Reply #2 · 2020-02-16 03:15

The answer above returns more duplicates than there actually are. The second for loop causes the problem and is unnecessary. Try this:

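function stringParse(string) {
  var arr = string.split("");
  for (var i = 0; i < arr.length; i++) {
    var letterToCompare = arr[i];
    // Compare each character only with the one right after it,
    // so only adjacent duplicates (e.g. "aa") are reported.
    var j = i + 1;
    if (letterToCompare === arr[j]) {
      console.log('duplicate found');
      console.log(letterToCompare);
    }
  }
}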
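
The loop above only reports characters that sit next to each other. If the goal is to list every character that occurs more than once anywhere in the string, one possible approach is to count occurrences first. This is only a minimal sketch; the helper name duplicateChars and its minCount parameter are made up for illustration and are not part of the original answer.

```javascript
// Hypothetical helper: returns each character that occurs at least
// `minCount` times in `str`, in order of first appearance.
function duplicateChars(str, minCount = 2) {
  const counts = new Map(); // a Map preserves insertion order
  for (const ch of str) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  return [...counts]
    .filter(([, n]) => n >= minCount)
    .map(([ch]) => ch);
}

console.log(duplicateChars('abcabc123123')); // ['a', 'b', 'c', '1', '2', '3']
console.log(duplicateChars('aabbc'));        // ['a', 'b']
```

Raising minCount gives a rough per-character version of the "repetition count" asked about in the question.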
Evening l夕情丶
Reply #3 · 2020-02-16 03:26

This solution may be used if you don't want to use regex:

function test() {
    var stringToTest = 'find the first duplicate character in the string';
    var a = stringToTest.split('');
    for (var i=0; i<a.length; i++) {
        var letterToCompare = a[i];
        for (var j=i+1; j<a.length; j++) {
            if (letterToCompare == a[j]) {
                console.log('first Duplicate found');
                console.log(letterToCompare);
                return false;
            }
        }
    }
}
test()
Deceive 欺骗
Reply #4 · 2020-02-16 03:35

Well, I think falsetru had a good idea with a zero-width look-ahead.

'abcabc123123'.match(/(.+)(?=\1)/g)
// ["abc", "123"]

This allows it to match just the initial substring while ensuring at least 1 repetition follows.
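
To make the effect of the look-ahead concrete, here is a small side-by-side comparison (the variable names are illustrative only). Without the look-ahead the repetition is consumed and becomes part of the match; with it, the repetition is only checked for.

```javascript
const str = 'abcabc123123';

// The back-reference is consumed, so each match includes the repetition.
const consuming = str.match(/(.+)\1/g);

// The back-reference is only asserted, so each match is the first copy alone.
const lookahead = str.match(/(.+)(?=\1)/g);

console.log(consuming); // ['abcabc', '123123']
console.log(lookahead); // ['abc', '123']
```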

For M42's follow-up example, it could be modified with a .*? to allow for gaps between repetitions.

'abc123ab12'.match(/(.+)(?=.*?\1)/g)
// ["ab", "12"]

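Then, to find where the repetition starts when several occurrences sit together, a quantifier ({n}) can be added to the capture group. Note that a quantified capture group keeps only its last iteration, so \1 here refers to whatever the second pass of (.+) matched, not to the whole repeated unit: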

'abcabc1234abc'.match(/(.+){2}(?=.*?\1)/g)
// ["abcabc"]

Or, to match just the initial substring when a given number of repetitions follows, add the quantifier inside the look-ahead:

'abc123ab12ab'.match(/(.+)(?=(.*?\1){2})/g)
// ["ab"]

It can also require a minimum number of repetitions by using a range quantifier with no upper bound, {2,}:

'abcd1234ab12cd34bcd234'.match(/(.+)(?=(.*?\1){2,})/g)
// ["b", "cd", "2", "34"]
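
For the second part of the question (making the repetition count configurable), the look-ahead patterns above could be wrapped in a function that builds the regex from a number. This is a rough sketch rather than a definitive implementation; findRepeated and minRepeats are hypothetical names, and the pattern uses a non-capturing group (?:...) inside the look-ahead, which does not change the results.

```javascript
// Hypothetical helper: returns substrings that are followed by at least
// `minRepeats` further occurrences of themselves (gaps are allowed).
function findRepeated(str, minRepeats = 1) {
  if (!Number.isInteger(minRepeats) || minRepeats < 1) {
    throw new RangeError('minRepeats must be a positive integer');
  }
  // Builds e.g. /(.+)(?=(?:.*?\1){2,})/g for minRepeats = 2.
  const pattern = new RegExp('(.+)(?=(?:.*?\\1){' + minRepeats + ',})', 'g');
  return str.match(pattern) || []; // match() returns null when nothing is found
}

console.log(findRepeated('abc123ab12'));                // ['ab', '12']
console.log(findRepeated('abc123ab12ab', 2));           // ['ab']
console.log(findRepeated('abcd1234ab12cd34bcd234', 2)); // ['b', 'cd', '2', '34']
```

Patterns like this backtrack heavily, so they are probably best kept to short inputs.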