Why doesn't this function work when I use this regex?

I have been practicing to get more comfortable with regular expressions, but I can't work out why this function I wrote is not working. I wrote a simple function to count the number of repeated letters in a word. It works for some inputs but not for others.

function duplicates(str){
    try{
        return str.match(/(.)\1+/ig).length;
    }catch(e){
        return 0;
    }
}

      

From what I've researched, this statement is supposed to scan the string, find any letter that repeats, ignoring case, and return the number of matches. If there are no matches, it returns 0. It works correctly for some strings, but not all. This is what I get:

duplicates("abcdef") -> 0      #should return 0
duplicates("Aabccdef") -> 2    #should return 2
duplicates("Mississippi") -> 3 #should return 3
duplicates("Indivisible") -> 0 #should return 1
duplicates("abcabcabc") -> 0   #should return 3

      

Upon further checking, when I ran "Mississippi" I got the expected number 3. However, when I replaced .length with .toString() to see what the expression was actually counting, I got:

ss,ss,pp

      

So the "i" was not counted, and it should be. It would seem that the "i" was not counted in "Indivisible" either, and no repeated letters were found in "abcabcabc". It looks like it can't count non-consecutive repetitions, but I can't figure out why. I'm sure this is my misunderstanding of how regular expressions work, as I'm new to them, but if someone could clarify why this is happening, that would be awesome!

Edit: Is there a way to do this with RegEx, or do I need to use a loop?

+3




4 answers


You are close, but because the backreference \1 has to match immediately after the character that was just captured, your pattern only finds repeats that sit next to each other. It cannot count repeated characters that are not neighbors.

The idea is to use a positive lookahead to find every character that occurs again later in the string, and then remove the duplicates from those matches so that only the unique repeated characters are left to count. Regular expression:

(.)(?=.*\1)
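
To see why the deduplication step is needed, it can help to look at the raw matches first. The sketch below uses a made-up helper name, rawLookaheadMatches, and simply returns every character that has another copy somewhere later in the string. A letter that occurs three times therefore shows up twice. Note that the dot matches any character except a line break, so spaces and digits would be counted as well.

```javascript
// Hypothetical helper, for illustration only: list every character that
// appears again somewhere later in the lowercased string.
function rawLookaheadMatches(str) {
    // (.) captures one character; (?=.*\1) requires that the same
    // character occurs later, without consuming anything.
    return str.toLowerCase().match(/(.)(?=.*\1)/g) || [];
}

console.log(rawLookaheadMatches("Mississippi")); // [ 'i', 's', 's', 'i', 's', 'i', 'p' ]
console.log(rawLookaheadMatches("Indivisible")); // [ 'i', 'i', 'i' ]
console.log(rawLookaheadMatches("abcdef"));      // []
```

Removing the duplicates from the first result leaves i, s and p, which gives the expected count of 3.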

      

ES6:



function duplicates($str) {
    return [...new Set($str.toLowerCase().match(/(.)(?=.*\1)/g))].length;
}

console.log(duplicates("abcdef"));
console.log(duplicates("Aabccdef"));
console.log(duplicates("Mississippi"));
console.log(duplicates("Indivisible"));
console.log(duplicates("abcabcabc"));
      



ES5:



function _unique(value, index, self) { 
    return self.indexOf(value) === index;
}

function duplicates($str) {
    return ($str.toLowerCase().match(/(.)(?=.*\1)/g) || Array()).filter(_unique).length;
}

console.log(duplicates("abcdef"));
console.log(duplicates("Aabccdef"));
console.log(duplicates("Mississippi"));
console.log(duplicates("Indivisible"));
console.log(duplicates("abcabcabc"));
      



0




In terms of the actual regex you posted, there are a few issues with it. The reason (.)\1+ doesn't work is that the backreference \1 must immediately follow the character captured by the group (.). This means that in the case of "Mississippi", since there are no consecutive "i" characters, your pattern does not match them.
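
A quick way to confirm this is to print what the original pattern matches for each test word. This is only a sketch, and adjacentRepeats is a name made up for the example.

```javascript
// Hypothetical helper: return the matches of the original pattern,
// or an empty array when nothing matches.
function adjacentRepeats(str) {
    // \1 must match directly after the captured character, so only
    // runs of the same character side by side are found.
    return str.match(/(.)\1+/ig) || [];
}

console.log(adjacentRepeats("Aabccdef"));    // [ 'Aa', 'cc' ]
console.log(adjacentRepeats("Mississippi")); // [ 'ss', 'ss', 'pp' ]
console.log(adjacentRepeats("Indivisible")); // []
console.log(adjacentRepeats("abcabcabc"));   // []
```

The result for "Mississippi" has length 3 only by coincidence: it is two "ss" runs and one "pp" run, not the three repeated letters i, s and p.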


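As an alternative, you can keep it simple. A more straightforward solution for your use case would be to loop through the string and count each letter. Note that the function below returns the per-letter counts as an object rather than the number of repeated letters.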




function duplicates(str){
    try{
        let letters = str.toLowerCase().split('');
        let countedLetters = {}
        for(let i = 0; i < letters.length; i++) {
            countedLetters[letters[i]] = countedLetters[letters[i]] + 1 || 1;
        }
        return countedLetters;
    } catch(e) {
        return 0;
    }
}

console.log(duplicates('Mississippi'));
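
If you want the single number the question asks for, one possible follow-up is to count how many letters were seen more than once. This is a minimal sketch of that idea; countRepeatedLetters is a hypothetical name, and like the code above it treats every character as a "letter", including spaces and digits.

```javascript
// Sketch: tally each character, then count the characters seen more than once.
// countRepeatedLetters is a made-up name for this example.
function countRepeatedLetters(str) {
    const counts = {};
    for (const ch of str.toLowerCase()) {
        counts[ch] = (counts[ch] || 0) + 1;
    }
    // Keep only the tallies above 1 and return how many there are.
    return Object.values(counts).filter(n => n > 1).length;
}

console.log(countRepeatedLetters("abcdef"));      // 0
console.log(countRepeatedLetters("Aabccdef"));    // 2
console.log(countRepeatedLetters("Mississippi")); // 3
console.log(countRepeatedLetters("Indivisible")); // 1
console.log(countRepeatedLetters("abcabcabc"));   // 3
```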
      



+2




Your regex matches any character in your word (call it X) and then uses a backreference to check whether the next character is the same one. You can use this visualizer to understand what it does. In short, it only counts characters that are immediately followed by a copy of themselves.

Link to your regex: https://regexper.com/#%2F(.)%5C1%2B%2Fig

Website link: https://regexper.com/

0




I haven't looked into the duplicates in "Indivisible".

This is not a complete answer, because the string "abcabcabc" can be split into repeated subpatterns in more than one way, such as the "abcabc" that starts at index 0 and the "abcabc" that starts at index 3. Still, I hope it is helpful:

'olololo'.match(/(.+)(?=(\1))/ig)
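
For reference, here is a sketch of what this pattern returns. It finds chunks of text that are immediately followed by a copy of themselves, so it detects repeated substrings rather than individual repeated letters. The wrapper name repeatedChunks is made up for the example.

```javascript
// Hypothetical wrapper around the pattern above.
// (.+) captures one or more characters; (?=(\1)) requires the same
// text to follow immediately, without consuming it.
function repeatedChunks(str) {
    return str.match(/(.+)(?=(\1))/ig) || [];
}

console.log(repeatedChunks('olololo'));     // [ 'ol', 'ol' ]
console.log(repeatedChunks('abcabcabc'));   // [ 'abc', 'abc' ]
console.log(repeatedChunks('Indivisible')); // []
```

As the last line shows, the non-adjacent "i" characters in "Indivisible" are still not found.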

      

-1








