Regex performance: `(a|e|i|o|u)?` versus `([aeiou])?`

Both (a|e|i|o|u)? and [aeiou]? have the same effect. I wonder whether there is a significant difference in performance between them.

+3




2 answers


In the example you give, using a character class has a significant advantage over alternation.

For example:

string 'abcde'
regex1 /(a|e|i|o|u)cde/
regex2 /[aeiou]cde/

      

Applying either regex1 or regex2 to that string will fail, but what's going on under the hood?



Regex1 takes the first character of the string and checks it against the regex: 'a' is tested against (a|e|i|o|u), which matches on the first alternative. The regex engine also records that there are four more alternatives it can try if the match fails later. It then takes the second character of the string, 'b', and tests it against the next token of the regex, 'c'. This fails, but the engine still has four saved states it can use to try to make a match, so it backtracks one step and tries the first character of the string against e, i, o and u in turn before concluding that no match is possible from that position.

Regex2, on the other hand, only checks that the first character of the string, 'a', is one of the characters in the [aeiou] class. No additional states are created, so when the second character does not match, it fails straight away, much faster than regex1.
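To make the difference concrete, here is a minimal sketch in Python that counts comparisons for the two approaches described above. It is a toy model, not how any real regex engine is implemented: the function names `try_alternation` and `try_class` are made up for illustration, and the model only tries the match at the start of the string. Real engines may optimise this case away, so the counts illustrate the description above rather than any particular implementation.

```python
def try_alternation(text, alternatives, rest):
    """Toy model of /(a|e|i|o|u)cde/ anchored at position 0.

    Each alternative is a state the engine can fall back to.
    Returns (matched, number_of_comparisons).
    """
    comparisons = 0
    for alt in alternatives:
        comparisons += 1                  # compare first char with this alternative
        if text[:1] != alt:
            continue
        matched_tail = True
        for i, ch in enumerate(rest, start=1):
            comparisons += 1              # compare the literal tail, char by char
            if text[i:i + 1] != ch:
                matched_tail = False      # tail failed: backtrack to next alternative
                break
        if matched_tail:
            return True, comparisons
    return False, comparisons


def try_class(text, char_class, rest):
    """Toy model of /[aeiou]cde/ anchored at position 0.

    The class is a single membership test, so there is nothing to backtrack into.
    """
    comparisons = 1                       # one set-membership test
    if text[:1] not in char_class:
        return False, comparisons
    for i, ch in enumerate(rest, start=1):
        comparisons += 1
        if text[i:i + 1] != ch:
            return False, comparisons     # no saved states: fail immediately
    return True, comparisons


print(try_alternation("abcde", "aeiou", "cde"))   # (False, 6)
print(try_class("abcde", set("aeiou"), "cde"))    # (False, 2)
```

In this model the alternation needs six comparisons to reject 'abcde' (one for 'a', one for the failed 'c', then four more for e, i, o, u), while the class needs two.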

There is a lot more to how regex internals work, since there are two types of engine (deterministic and non-deterministic). If you're interested in reading more, regular-expressions.info has a very good, detailed description of what is going on.

+4




The two will match the same thing; the difference is that [aeiou] is a character class. Instead of writing (1|2|3|4|5), you could just write [1-5] and it will be interpreted the same way. With alternation, you have to spell out every item you want to match each time.
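As a quick check of that equivalence, here is a small sketch using Python's `re` module (the choice of Python and the sample inputs are my own assumptions, not part of the question). It also shows the one visible difference: the parenthesised form captures what it matched, while the bare class does not.

```python
import re

alternation = re.compile(r"(1|2|3|4|5)")
char_class = re.compile(r"[1-5]")

# Both patterns accept and reject exactly the same single-character inputs.
samples = ["0", "1", "3", "5", "6", "a", ""]
for s in samples:
    alt_ok = bool(alternation.fullmatch(s))
    cls_ok = bool(char_class.fullmatch(s))
    print(repr(s), alt_ok, cls_ok)

# The parentheses create a capturing group; the class alone does not.
print(alternation.fullmatch("3").groups())  # ('3',)
print(char_class.fullmatch("3").groups())   # ()
```

If the capture is not needed, the class is both shorter to write and easier to read.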



You can read more here http://www.regular-expressions.info/charclass.html

0

