Word-boundary problem (\b)

I have an array of keywords and I want to know if at least one of the keywords was found in some string that was posted. I also want to be absolutely sure that it is the keyword that was matched and not something that looks a lot like the word.

Say, for example, that our keywords are [English, Eng, En], because we are looking for some variation of the English language.

Now say that the user's input is "i h8 eng class" or something equally provocative and illiterate; then it should match eng. It also must not match a word like england or some odd thing like chen, even though it contains the bit en.


So, in my endless lack of wisdom, I figured I could match one of my array elements against the input by building a regex from the array with the RegExp() constructor.

The idea was that the regex would search for matches from the array, which is now represented as (English|Eng|En), and then check that there are zero-width word boundaries (\b) on both sides.
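The snippet below is only a sketch of the kind of attempt described above, assuming the array is joined with | and wrapped in \b inside an ordinary string. The names keywords and attempt are hypothetical and not taken from the original post.

```javascript
// Hypothetical reconstruction: names are illustrative, not the asker's own.
const keywords = ["English", "Eng", "En"];

// Inside a string literal, "\b" is a backspace character (U+0008),
// so this pattern contains no word-boundary assertion at all.
const attempt = new RegExp("\b(" + keywords.join("|") + ")\b", "i");

console.log(attempt.test("i h8 eng class")); // false
```

Under that assumption the pattern would only match a keyword surrounded by literal backspace characters, which would explain why nothing is found.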



4 answers

You need a double backslash.

When you create a regex with the RegExp() constructor, you are passing in a string. String literal syntax also treats the backslash as a metacharacter, for quoting quotes and so on. So the backslash is effectively consumed (and "\b" in particular turns into a backspace character) before the RegExp() code even runs.

When you double the backslashes, the string-parsing step leaves a single backslash behind. The RegExp() parser then sees a single backslash before the "b" and does the right thing.
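A minimal sketch of the doubled-backslash fix, assuming the keywords live in an array called keywords (a hypothetical name) and contain no regex metacharacters. The test strings are the ones from the question.

```javascript
// Hypothetical names; keywords are assumed to be plain words.
const keywords = ["English", "Eng", "En"];

// "\\b" in the string literal becomes \b in the pattern text.
const pattern = new RegExp("\\b(" + keywords.join("|") + ")\\b", "i");

console.log(pattern.source);                 // \b(English|Eng|En)\b
console.log(pattern.test("i h8 eng class")); // true
console.log(pattern.test("england"));        // false
console.log(pattern.test("chen"));           // false
```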



You need to double the backslashes in your JavaScript string; otherwise you are encoding a backspace character.
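To see the difference directly, one can compare the two string literals. This is just an illustrative check; the variable names are made up for the example.

```javascript
const single = "\b";   // one character: backspace (U+0008)
const doubled = "\\b"; // two characters: a backslash, then "b"

console.log(single.length, single === "\u0008"); // 1 true
console.log(doubled.length);                     // 2
```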





You need to escape \b twice, because it has a special meaning in strings.






\b is an escape sequence within string literals (see table 2.1 on this page). You should escape it by adding one extra backslash.



You don't need to escape \b when using it inside a regex literal.
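As a rough comparison, the same pattern can be written both ways. The sketch assumes the fixed keyword list from the question; the variable names are hypothetical.

```javascript
// Regex literal: \b is written with a single backslash.
const literal = /\b(English|Eng|En)\b/i;

// RegExp() constructor: the string needs "\\b" to produce the same pattern.
const fromString = new RegExp("\\b(English|Eng|En)\\b", "i");

console.log(literal.source === fromString.source); // true
console.log(literal.test("i h8 eng class"));       // true
```

A literal only works when the keywords are known in advance; if the pattern is assembled from an array at runtime, the constructor with doubled backslashes is still needed.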




