Word-boundary problem (\b)
I have an array of keywords, and I want to know whether at least one of them occurs in a string that was posted. I also want to be absolutely sure that the keyword itself was matched, and not something that merely looks a lot like it.
Say, for example, that our keywords are [English, Eng, En], because we are looking for some variation of the English language.
Now say the user inputs "i h8 eng class", or something equally provocative and illiterate; then it should match "eng". It must also not match a word like "england", or some odd thing like "chen", even though it contains the bit "en".
So, in my endless lack of wisdom, I figured I could do something along these lines to match one of my array elements against the input:
.match(RegExp('\b('+array.join('|')+')\b','i'))
The idea is that the regex searches for any of the array elements, now represented as (English|Eng|En), and then checks that there is a zero-width word boundary on each side.
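For reference, here is a minimal sketch of what the join step alone produces, using the keyword array from the example above. The variable names are only illustrative.

```javascript
// Keyword array from the example above.
const array = ['English', 'Eng', 'En'];

// Joining with '|' and wrapping in parentheses gives the alternation group.
const group = '(' + array.join('|') + ')';

console.log(group); // (English|Eng|En)
```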
You need a double backslash.
When you create a regex with the RegExp() constructor, you are passing in a string. String literal syntax also treats the backslash as a metacharacter, for quoting quotes and so on. So the backslash is consumed before the RegExp() code even runs: '\b' in a string literal becomes a single backspace character.
When you double the backslashes, the string-parsing step leaves a single backslash behind. The RegExp() parser then sees a single backslash before the "b" and does the right thing.
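The two parsing steps can be checked directly. The following is a small sketch, assuming a single hard-coded keyword ("eng") instead of the joined array; the variable names are made up for illustration.

```javascript
const single = '\b';   // one character: backspace (U+0008)
const doubled = '\\b'; // two characters: a backslash and the letter b

console.log(single.length, single.charCodeAt(0)); // 1 8
console.log(doubled.length, doubled);              // 2 \b

// Un-doubled: the pattern is backspace, "eng", backspace.
const broken = new RegExp('\b(eng)\b', 'i');
// Doubled: the pattern is word boundary, "eng", word boundary.
const fixed = new RegExp('\\b(eng)\\b', 'i');

const input = 'i h8 eng class';
console.log(broken.test(input)); // false
console.log(fixed.test(input));  // true

// The broken pattern only matches when literal backspaces are present.
console.log(broken.test('\u0008eng\u0008')); // true
```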
You need to double the backslashes in your JavaScript string; otherwise you encode a backspace character:
.match(RegExp('\\b('+array.join('|')+')\\b','i'))
You need to escape \b twice, because it has a special meaning in strings:
.match(RegExp('\\b('+array.join('|')+')\\b','i'))
\b is an escape sequence within string literals (see table 2.1 on this page). You should escape it by adding one extra backslash:
.match(RegExp('\\b('+array.join('|')+')\\b','i'))
You don't need to escape \b when using it inside a regex literal:
/\b(english|eng|en)\b/i
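Putting this together with the keyword array from the question, a minimal sketch might look like the following. The helper name findKeyword is hypothetical, and the step that escapes regex metacharacters in the keywords is an addition that the question did not ask for. The sketch also assumes that keywords start and end with word characters (letters, digits, or underscore), since \b only marks boundaries next to those.

```javascript
// Hypothetical helper: returns the matched keyword text, or null if none matched.
function findKeyword(keywords, input) {
  // Assumption: keywords may contain regex metacharacters, so escape them
  // to make sure each keyword is matched literally.
  const escaped = keywords.map((k) => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));

  // Doubled backslashes, because the pattern is built from a string.
  const pattern = new RegExp('\\b(' + escaped.join('|') + ')\\b', 'i');

  const match = input.match(pattern);
  return match ? match[1] : null;
}

const keywords = ['English', 'Eng', 'En'];

console.log(findKeyword(keywords, 'i h8 eng class')); // eng
console.log(findKeyword(keywords, 'england'));        // null
console.log(findKeyword(keywords, 'chen'));           // null
```

Because the word boundary is required on both sides, the order of the alternatives does not change whether "england" or "chen" is rejected.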