Preg_replace keywords OUTSIDE of <strong> tags

I cannot tell you how many hours I have spent on this. I just want to IGNORE any keyword instances that are BETWEEN strong tags, whether they are directly next to the tags or somewhere in between with other words, all while keeping the keywords case-insensitive.

Example:

The man drove in his car. Then <strong>the man walked to the boat.</strong> 

      

The word boat should be ignored and Car should be replaced.

$keywords = array(
'boat',
'car',
);

$p = implode('|', array_map('preg_quote', $keywords));

$string = preg_replace("/\b($p)\b/i", 'gokart', $string, 4);

      



2 answers


You can use a SKIP-FAIL regex to replace only the matches that lie outside a pair of non-identical delimiters (here, the opening and closing strong tags):

<strong>.*?<\/strong>(*SKIP)(*FAIL)|\b(boat|car)\b

      




Here is the IDEONE demo:

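$str = "The man drove in his car. Then <strong>the man walked to the boat.</strong>";
$keywords = array('boat', 'car');
// Escape each keyword, including the "#" delimiter used in the pattern below.
$p = implode('|', array_map(function ($k) { return preg_quote($k, '#'); }, $keywords));
// Text between <strong> and </strong> is matched and then discarded by (*SKIP)(*FAIL);
// keywords anywhere else are replaced.
$result = preg_replace("#<strong>.*?<\/strong>(*SKIP)(*FAIL)|\b($p)\b#i", "gokart", $str);
echo $result; // The man drove in his gokart. Then <strong>the man walked to the boat.</strong>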

      

NOTE that in this case there is most likely no need for a tempered greedy token inside the SKIP-FAIL block (which I posted originally, see the edit history), since we don't care what is in between the delimiters.
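
As a rough sketch of how the same idea could be wrapped for reuse: the function name replaceOutsideStrong, the optional $limit parameter, and the sample input are my own assumptions, not part of the answer above. Compared with the pattern shown earlier, this variant also adds the s modifier so that "." can cross line breaks, and [^>]* so that an opening tag with attributes is still recognised. It is still a regex and will not cope with nested or malformed markup.

```php
<?php
// Hypothetical helper (not from the original answer): replaces keywords
// everywhere except inside <strong>...</strong>.
function replaceOutsideStrong(string $text, array $keywords, string $replacement, int $limit = -1): string
{
    // Escape regex metacharacters in each keyword, including the "#" delimiter.
    $alternation = implode('|', array_map(function ($keyword) {
        return preg_quote($keyword, '#');
    }, $keywords));

    // Left branch: consume a whole <strong> element, then (*SKIP)(*FAIL) throws the
    // match away and prevents the engine from retrying inside it.
    // Right branch: a keyword as a whole word, matched case-insensitively.
    $pattern = '#<strong\b[^>]*>.*?</strong>(*SKIP)(*FAIL)|\b(?:' . $alternation . ')\b#is';

    // preg_replace returns null on error; fall back to the untouched input then.
    return preg_replace($pattern, $replacement, $text, $limit) ?? $text;
}

$input = "The man drove in his Car.\nThen <strong class=\"x\">the man walked\nto the boat.</strong> The boat left.";
echo replaceOutsideStrong($input, array('boat', 'car'), 'gokart');
```

With this input, "Car" and the last "boat" are replaced, while the "boat" inside the strong element is left alone even though the element spans two lines.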



Regex is probably not the best way to do something like this.

It's probably best to use a DOM parser or something similar to find the <strong> tags correctly.



Several of the answers to this question offer some useful options: "RegEx: Matching text that is not inside and part of an HTML tag"
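
For illustration, here is a minimal sketch of what the DOM route might look like with PHP's bundled DOMDocument and DOMXPath classes. The function name replaceOutsideStrongDom and the wrapper div are assumptions of mine, and the sketch assumes a small, ASCII-only HTML fragment (DOMDocument::loadHTML needs extra care for other encodings). The XPath expression selects every text node that has no strong ancestor, so the regex only ever sees text outside the tags.

```php
<?php
// Hypothetical helper: replaces keywords only in text nodes that are not
// inside a <strong> element. Assumes an ASCII HTML fragment as input.
function replaceOutsideStrongDom(string $html, array $keywords, string $replacement): string
{
    // Build a case-insensitive whole-word pattern from the escaped keywords.
    $pattern = '/\b(?:' . implode('|', array_map(function ($keyword) {
        return preg_quote($keyword, '/');
    }, $keywords)) . ')\b/i';

    $doc = new DOMDocument();
    // Silence parser warnings for fragments; restore the previous setting afterwards.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so its children can be serialised again later.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select text nodes with no <strong> ancestor and replace keywords in them.
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//text()[not(ancestor::strong)]') as $textNode) {
        $textNode->data = preg_replace($pattern, $replacement, $textNode->data);
    }

    // Serialise only the children of the wrapper <div>, not the implied <html>/<body>.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$str = "The man drove in his car. Then <strong>the man walked to the boat.</strong>";
echo replaceOutsideStrongDom($str, array('boat', 'car'), 'gokart');
```

The trade-off is more code and a dependency on the DOM extension, in exchange for not having to describe HTML structure in the regex itself.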
