preg_replace not working on French characters - PHP

I've been looking for a while, so hopefully this isn't a question that's been asked many times already.

I am trying to write a PHP script that removes stop words from a string and then explodes the result into an array of words. Stop words can be in English or French.

Currently, the following doesn't work for me, since it doesn't remove French characters:

$needles=array(
'/\bil\b/i', 
'/\bla\b/i', 
'/\ble\b/i', 
'/\b'. htmlentities('à') .'\b/i'
);
print_r($needles);

$result=preg_replace($needles, "", htmlentities("il y à trois personne dans la salle à manger"));
print_r($result);

      

The output has every stop word removed except the French character: à



1 answer


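As noted in the comments, htmlentities('à') will give you [3] => /\b&agrave;\b/i. That will not match your letter à: \b needs a word character on one side, and neither a space nor & is one, so the pattern fails even against an entity-encoded subject.

Use the literal à with the u flag instead, to enable Unicode in the pattern: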

'/\bà\b/iu'

      

See the demo on IDEONE:

<?php
$needles=array(
'/\bil\b/i', 
'/\bla\b/i', 
'/\ble\b/i', 
'/\bà\b/iu'
);
print_r($needles);

$result=preg_replace($needles, "", "il y à trois personne dans la salle à manger");
print_r($result);

      

Output:

y  trois personne dans  salle  manger
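
To get from that string to the array of words the question asks for, something like the following might work. It is only a sketch: the helper name remove_stop_words is made up for illustration, and it assumes the source file and the input string are both UTF-8. It builds one alternation from the stop-word list instead of one pattern per word, then splits what is left on whitespace.

```php
<?php
// Hypothetical helper (not from the original post): removes stop words
// from $text and returns the remaining words as an array.
// Assumes $text and $stopWords are UTF-8 encoded.
function remove_stop_words(string $text, array $stopWords): array
{
    // Escape each stop word so it is safe to embed in a pattern.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $stopWords);

    // One alternation for all stop words.
    // "i" = case-insensitive, "u" = treat pattern and subject as UTF-8.
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/iu';

    $stripped = preg_replace($pattern, '', $text);
    if ($stripped === null) {
        // preg_replace returns null on error, e.g. invalid UTF-8 input.
        return [];
    }

    // Split on runs of whitespace and drop the empty pieces
    // left behind where stop words were removed.
    return preg_split('/\s+/u', $stripped, -1, PREG_SPLIT_NO_EMPTY);
}

$words = remove_stop_words(
    "il y à trois personne dans la salle à manger",
    ['il', 'la', 'le', 'à']
);
print_r($words);
```

With the sample sentence, $words should contain y, trois, personne, dans, salle and manger. Using preg_split with PREG_SPLIT_NO_EMPTY rather than explode(' ', ...) avoids the empty elements that the double spaces in the output above would otherwise produce.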

      
