Problem with a simple PHP profanity filter

I am writing a simple profanity filter in PHP. Can anyone tell me why, in the following code, the filter works (it prints [explicit]) for the $vowels array but not for the $array array that I build from a text file?

 function clean($str){

     $handle = fopen("badwords.txt", "r");
     if ($handle) {
         while (!feof($handle)) {
             $array[] = fgets($handle, 4096);
         }
         fclose($handle);
     }

     $vowels = array("a", "e", "i", "o", "u", "A", "E", "I", "O", "U");

     $filter = "[explicit]";
     $clean = str_replace($array, $filter, $str);
     return $clean;
 }

      

When I use $vowels instead of $array, it works, except that for lowercase vowels it returns:

 [[expl[explicit]c[explicit]t]xpl[explicit]c[explicit]t]

 instead of 

 [explicit]

      

Not sure why this is happening.

Any ideas?

Thanks!



4 answers


I modified Davethegr8's solution to get the following working example:



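 function clean($str){

     global $clean_words; // cache the compiled patterns across calls

     $replacement = '[explicit]';

     if(empty($clean_words)){
         $badwords = explode("\n", file_get_contents('badwords.txt'));

         $clean_words = array();

         foreach($badwords as $word) {
             $word = trim($word);
             // skip blank lines: an empty word would match at every word boundary
             if($word === '') continue;
             $clean_words[] = '/(\b' . preg_quote($word, '/') . '\b)/si';
         }
     }

     $out = preg_replace($clean_words, $replacement, $str);
     return $out;
 }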

      



Make sure you read:

Coding Horror: Obscenity Filters: Bad Idea, or Incredibly Intercoursing Bad Idea?

before you decide to continue down the string-replacement path...



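Because the replacement text, [explicit], itself contains lowercase vowels (e and i), which are among the characters you are filtering. str_replace works through the search array one entry at a time, so each later vowel is also replaced inside the text inserted by earlier replacements. In effect, you are creating a feedback loop.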
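
To make the feedback loop concrete, here is a minimal sketch that reproduces the output from the question and contrasts it with strtr, which does not rescan text it has already replaced. The input "a" and the variable names $looped, $pairs and $singlePass are my own assumptions for illustration, not part of the original code.

```php
<?php
$vowels = array("a", "e", "i", "o", "u", "A", "E", "I", "O", "U");
$filter = "[explicit]";

// str_replace processes the search entries in order, so "e" and "i"
// are later replaced inside the "[explicit]" that replaced "a".
$looped = str_replace($vowels, $filter, "a");

// strtr with an array of pairs works in a single pass and leaves
// already-inserted replacement text alone.
$pairs = array_fill_keys($vowels, $filter);
$singlePass = strtr("a", $pairs);

echo $looped, "\n";     // [[expl[explicit]c[explicit]t]xpl[explicit]c[explicit]t]
echo $singlePass, "\n"; // [explicit]
```

With a real word list this matters less, because the replacement text is unlikely to contain a listed word, but it explains the nested brackets seen with the vowel test.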



First, file_get_contents is a much simpler way to read a file into a variable.

$badwords = explode("\n", file_get_contents('badwords.txt'));
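
As a side note on the reading step: fgets returns each line with its trailing newline still attached, which is probably why the $array version in the question never matched anything. The sketch below assumes a word list with one entry per line and uses a temporary file with made-up contents so that it runs on its own. It compares fgets with the built-in file() function and its FILE_IGNORE_NEW_LINES and FILE_SKIP_EMPTY_LINES flags.

```php
<?php
// Hypothetical word list, written to a temp file to keep the sketch self-contained.
$path = tempnam(sys_get_temp_dir(), 'badwords');
file_put_contents($path, "darn\nheck\n");

// fgets() keeps the line ending, so the entries are "darn\n" and "heck\n".
$withNewlines = array();
$handle = fopen($path, 'r');
while (($line = fgets($handle)) !== false) {
    $withNewlines[] = $line;
}
fclose($handle);

// file() with these flags strips the line endings and skips empty lines.
$words = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

$input = "well darn it";
$unchanged = str_replace($withNewlines, '[explicit]', $input); // no match: "darn\n" is not in the input
$filtered  = str_replace($words, '[explicit]', $input);        // matches "darn"

echo $unchanged, "\n"; // well darn it
echo $filtered, "\n";  // well [explicit] it
```

Calling trim() on each line read with fgets would have the same effect.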

      

Second, preg_replace offers much more flexible string replacement options. - http://us3.php.net/preg_replace

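$patterns = array();
foreach($badwords as $word) {
    $word = trim($word);
    if ($word === '') continue; // a blank line would give '//' and match everywhere
    $patterns[] = '/' . preg_quote($word, '/') . '/';
}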

$replacement = '[explicit]';

$output = preg_replace($patterns, $replacement, $input);
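
One thing to watch with patterns of the form '/word/': they are case-sensitive and also match inside longer words. The following sketch compares such a pattern with a whole-word, case-insensitive variant built from \b and the i modifier. The sample sentence and the variable names are assumptions chosen for illustration.

```php
<?php
$input = "Heck, the heckler said.";

// Plain pattern: misses "Heck" (different case) and hits "heckler".
$plain = preg_replace('/heck/', '[explicit]', $input);

// Whole-word, case-insensitive pattern: hits "Heck", leaves "heckler" alone.
$bounded = preg_replace('/\bheck\b/i', '[explicit]', $input);

echo $plain, "\n";   // Heck, the [explicit]ler said.
echo $bounded, "\n"; // [explicit], the heckler said.
```

Word boundaries reduce false positives on innocent words but do not catch deliberate misspellings, which is part of the point made in the Coding Horror article.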

      
