A quick way to match an array of words against a block of text?

The subject is probably not as clear as it could be, but I struggled to think of the best way to describe it easily.

I am using a bad-word filter on some of the articles we get from an XML feed. At the moment I have the bad words in an array and just check the text like so:

str_replace($badwords, '', $text, $count); 
if ($count > 0) // We have bad words... 

      

But this is SLOW! So slow! When I try to process 30,000 articles at a time, I start to wonder whether there is a better way to achieve this. If only strpos supported arrays! Even then, I don't think it would be faster...

I would welcome any suggestions. Thanks in advance!

EDIT:

I have now timed several methods by wrapping each one in microtime() calls:

str_replace() = 990 seconds
preg_match() = 1029 seconds (remember, I only need to identify the bad words, not replace them)
no bad word filtering = 1057 seconds (presumably because another thousand or so articles, which the filter would have rejected, then have to be processed)
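For anyone wanting to repeat this kind of measurement, here is a minimal sketch of a timing harness built on microtime(true). The helper name timeFilter, the sample word list, and the sample articles are all made up for illustration; they are not from the original setup, and the numbers it prints will say nothing about a real 30,000-article run.

```php
<?php
// Hypothetical helper: run a filter over every article and return the
// elapsed wall-clock time in seconds.
function timeFilter(callable $filter, array $articles): float
{
    $start = microtime(true);           // current Unix time as a float
    foreach ($articles as $text) {
        $filter($text);
    }
    return microtime(true) - $start;
}

// Sample data only; stands in for the real feed and word list.
$badwords = ['darn', 'heck'];
$articles = ['A perfectly clean article.', 'Well, heck, this one is not.'];

// The approach from the question: replace, then look at the count.
$viaStrReplace = function (string $text) use ($badwords): bool {
    str_replace($badwords, '', $text, $count);
    return $count > 0;                  // true when a bad word was found
};

$elapsed = timeFilter($viaStrReplace, $articles);
printf("str_replace: %.6f seconds\n", $elapsed);
```

Passing a different closure to the same helper (one wrapping preg_match, say) keeps the comparison like for like.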

Thanks for all the answers. I will just continue with str_replace. :)



4 answers


How about combining all the words into a single regex so that you replace them all in one go? I'm not sure how it will perform, but it might be faster.

E.g.



preg_replace('/(' . implode('|', $badwords) . ')/i', '', $text);
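One caveat with the line above: if any word contains a regex metacharacter (a dot, a slash, a parenthesis), the pattern breaks or matches the wrong thing. A minimal sketch of a safer variant follows, assuming detection is all that is needed, so it uses preg_match rather than preg_replace. The function names buildBadWordPattern and containsBadWord and the sample words are hypothetical.

```php
<?php
// Hypothetical helper: build one case-insensitive alternation from the list.
// Note: an empty list would yield a pattern that matches every text.
function buildBadWordPattern(array $badwords): string
{
    $escaped = array_map(
        function (string $word): string {
            // Escape regex metacharacters and the "/" delimiter.
            return preg_quote($word, '/');
        },
        $badwords
    );
    return '/(?:' . implode('|', $escaped) . ')/i';
}

// Hypothetical helper: true when the text contains any listed word.
function containsBadWord(string $pattern, string $text): bool
{
    return preg_match($pattern, $text) === 1;   // stops at the first match
}

// Sample data only.
$badwords = ['darn', 'heck', 'a.b'];
$pattern  = buildBadWordPattern($badwords);     // build once, reuse per article

var_dump(containsBadWord($pattern, 'Well, HECK.'));
var_dump(containsBadWord($pattern, 'A clean article.'));
```

Like str_replace, this matches substrings, so "heck" is also found inside "heckle". Whether it beats str_replace on a given word list is something to measure rather than assume.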

      



I worked in a local office. Instead of modifying the text to remove bad words from the source files, I just ran the filter when a user asked to view an article. This way you keep the source text if you ever need it, but you can still serve a clean version to your viewers. There should be no need to process 30,000 articles at once, unless I am misunderstanding something.
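A rough sketch of that filter-on-view idea, assuming the stored text is kept untouched and the filter runs only when an article is rendered. The function name renderArticle and the sample data are hypothetical, not part of any existing code.

```php
<?php
// Hypothetical view-time filter: returns a cleaned copy of the article and
// leaves the stored text as it was.
function renderArticle(string $storedText, array $badwords): string
{
    return str_replace($badwords, '', $storedText);
}

// Sample data only.
$badwords = ['darn', 'heck'];
$stored   = 'Well, heck, that was close.';

$clean = renderArticle($stored, $badwords);
echo $clean, "\n";
```

The cost is then paid per page view instead of per import, which only helps if most articles are never viewed or if the rendered output is cached.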





Define "slow"? Anything that processes 30,000 articles is probably going to take a little time.

However, one option (which I haven't tested, I'm just throwing it out there for consideration) would be to combine the words into a single regex and run it through preg_replace, using the | operator to join them.
