How do I remove certain tags and certain attributes from a string?

Here's the deal: I'm working on a project to help teach HTML to people. Naturally, I'm worried about Scumbag Steve (see Figure 1).

So I want to block ALL HTML tags except those approved in a very strict whitelist.

From those approved HTML tags, I also want to remove harmful attributes, such as onload and onmouseover, again according to a whitelist.

I was thinking about using regex, but I'm pretty sure it's evil and not well suited to this job.

Can anyone give me a nudge in the right direction?

Thanks in advance.


Figure 1: Scumbag Steve

+3




3 answers


require_once 'library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// This setting is needed because otherwise elements
// considered harmful, such as input, are removed automatically.
$config->set('HTML.Trusted', true);

// Only input, p and div elements will be accepted.
$config->set('HTML.AllowedElements', 'input,p,div');

// Allowed attributes for each tag, written as tag.attribute.
$config->set('HTML.AllowedAttributes', 'input.type,input.name,p.id,div.style');

// A more extensive way to manage attributes and elements; see the docs:
// http://htmlpurifier.org/live/configdoc/plain.html
$def = $config->getHTMLDefinition(true);

$def->addAttribute('input', 'type', 'Enum#text');
$def->addAttribute('input', 'name', 'Text');

// Create the purifier with this configuration.
$purifier = new HTMLPurifier($config);

// Purify the untrusted input ($raw_html holds the raw HTML string).
$html = $purifier->purify($raw_html);

      



  • NOTE: since you asked for a whitelist, this code accepts only the input, p and div elements, and only the attributes listed for them.
+5




Use the Zend Framework 2 StripTags filter. The example below accepts ul, li, p, br, img (only with the src attribute) and links (only with the href attribute). Everything else will be stripped. If I am not mistaken, ZF1 does the same.



     $filter = new \Zend\Filter\StripTags(array(
        'allowTags'   => array(
            'ul'=>array(), 
            'li'=>array(), 
            'p'=>array(), 
            'br'=>array(), 
            'img'=>array('src'), 
            'a'=>array('href')
         ),
        'allowAttribs'  => array(),
        'allowComments' => false)
    );

    $value = $filter->filter($value);

      

+1




For tags, you can use strip_tags.

For attributes, refer to "How to remove attributes from an html tag?"
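
To make the two steps concrete, here is a minimal sketch that combines strip_tags with PHP's built-in DOMDocument. The function name sanitize_html and the shape of the $allowed array (tag name => list of permitted attribute names) are my own assumptions for illustration and are not taken from the linked question. It also assumes plain ASCII input; non-ASCII text would need extra encoding handling with loadHTML.

```php
<?php
// Hypothetical helper (not from the linked question): keeps only the
// whitelisted tags and, on those tags, only the whitelisted attributes.
// $allowed maps a lowercase tag name to the attribute names it may keep.
function sanitize_html(string $html, array $allowed): string
{
    // Step 1: remove every tag that is not in the whitelist.
    // strip_tags keeps the text inside removed tags and does NOT touch attributes.
    $html = strip_tags($html, '<' . implode('><', array_keys($allowed)) . '>');

    // Step 2: parse the remaining markup so attributes can be inspected safely.
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);

    foreach ($body->getElementsByTagName('*') as $element) {
        $keep = $allowed[$element->nodeName] ?? [];
        // Copy the attribute list first, because removing attributes
        // changes the live map while we iterate.
        foreach (iterator_to_array($element->attributes, false) as $attribute) {
            if (!in_array($attribute->nodeName, $keep, true)) {
                $element->removeAttributeNode($attribute);
            }
        }
    }

    // Serialize only the children of <body>, not the wrapper itself.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage: allow <p> without attributes and <a> with href only.
$dirty = '<p onclick="x()">Hi <script>alert(1)</script><a href="/x" onmouseover="y()">link</a></p>';
$clean = sanitize_html($dirty, ['p' => [], 'a' => ['href']]);
echo $clean, PHP_EOL;
```

In this example the script tags and the onclick and onmouseover attributes are removed, while href is kept; the text that was inside the script element stays behind as plain text. Note that this sketch only filters attribute names. It does not inspect values, so something like href="javascript:..." would still get through, which is one reason a dedicated library such as HTML Purifier is usually the safer choice.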

0

