Sanitizing Markdown against XSS

I am using Markdown to give the users of my forum script an easy way to write messages.
I'm trying to sanitize all user input, but I'm having a problem with Markdown input.

I need to store the Markdown text in the database, not the converted HTML, since users are allowed to edit their posts.

Basically I need something like what StackOverflow does.

I read this article about the Markdown XSS vulnerability, and the only solution I have found is to run HTML Purifier on everything my script outputs.

I think this might slow down my script: I output 20 posts per page and would have to run HTML Purifier on each one.

So I am trying to find a way to protect against XSS by sanitizing the input instead of the output.

I can't run HTML Purifier on the input because my text is Markdown, not HTML. And if I convert it to HTML, I cannot convert it back to Markdown.

I already remove (hopefully) all the HTML with:

htmlspecialchars(strip_tags($text));
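
A small check, offered as a sketch rather than a proof, of why stripping HTML alone may not be enough: Markdown's own link syntax contains no HTML, so it passes through both functions unchanged and only turns into an href once the Markdown parser runs. The payload string below is a made-up example.

```php
<?php
// Hypothetical payload: a Markdown inline link with no HTML tags
// and no characters that htmlspecialchars() escapes.
$text = '[click me](javascript:alert(1))';

// The input filter from the question.
$clean = htmlspecialchars(strip_tags($text));

// The filter has nothing to remove or escape, so the link survives as-is.
var_dump($clean === $text);
```

Whether the rendered link is actually dangerous then depends on how the Markdown parser in use treats the URL.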

      

I thought of another solution:

When the user tries to submit a new message: convert the Markdown input to HTML, run HTML Purifier on it, and if it finds an XSS injection, just return an error. But I don't know how to do this, and I don't know whether HTML Purifier allows it.
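
One way this idea could be sketched: render the Markdown, purify the result, and treat any difference between the two as a reason to reject the post. The function name markdownIsClean() and its two callable parameters are assumptions made for illustration; HTML Purifier has no built-in "report only" mode that this relies on. Note that a purifier may also normalize harmless markup, so a strict comparison like this can reject legitimate posts.

```php
<?php
/**
 * Hypothetical helper: returns true when purifying the rendered HTML
 * changes nothing, i.e. the purifier found nothing to strip.
 *
 * $render turns Markdown into HTML, $purify cleans HTML. Both are
 * injected so the check does not depend on a particular library.
 */
function markdownIsClean(string $markdown, callable $render, callable $purify): bool
{
    $html = $render($markdown);

    // Any difference means the purifier removed or rewrote something.
    return $purify($html) === $html;
}
```

With the libraries discussed here, the callables could be 'Markdown' and [$purifier, 'purify'], where $purifier is an already configured HTMLPurifier instance. The Markdown source is stored untouched when the check passes.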

I found many questions about the same problem, but all the solutions were to store the input as HTML. I need to store it as Markdown.

Anyone have any advice?



3 answers


  • Run Markdown on the input.
  • Run HTML Purifier on the HTML generated by Markdown. Configure it to allow links, href attributes, etc. (It should still strip javascript: URLs.)



// the nasty stuff :)
$content = "> hello <a name=\"n\" \n href=\"javascript:alert('xss')\">*you*</a>";

require '/path/to/markdown.php';

// at this point, the generated HTML is vulnerable to XSS
$content = Markdown($content);

require '/path/to/HTMLPurifier/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
$config->set('Core.Encoding', 'UTF-8');
$config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
$config->set('Cache.DefinitionImpl', null);

// put here every tag and attribute that you want to pass through
$config->set('HTML.Allowed', 'a[href|title],blockquote[cite]');

$purifier = new HTMLPurifier($config);

// here, the javascript command is stripped off
$content = $purifier->purify($content);

print $content;

      



Solved ...

$text = "> hello <a name=\"n\"
> href=\"javascript:alert('xss')\">*you*</a>";


$text = strip_tags($text);

$text = Markdown($text);

echo $text;

      

It returns:

<blockquote>
  <p>hello  href="javascript:alert('xss')"&gt;<em>you</em></p>
</blockquote>

      

And not:



<blockquote>
  <p>hello <a name="n" href="javascript:alert('xss')"><em>you</em></a></p>
</blockquote>

      

strip_tags() seems to work.

In addition:

$text = preg_replace('/href=(\"|)javascript:/', "", $text);

      

With this, all input should be cleaned of XSS injections. Correct me if I am wrong.



The HTML output of your Markdown depends only on the Markdown parser, so you can

  • convert your Markdown to HTML and sanitize the HTML afterwards, as described here:

    Exception to Markdown XSS vulnerability?

  • or change your Markdown parser to check every parameter that goes into an HTML attribute for signs of XSS. Of course, you should escape HTML tags before parsing. I think this solution is much faster than the other, because in plain text you usually only need to check the URLs of images and links.
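
The URL check from the second option could look roughly like the sketch below. The function isSafeUrl() and its default scheme list are assumptions made for illustration and are not part of any Markdown parser; where to call it depends on how the parser in use builds its links and images.

```php
<?php
/**
 * Hypothetical helper: accept a URL only if it is relative or uses
 * one of the allowed schemes.
 */
function isSafeUrl(string $url, array $allowedSchemes = ['http', 'https', 'mailto']): bool
{
    // Drop whitespace and control characters before checking, so that
    // something like "java\nscript:" cannot slip past the scheme test.
    $url = preg_replace('/[\x00-\x20\x7F]+/', '', $url);

    // No scheme at the start means a relative URL, which is accepted.
    if (!preg_match('/^([a-z][a-z0-9+.\-]*):/i', $url, $m)) {
        return true;
    }

    // Compare the scheme case-insensitively against the allow-list.
    return in_array(strtolower($m[1]), $allowedSchemes, true);
}
```

An allow-list is used here instead of searching for javascript: because other schemes, such as data:, can carry scripts as well. The accepted URL should still be escaped with htmlspecialchars() when it is written into the attribute.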