Why does React.js' API warn against inserting raw HTML?

Posted 2019-07-21 15:28

Question:

From the tutorial

But there's a problem! Our rendered comments look like this in the browser: "<p>This is <em>another</em> comment</p>". We want those tags to actually render as HTML.

That's React protecting you from an XSS attack. There's a way to get around it but the framework warns you not to use it:

...

<span dangerouslySetInnerHTML={{__html: rawMarkup}} />

This is a special API that intentionally makes it difficult to insert raw HTML, but for Showdown we'll take advantage of this backdoor.

Remember: by using this feature you're relying on Showdown to be secure.

So there exists an API for inserting raw HTML, but the method name and the docs all warn against it. Is it safe to use this? For example, I have a chat app that takes Markdown comments and converts them to HTML strings. The HTML snippets are generated on the server by a Markdown converter. I trust the converter, but I'm not sure if there's any way for a user to carefully craft Markdown to exploit XSS. Is there anything else I should be doing to make sure this is safe?

Answer 1:

Most Markdown processors (and I believe Showdown as well) allow the writer to use inline HTML. For example, a user might enter:

This is _markdown_ with a <marquee>ghost from the past</marquee>. Or even **worse**:
<script>
  alert("spam");
</script>

As such, you should keep a whitelist of allowed tags and strip every other tag after converting from Markdown to HTML. Only then use the aptly named dangerouslySetInnerHTML.

Note that this is also what Stack Overflow does. The above Markdown renders as follows (without you getting an alert thrown in your face):

This is markdown with a ghost from the past. Or even worse:

alert("spam");
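
The whitelist-then-strip step described above can be sketched roughly as follows. This is a minimal illustration, not a vetted sanitizer: the `ALLOWED_TAGS` list and the `sanitizeHtml` name are assumptions made for the example, and all attributes are dropped (so links would lose their `href`). For real user input, a maintained sanitizer library is the safer choice.

```javascript
// Hypothetical allowlist; adjust it to whatever your app considers valid.
const ALLOWED_TAGS = new Set([
  "p", "em", "strong", "code", "pre", "ul", "ol", "li", "blockquote", "br",
]);

// Runs over the converter's HTML output in a single pass.
// - Allowed tags are rebuilt from scratch without attributes.
// - Any other tag is removed (its text content stays).
// - Stray "<" or ">" characters are escaped.
// Because the output is never rescanned, every "<" that survives
// comes from a rebuilt, allowlisted tag.
function sanitizeHtml(html) {
  const tagOrBracket = /<\/?([a-zA-Z][a-zA-Z0-9]*)[^>]*>|[<>]/g;
  return html.replace(tagOrBracket, (match, name) => {
    if (name === undefined) {
      // Second alternative matched: a lone angle bracket.
      return match === "<" ? "&lt;" : "&gt;";
    }
    const tag = name.toLowerCase();
    if (!ALLOWED_TAGS.has(tag)) {
      return ""; // strip tags that are not on the allowlist
    }
    return match.startsWith("</") ? `</${tag}>` : `<${tag}>`;
  });
}

// Example input: what a Markdown converter might emit for the text above.
const rawMarkup =
  '<p>This is <em>markdown</em> with a <marquee>ghost</marquee>.</p>' +
  '<script>alert("spam");</script>';
const safeMarkup = sanitizeHtml(rawMarkup);
```

Here `safeMarkup` keeps the `<p>` and `<em>` tags, drops `<marquee>` and `<script>`, and leaves `alert("spam");` behind as plain text, much like the rendering shown above. Only the sanitized string would then be passed to dangerouslySetInnerHTML.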


Answer 2:

There are three reasons it's best to avoid raw HTML:

  1. security risks (XSS, etc.)
  2. performance
  3. event listeners

The security risks are largely mitigated by Markdown, but you still have to decide what you consider valid, and ensure that everything else is disallowed (maybe you don't allow images, for example).

The performance issue is only relevant when something in the markup changes. For example, suppose you generated HTML with this: "Time: <b>" + new Date() + "</b>". With regular elements, React would update only the textContent of the <b/> element; with raw HTML it instead replaces everything, and the browser must reparse the HTML. With larger chunks of HTML, this becomes more of a problem.

If you want to know when someone clicks a link in the results, you've lost the ability to do so simply. You'd need to add an onClick listener to the closest React node, figure out which element was clicked, and delegate actions from there.
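
The delegation described above might look something like this. It is only a sketch: `findClickedLink` and `handleClick` are hypothetical helper names, and it assumes the handler is attached as onClick to the element that carries dangerouslySetInnerHTML, so that `event.currentTarget` is that container.

```javascript
// Walk up from the clicked node until we find a link
// or reach the container that owns the listener.
function findClickedLink(target, container) {
  let node = target;
  while (node && node !== container) {
    if (node.tagName === "A") {
      return node;
    }
    node = node.parentNode;
  }
  return null;
}

// Hypothetical handler: calls onLinkClick only when the click
// landed on (or inside) a link within the raw HTML.
function handleClick(event, onLinkClick) {
  const link = findClickedLink(event.target, event.currentTarget);
  if (link) {
    onLinkClick(link.href, event);
  }
}
```

In a component this could be wired up as onClick={(e) => handleClick(e, (href) => console.log(href))} on the same element that receives the raw markup.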

If you would like to use Markdown in React, I recommend a pure React renderer, e.g. vjeux/markdown-react.