Embed indented HTML in Markdown with Pandoc

Posted 2020-07-13 13:07

Question:

I have some embedded HTML in my Markdown (bulleted list within a table). Is there a way to indent my HTML without Pandoc treating it as a verbatim code block?

Answer 1:

Sort of, but you have to change Pandoc's defaults.

The Markdown rules expressly prohibit this:

The only restrictions are that block-level HTML elements — e.g. <div>, <table>, <pre>, <p>, etc. — must be separated from surrounding content by blank lines, and the start and end tags of the block should not be indented with tabs or spaces.

However, note that the rule quoted above only says that "the start and end tags of the block should not be indented." Nothing restricts indenting the content between the start and end tags. In fact, under standard Markdown that content is not processed as Markdown at all, so feel free to indent away. In other words, this is completely acceptable:

<table>
    <thead>
        <tr>
            <th>A header</th>
        </tr>
    </thead>
</table>

Except that it doesn't work in Pandoc by default. As explained in Pandoc's documentation:

Standard Markdown allows you to include HTML “blocks”: blocks of HTML between balanced tags that are separated from the surrounding text with blank lines, and start and end at the left margin. Within these blocks, everything is interpreted as HTML, not Markdown; so (for example), * does not signify emphasis.

Pandoc behaves this way when the markdown_strict format is used; but by default, pandoc interprets material between HTML block tags as Markdown.

Therefore you need to either use the markdown_strict input format, or stay with Pandoc's Markdown and disable the markdown_in_html_blocks extension while keeping raw_html enabled.

For "strict mode" use:

pandoc --from markdown_strict

Or, to stay out of strict mode but still get the HTML behavior you want, disable the markdown_in_html_blocks extension and enable the raw_html extension:

pandoc --from markdown-markdown_in_html_blocks+raw_html
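
To try both invocations end to end, here is a minimal sketch. The file name sample.md, the temporary directory, and the table contents are assumptions made up for illustration; they are not from the original question. It writes a small Markdown file containing an indented HTML table with a bulleted list, then converts it with each of the two input formats if pandoc is on the PATH.

```shell
#!/bin/sh
# Sketch only: the paths and the sample content below are hypothetical.
workdir=$(mktemp -d)
sample="$workdir/sample.md"

# A Markdown file with an indented HTML block (a list inside a table).
# The opening and closing <table> tags sit at the left margin, as the rule requires.
cat > "$sample" <<'EOF'
Some text before the table.

<table>
    <tr>
        <td>
            <ul>
                <li>First item</li>
                <li>Second item</li>
            </ul>
        </td>
    </tr>
</table>

Some text after the table.
EOF

# Pandoc's Markdown with one extension disabled (-) and one enabled (+).
from_format='markdown-markdown_in_html_blocks+raw_html'

if command -v pandoc >/dev/null 2>&1; then
    # Variant 1: Pandoc's Markdown, without Markdown parsing inside HTML blocks.
    pandoc --from "$from_format" --to html "$sample"
    # Variant 2: strict (original) Markdown behavior.
    pandoc --from markdown_strict --to html "$sample"
else
    echo "pandoc not found; skipping conversion" >&2
fi
```

With either variant, the table should come through as raw HTML rather than as a verbatim code block. Running plain pandoc --from markdown on the same file is a quick way to compare against the default behavior.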