figure and figcaption not converting correctly to

Posted 2019-08-17 22:19

I have HTML5 source code (with full doctype/head/body) that needs to be converted to a Word DOCX file. The HTML file is a generated page with minimal formatting (H1/H2/P) and images (img).

Each image sits in a FIGURE element: the IMG inside it carries the image source (SRC) attribute, and a FIGCAPTION tag contains the caption for the image, similar to this (from https://www.w3schools.com/tags/tag_figcaption.asp ):

<figure>
  <img src="img_pulpit.jpg" alt="The Pulpit Rock" width="304" height="228">
  <figcaption>Fig1. - A view of the pulpit rock in Norway.</figcaption>
</figure>

The image and caption show properly when the HTML5 page is viewed in a browser.

The issue arises when importing that HTML5 document into Word 2010 and saving it as a DOCX document (via File, Open, then File, Save As a DOCX). The caption (figcaption) is not converted into a DOCX image caption; it is displayed separately, outside the image. If you look at the image's attributes in Word, the caption is not there: it is just text that is not 'part of' the image.

How do I get the figcaption text to become the image's caption in the DOCX file?

(I don't have HTML-to-DOCX converters such as Pandoc available. I have tried several HTML-to-DOCX JS converters, and they don't solve the problem. Note that this issue is not with displaying the HTML in a browser, but with the conversion of HTML into DOCX when there are figure/figcaption tags.)

Added: the intent is to get pictures with their captions into the DOCX with additional text content. Pictures need to be side-by-side, not in separate 'rows'.
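
To make the side-by-side requirement concrete, here is a minimal, untested-in-Word sketch of one possible preprocessing step, not a confirmed fix. It assumes the generated HTML uses exactly the figure/img/figcaption shape shown above, and that Python is available even though a full converter is not. The function name figures_to_table is hypothetical. It rewrites each run of adjacent figures into a one-row table, with each image and its caption text in the same cell, so that they stay together and sit next to each other. Whether Word 2010 then treats that paragraph as a true caption is not something this sketch guarantees.

```python
import re

# One <figure> holding a single <img> followed by a <figcaption>.
# Assumption: the generated HTML always follows this simple shape.
FIGURE_RE = re.compile(
    r"<figure\b[^>]*>\s*(<img\b[^>]*>)\s*"
    r"<figcaption\b[^>]*>(.*?)</figcaption>\s*</figure>",
    re.IGNORECASE | re.DOTALL,
)

# A run of one or more figures separated only by whitespace.
FIGURE_RUN_RE = re.compile(
    r"(?:%s\s*)+" % FIGURE_RE.pattern,
    re.IGNORECASE | re.DOTALL,
)


def _run_to_table(match):
    """Turn one run of adjacent figures into a single-row table."""
    figures = FIGURE_RE.findall(match.group(0))
    cells = "".join(
        # Image and caption share a cell so they stay together.
        "<td>%s<p>%s</p></td>" % (img, caption.strip())
        for img, caption in figures
    )
    return "<table><tr>%s</tr></table>\n" % cells


def figures_to_table(html):
    """Replace every run of <figure> blocks with a one-row table."""
    return FIGURE_RUN_RE.sub(_run_to_table, html)


if __name__ == "__main__":
    sample = (
        '<figure>\n'
        '  <img src="img_pulpit.jpg" alt="The Pulpit Rock" width="304" height="228">\n'
        '  <figcaption>Fig1. - A view of the pulpit rock in Norway.</figcaption>\n'
        '</figure>\n'
        '<p>Additional text content.</p>'
    )
    print(figures_to_table(sample))
```

The rewritten HTML would then be opened in Word and saved as DOCX in the same way as before. A regular expression is used here only because the input is machine-generated and uniform; hand-written HTML would need a real parser.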
