How to add a page break in word document generated

Posted 2019-01-11 09:55

Question:

I am writing a Word document with R Markdown in RStudio. Most things work, but I cannot figure out how to insert a page break. The solutions I have found apply only to rendered LaTeX / PDF documents, which is not my case.

Answer 1:

There is an easier way: use a fifth-level header block (#####) together with a docx template defined in the YAML header.

Create headingfive.docx in Microsoft Word, open Modify Style for Heading 5, tick Page break before on the Line and Page Breaks tab, and save the file.

---
title: 'Making page break using fifth-level header block'
output: 
  word_document:
    reference_docx: headingfive.docx
---

In your Rmd document, set reference_docx in the YAML header as shown above; you can then use ##### wherever you want a page break.

For details, see:

https://www.r-bloggers.com/r-markdown-how-to-insert-page-breaks-in-a-ms-word-document/



Answer 2:

With the help of John MacFarlane and others on the pandoc Google group, I put together a filter that does this. Please see: https://groups.google.com/forum/#!topic/pandoc-discuss/FzLrhk0vVbU

In short, the filter looks for a marker and replaces it with the OpenXML for a page break. In this case \newpage is replaced with <w:p><w:r><w:br w:type="page"/></w:r></w:p>. This allows a single LaTeX command to be interpreted for both PDF and Word output.

Joel



Answer 3:

What you are trying to do is force a "page break" or "new page" in a word document generated with Pandoc. I have found a way to do this in my environment but I'm not sure it will work in every environment.

My environment: RStudio / Pandoc / MS Word, starting with an "*.Rmd" file and generating a DOCX file.

In my RMD file the key idea is that I've created what acts like a TEMPLATE document (MyFormattingDocument.docx), and in that Word document I tweak the STYLES for things like "Heading 1" and/or "Heading 2" and/or "footnote" or whatever other predefined styles I want to tweak.

(SEE THIS: http://rmarkdown.rstudio.com/word_document_format.html#style-reference ) for explanation of style reference and how to set the header information in your RMD file to specify a reference document.

SOOOO in my case... I tweak the "Heading 1" style in WORD to include a forced "Page Break Before" in the Paragraph formatting. Exactly how you do this differs between versions of Microsoft WORD, but if you follow the WORD documentation and modify the "Heading 1" style, THEN every "Heading 1" will always have a page break before it.

THEN... you save this template file in the same directory you're working from with the RMD file... and it is USED AS a template. THE CONTENTS of the file are ignored, so don't worry: you can put sample text in this file and test that the formatting all works. Only the STYLES are USED in the new Word document built from the RMD file, so every "Heading 1" will have a break before it.

NOTE: You could obviously do the same with ANY style that has a one-to-one mapping from PANDOC MARKUP, so you could instead use "Heading 3" or whatever. Just look in your RMD-created DOCX to see what "STYLE" is being applied and then tweak that style, even if you need to insert some "fake" lines with essentially blank content just to force that style to appear in the DOCX.
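
One way to confirm that the tweak was actually saved in the template is to look inside the file: a .docx is a zip archive, and the style definitions live in word/styles.xml. The following is only a rough sketch under stated assumptions. Both function names are hypothetical, the style ID "Heading1" is what English-language Word typically writes (localized installs may use another ID), and the check is a plain text match rather than real XML parsing.

```r
# Hypothetical helper: does the style with this styleId carry a
# <w:pageBreakBefore/> element in the given styles.xml text?
style_breaks_page <- function(styles_xml, style_id = "Heading1") {
  xml <- paste(styles_xml, collapse = "\n")
  # Isolate the <w:style ...>...</w:style> element for the requested style
  pattern <- sprintf('(?s)<w:style [^>]*w:styleId="%s"[^>]*>.*?</w:style>', style_id)
  hit <- regmatches(xml, regexpr(pattern, xml, perl = TRUE))
  if (length(hit) == 0) return(NA)  # style not defined in this file
  # Simplification: any pageBreakBefore element counts, even one switched off
  grepl("<w:pageBreakBefore", hit, fixed = TRUE)
}

# Hypothetical wrapper: read word/styles.xml out of a .docx and run the check
docx_style_breaks_page <- function(docx, style_id = "Heading1") {
  con <- unz(docx, "word/styles.xml")
  on.exit(close(con))
  style_breaks_page(readLines(con, warn = FALSE), style_id)
}

# Hand-written fragment standing in for a real styles.xml
sample_styles <- paste0(
  '<w:styles>',
  '<w:style w:type="paragraph" w:styleId="Heading1">',
  '<w:name w:val="heading 1"/><w:pPr><w:pageBreakBefore/></w:pPr></w:style>',
  '<w:style w:type="paragraph" w:styleId="Heading2">',
  '<w:name w:val="heading 2"/></w:style>',
  '</w:styles>'
)
print(style_breaks_page(sample_styles, "Heading1"))
print(style_breaks_page(sample_styles, "Heading2"))

# With a real template on disk:
# docx_style_breaks_page("MyFormattingDocument.docx")
```

The function returns NA when the style ID is not found, which is a hint to open the template and check which ID your version of Word uses.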



Answer 4:

Here is an R script that can be used as a pandoc filter to replace LaTeX page breaks (\newpage) with Word page breaks, per @JAllen's answer above. With this you don't need to compile a pandoc script. Since you are working in R Markdown, I assume you have R available on the system.

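#!/usr/bin/env Rscript

# Pandoc JSON filter: the document AST arrives on stdin as JSON text
json_in <- file('stdin', 'r')
# Raw LaTeX \newpage block as it appears in the JSON AST
lat_newp <- '{"t":"RawBlock","c":["latex","\\\\newpage"]}'
# Raw OpenXML page break to substitute for it
doc_newp <- '{"t":"RawBlock","c":["openxml","<w:p><w:r><w:br w:type=\\"page\\"/></w:r></w:p>"]}'
ast <- paste(readLines(json_in, warn=FALSE), collapse="\n")
close(json_in)
# Literal (non-regex) replacement of every \newpage block
ast <- gsub(lat_newp, doc_newp, ast, fixed=TRUE)
# Write the modified AST to stdout for pandoc to pick up
write(ast, "")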

Save this as page-break-filter.R or something like that and make it executable by running chmod +x page-break-filter.R in the terminal.

Then include this filter in the R Markdown YAML like so:

---
title: "Title"
author: "Author"
output:
  word_document:
    pandoc_args: [
      "--filter", "/path/to/page-break-filter.R"
    ]
---
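
Before wiring the filter into a document, the string replacement can be tried on a hand-written AST fragment, without running pandoc at all. The snippet below is only a sketch of that idea. `replace_newpage()` is a hypothetical helper name, the sample JSON is typed by hand rather than produced by pandoc, and also matching a "tex" format tag is an assumption about how some pandoc versions label raw TeX blocks.

```r
# Hypothetical helper (not part of pandoc or rmarkdown): swap raw \newpage
# blocks in a pandoc JSON AST string for a raw OpenXML page break.
replace_newpage <- function(ast) {
  # Replacement block, written exactly as it should appear in the JSON text
  doc_newp <- '{"t":"RawBlock","c":["openxml","<w:p><w:r><w:br w:type=\\"page\\"/></w:r></w:p>"]}'
  # Assumption: the raw block may be tagged "latex" or "tex"
  for (fmt in c("latex", "tex")) {
    lat_newp <- sprintf('{"t":"RawBlock","c":["%s","\\\\newpage"]}', fmt)
    ast <- gsub(lat_newp, doc_newp, ast, fixed = TRUE)
  }
  ast
}

# Hand-written sample fragment standing in for real pandoc output
sample_ast <- '[{"t":"Para","c":[{"t":"Str","c":"one"}]},{"t":"RawBlock","c":["latex","\\\\newpage"]}]'
result <- replace_newpage(sample_ast)
cat(result, "\n")
```

If the printed fragment still contains \newpage when you feed it real pandoc output, the JSON layout of your pandoc version probably differs from the literal string the filter searches for.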


Answer 5:

You can use the R package worded, which avoids the need for a template Word file. See https://github.com/davidgohel/worded.

Set the output parameter to worded::rdocx_document and load the package with library(worded).

---
date: "2018-03-27"
author: "David Gohel"
title: "Document title"
output: 
  worded::rdocx_document
---

```{r setup, include=FALSE}
library(worded)
```

You can then add <!---CHUNK_PAGEBREAK---> to your document whenever you want a page break.

The package offers various other Word formatting options through a similar mechanism.



Answer 6:

Sungpil's article was close, but didn't quite work. This was the best solution I found: https://scriptsandstatistics.wordpress.com/2015/12/18/rmarkdown-how-to-inserts-page-breaks-in-a-ms-word-document/

Even better, the author included the Word template needed to make this work. The R-bloggers link to his template is broken, and the header there is formatted incorrectly. Some notes I took:

1) You might need to include the whole path to the word template in your Rmd header, like so:

output: 
    word_document:
      reference_docx: C:/workspace/myproject/mystyles.docx

2) The template at the link above changes some of the default style settings, so you'll need to change them back.
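
If you render from the R console instead of the Knit button, the full path in note 1 can be worked out for you. This is a minimal sketch, not the author's method: `absolute_template()` and `render_with_template()` are hypothetical helper names, and the second one assumes the rmarkdown package and pandoc are installed.

```r
# Hypothetical helper: resolve a template to an absolute path, failing early
# if the file is missing.
absolute_template <- function(template) {
  if (!file.exists(template)) {
    stop("Reference docx not found: ", template)
  }
  normalizePath(template, winslash = "/")
}

# Hypothetical helper: render an Rmd to Word using the resolved template path.
# Needs the rmarkdown package and pandoc; it is not called in this sketch.
render_with_template <- function(rmd, template) {
  rmarkdown::render(
    rmd,
    output_format = rmarkdown::word_document(
      reference_docx = absolute_template(template)
    )
  )
}

# Example call (file names are placeholders):
# render_with_template("report.Rmd", "mystyles.docx")
```

Stopping when the template is missing makes a wrong path show up as an explicit error instead of a document that silently lacks the expected styles.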



Answer 7:

My solution is not very robust but may work for some of us. Assuming you need a page break before each level 1 title in your Word document, you can define this in the format template referenced by the YAML field reference_docx. In that template, modify the Heading 1 format (or equivalent) to insert a page break before the title. Do not forget to build your template from a docx first rendered with knitr (pandoc) in RStudio.



Answer 8:

This is not an automated solution, but I have been adding the text '#####page break' to my markdown document. Then, in MS Word, I use find-replace to replace the text "page break" with "^m" (manual page break).



Answer 9:

Ok, I found this in the markdown docs.

Horizontal Rule / Page Break

Three or more asterisks *** or dashes ---.