Is there a workaround for adding headers and footers in a Microsoft Word (docx
) file?
These are not implemented in versions of python-docx
prior to 0.8.8
.
More specifically, I would like to add:
- Page number to footers
- Some random text to headers
The ideal code will look as follows:
from docx import Document
document = Document()
# Add header and footer on all pages
document.save("demo.docx")
How about something like this (Thanks to Eliot K)
from docx import Document
import win32com.client as win32
import os.path
import tempfile
tempdir = tempfile.gettempdir()
msword = win32.gencache.EnsureDispatch('Word.Application')
tempfile = os.path.join(tempdir, "temp.doc")
document = Document()
document.save(tempfile)
doc = msword.Documents.Open(tempfile)
doc.Sections(1).Footers(1).Range.Text = r'Text to be included'
doc.Sections(1).Footers(1).PageNumbers.Add()
doc.SaveAs(tempfile, FileFormat = 0)
document = Document(tempfile)
Not the most elegant approach perhaps, but should do what you need it to.
Maybe sequester the ugly save/load code in a function somewhere in a dusty corner of your code ;-)
Again, does require a windows machine with microsoft office installed.
Good luck!
The template approach works and its major advantage is that it is a truly cross-platform solution. However, it requires that a style has already been applied once in the document.
Let's consider a (simplified) version of the toy example from the python-docx
documentation page.
The first step involves creating the template document:
from docx import Document
document = Document()
document.add_heading('Document Title', 0)
p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True
document.add_heading('Heading, level 1', level=1)
document.add_paragraph('Intense quote', style='IntenseQuote')
document.add_paragraph(
'first item in unordered list', style='ListBullet'
)
document.add_paragraph(
'first item in ordered list', style='ListNumber'
)
document.save('demo.docx')
(Note that you can also apply the styles manually in this first step without using python-docx
, that is from within Word.)
Next, you open this demo.docx
in Microsoft Word where you:
- add the desired header
- insert the page numbers from the menu
- save the document
Once you have done the above, you simply delete the main contents of the demo.docx
document (but not the content of the header and footer!) and save the file again.
In the second step, you call demo.docx
using python-docx
to make the changes you need:
from docx import Document
document = Document('demo.docx')
document.add_heading('A New Title for my Document', 0)
p = document.add_paragraph('A new paragraph having some plain ')
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True
document.add_heading('New Heading, level 1', level=1)
document.add_paragraph('Intense quote', style='IntenseQuote')
document.add_paragraph(
'first new item in unordered list', style='ListBullet'
)
document.add_paragraph(
'first new item in ordered list', style='ListNumber'
)
document.save('demo.docx')
You can even make further content additions, such as a table with an existing table style:
from docx import Document
document = Document('demo.docx')
document.add_page_break()
recordset = [ [1, "101", "Spam"], [2, "42", "Eggs"], [3, "631", "Spam, spam, eggs, and spam"]]
table = document.add_table(rows=1, cols=3)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = 'Qty'
hdr_cells[1].text = 'Id'
hdr_cells[2].text = 'Desc'
for item in recordset:
row_cells = table.add_row().cells
row_cells[0].text = str(item[0])
row_cells[1].text = str(item[1])
row_cells[2].text = item[2]
table.style = 'ColorfulShading'
document.save('demo.docx')
Of course, one can avoid repeating the first step all the time, by copying the customized file and then making the necessary changes there (e.g. demo_copy.docx
) without affecting the template:
import shutil
shutil.copyfile('demo.docx', 'demo_copy.docx')
Finally, it is worth mentioning that you can also use customized styles! For an example of how to do this using python-docx
and table styles see here.
It's not the most elegant (it requires you to navigate between VBA and Python), but you can use the win32com library to tap into the MS Word functionality. This of course requires a Windows machine with MS Office installed.
import win32com.client as win32
msword = win32.gencache.EnsureDispatch('Word.Application')
doc = msword.Documents.Add
doc.Sections(1).Footers(1).Range.Text = r'Text to be included'
doc.Sections(1).Footers(1).PageNumbers.Add()
One of the workarounds you could use is utilizing a template document created within Word. Create a blank document, add whatever header with text you want and footer with page numbers and save the document. Then use:
from docx import Document
document = Document("template.docx")
# Do your editing
document.save("demo.docx")
... and you should be able to edit everything else while preserving the header and footer.
Ultimately I think this solution would work great for the page numbers issue. If you need unique header text for each document things could get a bit tricky. If that is the case, you might want to try editing the XML of the docx file directly. You can use this in the terminal:
unzip template.docx
... to get the docx XML files spat out into the directory. You can also use zipfile to do it within python:
import zipfile
document = zipfile.ZipFile("template.docx")
for xml in document.filelist:
if "header" in xml.filename:
read = document.read(xml.filename)
print(read.decode())
The print statement will print the entire XML file, but you should be able to find this tidbit:
<w:r><w:t>ThisIsMyHeader</w:t></w:r>
Which will be the text in your header. All you will have to do is edit the XML file, combine the files back together, and then change the file type back to docx.
This is obviously a super hacky workaround and unfortunately I have not been able to get it to fully work, but it at least would be a good step in the right direction if you absolutely need it.
Good luck!
I've been using it to work
from docx import Document
document = Document()
header = document.sections[0].header
header.add_paragraph('Test Header')
header = document.sections[0].footer
header.add_paragraph('Test Footers')
https://python-docx.readthedocs.io/en/latest/dev/analysis/features/header.html