Number of pages of a Word document with Python

Posted 2019-01-26 23:23

Question:

Is there an efficient way to get the number of pages of a Word document (.doc, .docx) with Python?

And for an .odt file?

I want to use this for a web application based on Web2py on Linux.

Thank you !

Answer 1:

You can read the value

<Properties>
<Pages>CountValue</Pages>

from docProps/app.xml in the docx package or

<office:document-meta>
    <office:meta>
        <meta:document-statistic meta:page-count="CountValue">

from meta.xml in the odt package.

If these values do not exist (they are optional), you have to compute the count from the entire document, which in practice means rendering it, and that is much more difficult.
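Since both .docx and .odt files are ZIP packages, the lookup described above can be sketched with the standard library alone. This is only a minimal sketch: the function names `docx_page_count` and `odt_page_count` are made up for illustration, and it assumes the usual namespaces for the extended-properties part and for ODF metadata. It returns None when the optional value is missing, and it does not handle the binary .doc format. Keep in mind the number is whatever the application that last saved the file wrote there, so it may be stale.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces assumed for the two metadata files.
APP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def docx_page_count(path):
    """Return the stored <Pages> value from docProps/app.xml, or None."""
    with zipfile.ZipFile(path) as package:
        try:
            data = package.read("docProps/app.xml")
        except KeyError:  # the part itself is optional
            return None
    pages = ET.fromstring(data).find(f"{{{APP_NS}}}Pages")
    text = (pages.text or "").strip() if pages is not None else ""
    return int(text) if text.isdigit() else None


def odt_page_count(path):
    """Return the stored meta:page-count attribute from meta.xml, or None."""
    with zipfile.ZipFile(path) as package:
        try:
            data = package.read("meta.xml")
        except KeyError:
            return None
    stat = ET.fromstring(data).find(f".//{{{META_NS}}}document-statistic")
    if stat is None:
        return None
    value = (stat.get(f"{{{META_NS}}}page-count") or "").strip()
    return int(value) if value.isdigit() else None
```

Calling `docx_page_count("report.docx")` (the file name is a placeholder) gives either an integer or None; in the None case you are back to rendering the document to count its pages.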



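Answer 2:

For anyone who lands on this post through a search: on Windows with Microsoft Word installed, you can ask Word itself for the page count through COM (this needs the pywin32 package, so it will not work on Linux).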

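from win32com.client import Dispatch

doc_path = r'C:\path\to\document.docx'  # placeholder: absolute path to your file

# Open Word in the background
word = Dispatch('Word.Application')
word.Visible = False
doc = word.Documents.Open(doc_path)

# Repaginate, then read the page count (2 = wdStatisticPages)
doc.Repaginate()
num_of_pages = doc.ComputeStatistics(2)

# Close the document without saving and quit Word
doc.Close(False)
word.Quit()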