Is there an efficient way to get the number of pages of a Word document (.doc, .docx) with Python?
And what about an .odt file?
I want to use this in a web application based on Web2py on Linux.
Thank you!
You can read the value
<Properties>
<Pages>CountValue</Pages>
from docProps/app.xml in the docx package or
<office:document-meta>
<office:meta>
<meta:document-statistic meta:page-count="CountValue">
from meta.xml in the odt package.
If these values do not exist (they are optional), you have to compute the page count from the entire document, which in practice means rendering it, and that is much more difficult.
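As a minimal sketch of the metadata approach described above, both packages can be opened as zip archives with the standard library. The function names docx_page_count and odt_page_count are my own and purely illustrative. Note that the stored number is whatever the last application wrote when saving, so it may be stale or missing, and that this does not cover the old binary .doc format, which is not a zip package.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs of the elements and attributes that carry the count.
EXT_PROPS_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
ODF_META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def _read_part(path, part_name):
    """Parse one XML part of a zip package; return None if the part is absent."""
    with zipfile.ZipFile(path) as package:
        try:
            return ET.fromstring(package.read(part_name))
        except KeyError:  # ZipFile.read raises KeyError for a missing member
            return None


def docx_page_count(path):
    """Hypothetical helper: stored <Pages> value of a .docx, or None."""
    root = _read_part(path, "docProps/app.xml")
    if root is None:
        return None
    pages = root.find("{%s}Pages" % EXT_PROPS_NS)
    if pages is None or not (pages.text or "").strip():
        return None
    return int(pages.text)


def odt_page_count(path):
    """Hypothetical helper: stored meta:page-count of an .odt, or None."""
    root = _read_part(path, "meta.xml")
    if root is None:
        return None
    stat = root.find(".//{%s}document-statistic" % ODF_META_NS)
    if stat is None:
        return None
    value = stat.get("{%s}page-count" % ODF_META_NS)
    return int(value) if value is not None else None
```

Both functions return None when the value is not stored, which is the case where you would have to fall back to rendering the document.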
For anyone who finds this entry through a search: on Windows with Word installed, you can let Word itself do the counting through COM automation.
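from win32com.client import Dispatch
# doc_path is assumed to hold the absolute path of the .doc/.docx file
# open Word without showing its window
word = Dispatch('Word.Application')
word.Visible = False
# keep the document in its own variable so the application object is not lost
doc = word.Documents.Open(doc_path)
try:
    # refresh pagination, then get the number of sheets (2 = wdStatisticPages)
    doc.Repaginate()
    num_of_sheets = doc.ComputeStatistics(2)
finally:
    # close the document without saving and shut Word down
    doc.Close(False)
    word.Quit()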