Is there a way to read an Excel file's properties using xlrd? I don't mean cell presentation properties, but general workbook properties.
Thanks a lot in advance.
Apart from the user name (the last person to save the file), the Book instance returned by open_workbook does not seem to expose any workbook properties.
I recursively dumped the Book (dumping the __dict__ of every attribute that is an xlrd BaseObject) and could not find anything that way, even though the test files definitely had an author, a company and some custom metadata.
FWIW: LibreOffice does not seem to find the author and company either (or does not display them), but it does show the custom metadata in the properties dialog.
I couldn't find a way to do this with xlrd, but if you only have to read .xlsx files, you can treat them as ZIP archives and read the properties XML file(s) directly. You can see the layout yourself by changing the .xlsx extension to .zip and opening the file on Windows. An example of reading custom-defined properties is below.
from lxml import etree as ET
import zipfile


def get_custom_properties(filename):
    # An .xlsx file is a ZIP archive; custom properties are stored
    # in docProps/custom.xml (absent if the file has none)
    with zipfile.ZipFile(filename) as archive:
        text = archive.read('docProps/custom.xml')
    xml = ET.fromstring(text)
    # Works on my example document, but I don't know if every
    # child node will always have exactly one nested node
    return {
        child.attrib['name']: child[0].text
        for child in xml
    }
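The same ZIP approach should also reach the author and company fields mentioned earlier in the thread. The sketch below is untested against real-world files and rests on a few assumptions: that the built-in properties sit at the conventional paths docProps/core.xml and docProps/app.xml, that the author is stored as dc:creator, and that the company is the Company element of the extended properties. The function name get_builtin_properties and the result keys are made up for illustration. It uses the standard library's xml.etree.ElementTree instead of lxml, so it has no third-party dependency.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the standard OOXML property parts.
CORE_NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}
APP_NS = {
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def _find_text(root, path, namespaces):
    """Return the text of the first matching element, or None if absent."""
    node = root.find(path, namespaces)
    return node.text if node is not None else None


def get_builtin_properties(filename):
    """Read author, last-modified-by and company from an .xlsx file.

    Assumes the conventional part names docProps/core.xml and
    docProps/app.xml; any value that cannot be found is returned as None.
    """
    result = {"creator": None, "last_modified_by": None, "company": None}
    with zipfile.ZipFile(filename) as archive:
        names = set(archive.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(archive.read("docProps/core.xml"))
            result["creator"] = _find_text(core, "dc:creator", CORE_NS)
            result["last_modified_by"] = _find_text(
                core, "cp:lastModifiedBy", CORE_NS
            )
        if "docProps/app.xml" in names:
            app = ET.fromstring(archive.read("docProps/app.xml"))
            result["company"] = _find_text(app, "ep:Company", APP_NS)
    return result
```

Calling get_builtin_properties('workbook.xlsx') (a placeholder file name) returns a plain dict, so it can be merged with the output of get_custom_properties if you want everything in one place. Like the custom-properties function, this only works for .xlsx files, not for the older binary .xls format.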