Read workbook properties using Python and xlrd

Posted 2019-05-04 19:43

Question:

Is there a way to read Excel file properties using xlrd? I don't mean cell presentation properties, but general workbook properties.

Thanks a lot in advance.

Answer 1:

Apart from the username (the last person to save the workbook), the Book instance returned by open_workbook does not seem to expose any workbook properties.

I recursively dumped the Book (dumping the __dict__ of each attribute that was an xlrd.BaseObject) and could not find anything that way. The test files definitely had an author, a company and some custom metadata.

FWIW: LibreOffice does not seem to be able to find author and company either (or does not display them), but it does show custom metadata in the properties.
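
The recursive dump mentioned in this answer could be sketched roughly as follows. This is not the code the answer used: the function name dump_attributes is made up for illustration, and it recurses into anything that has a __dict__ rather than checking for xlrd.BaseObject, so it runs without xlrd installed.

```python
def dump_attributes(obj, prefix="book", seen=None):
    """Yield (path, value) pairs for obj's attributes, recursing into nested objects.

    Assumption: "has a __dict__" stands in for the xlrd.BaseObject check
    described in the answer.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Guard against reference cycles (an object pointing back at its parent).
        return
    seen.add(id(obj))
    for name, value in sorted(vars(obj).items()):
        path = f"{prefix}.{name}"
        if hasattr(value, "__dict__"):
            yield from dump_attributes(value, path, seen)
        elif isinstance(value, (list, tuple)):
            for i, item in enumerate(value):
                if hasattr(item, "__dict__"):
                    yield from dump_attributes(item, f"{path}[{i}]", seen)
                else:
                    yield f"{path}[{i}]", item
        else:
            yield path, value


# Hypothetical usage (requires xlrd and a real file; "example.xls" is a placeholder):
#   import xlrd
#   for path, value in dump_attributes(xlrd.open_workbook("example.xls")):
#       print(path, repr(value)[:80])
```

Searching the printed paths for names such as "author" or "company" is then a quick way to confirm whether the parsed Book carries that metadata at all.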



Answer 2:

I couldn't find a way to do this with xlrd, but if you only need to read .xlsx files, you can treat them as zip archives and read the properties XML file(s) directly. You can see the contents by changing the .xlsx extension to .zip and opening the file on Windows. An example of reading custom-defined properties is below.

from lxml import etree as ET
import zipfile


def get_custom_properties(filename):
    # An .xlsx file is a zip archive; custom properties are stored in
    # docProps/custom.xml (archive.open raises KeyError if it is missing)
    with zipfile.ZipFile(filename) as archive:
        with archive.open('docProps/custom.xml') as props:
            xml = ET.fromstring(props.read())
    # Works on my example document, but I don't know if every
    # child node will always have exactly one nested node
    return {
        child.attrib['name']: child[0].text
        for child in xml
    }
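
The same approach should extend to the built-in properties the question asks about, such as author and company. The sketch below is an illustration, not part of the original answer: the function name get_builtin_properties is made up, it uses the standard library's xml.etree.ElementTree instead of lxml, and it assumes the parts sit at the conventional paths docProps/core.xml and docProps/app.xml (strictly, their locations are declared in the package relationships, so unusual files may differ).

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core and extended (app) property parts.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def get_builtin_properties(filename):
    """Return author, last-modified-by and company; None where a value is absent."""
    result = {"author": None, "last_modified_by": None, "company": None}
    with zipfile.ZipFile(filename) as archive:
        names = set(archive.namelist())
        # Assumed conventional part names; either part may be missing.
        if "docProps/core.xml" in names:
            core = ET.fromstring(archive.read("docProps/core.xml"))
            result["author"] = core.findtext("dc:creator", namespaces=NS)
            result["last_modified_by"] = core.findtext(
                "cp:lastModifiedBy", namespaces=NS
            )
        if "docProps/app.xml" in names:
            app = ET.fromstring(archive.read("docProps/app.xml"))
            result["company"] = app.findtext("ep:Company", namespaces=NS)
    return result
```

Checking namelist() first means a workbook without one of these parts yields None for those fields instead of raising KeyError.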


Tags: python xlrd