I've mostly only used xlwings to open (read-write) workbooks (since the workbooks I read have complicated macros). But I've recently begun using openpyxl to open (read-only) workbooks when I've needed to read thousands of workbooks to scrape some data.
I've noticed that there is a considerable difference between how xlwings and openpyxl read workbooks. I believe xlwings relies on pywin32 to read workbooks. When you read a workbook with xlwings.Book(<filename>), the actual workbook opens up. I have a feeling this is a result of pywin32.
However, when using openpyxl.load_workbook(<filename>), a workbook window does not appear. I have a feeling this is because openpyxl does not use pywin32.
Beyond this, I have no further understanding of how the backends work for each library. Could someone shed some light on this? Is there a benefit/cost to relying on xlwings and pywin32 for reading workbooks, as opposed to openpyxl, which does not seem to use pywin32?
You are correct that xlwings relies on pywin32 (on Windows), whereas openpyxl does not.
openpyxl
A ".xlsx" Excel file is essentially a zip file containing multiple XML files formatted according to Microsoft's OOXML specification. With this specification it's possible to create a program capable of directly reading/writing Excel files in just about any programming language. This is the approach applied in openpyxl: it uses Python code to read/write Excel files directly.
xlwings
A Microsoft Excel application can be started and controlled by an external program through the Win32 COM API. The pywin32 package provides an interface between Win32 COM and Python. Through a Python script with the right pywin32 commands you can fully control an Excel application (open Excel files, query data from cells, write data to cells, save Excel files, etc.). The pywin32 commands that you can use mirror the Excel VBA commands, albeit with Python syntax.
xlwings is (among other things) a user-friendly wrapper around pywin32. It introduces several concise yet powerful methods. An example would be the methods for direct conversion of an Excel cell range to a NumPy array or pandas DataFrame (and vice versa).
Summary
A fundamental difference between xlwings and openpyxl is that the former requires that MS Excel is installed on your machine, whereas the latter does not.
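To make the zip-of-XML point from the openpyxl section concrete, here is a minimal sketch that uses only the Python standard library. It is not how openpyxl is implemented. It only illustrates that worksheet data can be read without Excel running. The worksheet XML is a hypothetical, heavily trimmed stand-in, and the resulting archive is a toy: it lacks the other parts a real workbook needs, so neither Excel nor openpyxl would be expected to open it. The helper names build_toy_archive and read_numeric_cells are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical, heavily trimmed worksheet part: one row, two numeric cells.
SHEET_XML = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1"><c r="A1"><v>1</v></c><c r="B1"><v>2.5</v></c></row>'
    "</sheetData>"
    "</worksheet>"
)


def build_toy_archive() -> bytes:
    """Zip a single worksheet part under the path a real workbook would use."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return buffer.getvalue()


def read_numeric_cells(data: bytes) -> dict:
    """Map cell references (e.g. 'A1') to the numbers stored in <v> elements."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("xl/worksheets/sheet1.xml"))
    cells = {}
    for cell in root.iterfind(".//m:c", NS):
        value = cell.find("m:v", NS)
        if value is not None:
            cells[cell.get("r")] = float(value.text)
    return cells


cells = read_numeric_cells(build_toy_archive())
print(cells)
```

No Excel process or window is involved at any point, which matches what you observe with openpyxl.load_workbook. A real workbook is more involved (for example, text cells usually point into a separate shared-strings part), and handling those details is the work a library like openpyxl does for you.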