We seem to be seeing more and more questions about executing awk on Excel spreadsheets so here is a Q/A on how to do that specific thing.
I have this information in an Excel spreadsheet "$D/staff.xlsx"
(where "$D"
is the path to my Desktop):
Name Position
Sue Manager
Bill Secretary
Pat Engineer
and I want to print the Position field for a given Name, e.g. output Secretary
given the input Bill
.
I can currently save as CSV from Excel to get:
$ cat "$D/staff.csv"
Name,Position
Sue,Manager
Bill,Secretary
Pat,Engineer
and then run:
$ awk -F, -v name="Bill" '$1==name{print $2}' "$D/staff.csv"
Secretary
but this is just a small part of a larger task and so I have to be able to do this automatically from a shell script without manually opening Excel to export the CSV file. How do I do that from a Windows PC running cygwin?
The combination of the following VBS and shell scripts create a CSV file for each sheet in the Excel spreadsheet:
.
Now we execute the shell script which internally calls cygstart to run the VBS script to generate the CSV files (one per sheet) in a subdirectory under the same directory where the Excel file exists named based on the Excel file name (e.g. Excel file
staff.xlsx
produces CSVs directorystaff.csvs
):There is only one sheet with the default name
Sheet1
in the target Excel file"$D/staff.xlsx"
so the output of the above is a file"$D/staff.csvs/Sheet1.csv"
:Also see What's the most robust way to efficiently parse CSV using awk? for how to then operate on those CSVs.
See also https://stackoverflow.com/a/58879683/1745001 for how to do the opposite, i.e. call a cygwin bash command from a Windows batch file.