Programmatically extract data from an Excel spreadsheet

Posted 2020-05-27 11:28

Is there a simple way, using some common Unix scripting language (Perl/Python/Ruby) or command line utility, to convert an Excel Spreadsheet file to CSV? Specifically, this one:

http://www.econ.yale.edu/~shiller/data/ie_data.xls

And specifically the third sheet of that spreadsheet (the first two being charts).

10 answers
来,给爷笑一个
#2 · 2020-05-27 11:39

There is a really good Perl library for xls reading: Spreadsheet::ParseExcel.

来,给爷笑一个
#3 · 2020-05-27 11:39

For Ruby, the spreadsheet gem is excellent for reading, writing, and modifying Excel files:

https://github.com/zdavatz/spreadsheet

够拽才男人
#4 · 2020-05-27 11:41

For Python, there are a number of options; see here, here and here. Note that the last option will only work on Windows with Excel installed.
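As one illustration of the Python route, here is a minimal sketch using the third-party xlrd package (`pip install xlrd`). The helper names `rows_to_csv` and `xls_sheet_to_csv` are made up for this example, and the sheet name 'Data' is taken from the pyexcel answer in this thread rather than checked against the file. Selecting the sheet by name rather than by position sidesteps the question of whether the chart sheets count toward the index.

```python
import csv


def rows_to_csv(rows, csv_path):
    """Write an iterable of rows (lists of cell values) to a CSV file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def xls_sheet_to_csv(xls_path, csv_path, sheet_name="Data"):
    """Dump one worksheet of an .xls workbook to CSV.

    Assumes the workbook has a worksheet called `sheet_name`.
    """
    import xlrd  # third-party; imported here so rows_to_csv works without it

    book = xlrd.open_workbook(xls_path)
    sheet = book.sheet_by_name(sheet_name)
    # row_values(i) returns the raw cell values of row i as a list
    rows_to_csv((sheet.row_values(i) for i in range(sheet.nrows)), csv_path)


# Example call (paths are placeholders):
# xls_sheet_to_csv("ie_data.xls", "ie_data.csv")
```

Cell values are written as xlrd returns them, so date cells will probably show up as plain numbers unless converted separately.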

叼着烟拽天下
#5 · 2020-05-27 11:42

I may have found an acceptable answer already:

xls2csv

But I'm interested to hear what other options there are, or about tools in other languages.

戒情不戒烟
#6 · 2020-05-27 11:42

With the pyexcel library, you can do this:

>>> import pyexcel as p
>>> data_sheet=p.get_sheet(file_name='/Users/jaska/Downloads/ie_data.xls', sheet_name='Data')
>>> data_sheet.top_left()
pyexcel sheet:
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
|                                                                                                         |   |   |   |            |   |   |   |   |   |            |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
| Stock Market Data Used in "Irrational Exuberance" Princeton University Press, 2000, 2005, 2015, updated |   |   |   |            |   |   |   |   |   | Cyclically |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
| Robert J. Shiller                                                                                       |   |   |   |            |   |   |   |   |   | Adjusted   |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
|                                                                                                         |   |   |   |            |   |   |   |   |   | Price      |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
|                                                                                                         |   |   |   |   Consumer |   |   |   |   |   | Earnings   |   |   |   |   |   |   |
+---------------------------------------------------------------------------------------------------------+---+---+---+------------+---+---+---+---+---+------------+---+---+---+---+---+---+
>>> data_sheet.save_as('ie_data.csv')

And for it to work, you need to install:

$ pip install pyexcel
$ pip install pyexcel-xls

What's more, you can also install pyexcel-cli and get your CSV data with a single command:

$ pyexcel transcode --sheet-name 'Data' /your/home/Downloads/ie_data.xls ie_data.csv
姐就是有狂的资本
#7 · 2020-05-27 11:47

This is quite late to the game, but I thought I'd add another option via Ruby using the gem "roo":

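    require 'roo'

    # Open the workbook (Roo::Excelx handles .xlsx files)
    my_excel_file = Roo::Excelx.new("path/to/my_excel_file.xlsx")
    # Select the third sheet (the sheets array is zero-indexed)
    my_excel_file.default_sheet = my_excel_file.sheets[2]
    # Write the selected sheet out as CSV
    my_excel_file.to_csv("path/to/my_excel_file.csv")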