The col output in xlrd printing something with app

2019-07-17 15:05发布

问题:

I am using XLRD to attempt to read from and manipulate string text encapsulated within the cells of my excel document. I am posting my code, as well as the text that is returned when I choose to print a certain column.

import xlrd
data = xlrd.open_workbook('data.xls')
sheetname = data.sheet_names()
employees = data.sheet_by_index(0)

print employees.col(2)

>>>[text:u'employee_first', text:u'\u201cRichard\u201d', text:u'\u201cCatesby\u201d', text:u'\u201cBrian\u201d']

My intention is to create a dict or either reference the excel documents using strings in python. I would like to have a number of my functions in my program manipulate the data locally and then output at a later point (not within the scope of this question) to a second excel file.

How do I get rid of this extra information?

回答1:

If you are only interested in the values of the cells, then you should do:

values = sheet.col_values(colx=2)

instead of:

cells = sheet.col(colx=2)
values = [c.value for c in cells]

because it's more concise and more efficient (Cell objects are constructed on the fly as/when requested).



回答2:

employees.col(2) is a list of xlrd.sheet.Cell instances. To get all the values from the column (instead of the Cell objects), you can use the col_values method:

values = employees.col_values(2)

You could also do this (my original suggestion):

values = [c.value for c in employees.col(2)]

but that is much less efficient than using col_values.

\u201c and \u201d are unicode left and right double quotes, respectively. If you want to get rid of those, you can use, say, the lstrip and rstrip string methods. E.g. something like this:

values = [c.value.lstrip(u'\u201c').rstrip(u'\u201d') for c in employees.col(2)]


标签: python xlrd