How can I convert all columns from my Excel file u

2019-09-21 06:52发布

问题:

I want to convert all columns (59 columns) from my excel file to a dataframe, specifying the types. Some columns are a string, others dates, other int and more. I know I can use a converter in a read_excel method. but I have a lot of columns and I don't want write converter={'column1': type1, 'column2': type2, ..., 'column59': type59}

my code is:

import numpy as np
import pandas as pd
import recordlinkage
import xrld

fileName = 'C:/Users/Tito/Desktop/banco ZIKA4.xlsx'
strcols = [0, 5, 31, 36, 37, 38, 39, 40, 41, 45]
datecols = [3, 4, 29, 30, 32, 48, 50, 51, 52, 53, 54, 55]
intcols = [33, 43, 59]
booleancols = [6, ..., 28]
df = pd.read_excel(fileName, sheet_name=0, true_values=['s'], false_values=['n'], converters={strcols: str, intcols: np.int, booleancols: np.bool, datecols: pd.to_datetime})
print(df.iat[1, 31], df.iat[1, 32], df.iat[1, 33])

回答1:

Iiuc your code doesn't work because the converters kwarg doesn't allow lists of several columns as keys to functions.

What you can do is to create dicts instead of lists and provide the concatenated dicts to converters:

strcols = {c: str for c in [0, 5, 31, 36, 37, 38, 39, 40, 41, 45]}
datecols = {c: pd.to_datetime for c in [3, 4, 29, 30, 32, 48, 50, 51, 52, 53, 54, 55]}
intcols = {c: np.int for c in [33, 43, 59]}
booleancols = {c: np.bool for c in range(6, 29)}
conv_fcts = {**strcols, **datecols, **intcols, **booleancols}

df = pd.read_excel(fileName, converters=conv_fcts, sheet_name=0, true_values=['s'], false_values=['n'])