I'm using Pandas to load an Excel spreadsheet which contains zip codes (e.g. 32771). The zip codes are stored as 5-digit strings in the spreadsheet. When they are pulled into a DataFrame using the command...
xls = pd.ExcelFile("5-Digit-Zip-Codes.xlsx")
dfz = xls.parse('Zip Codes')
they are converted into numbers. So '00501' becomes 501.
So my questions are, how do I:
a. Load the DataFrame and keep the string type of the zip codes stored in the Excel file?
b. Convert the numbers in the DataFrame into a five digit string e.g. "501" becomes "00501"?
There are many ways to do this.
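Two possible ways to do the padding from question (b) are sketched below. The column name 'zipcode' and the sample values are assumptions made for illustration, and these are not necessarily the two snippets originally posted in this answer.

```python
import pandas as pd

# Hypothetical data: zip codes that were read in as integers.
df = pd.DataFrame({"zipcode": [501, 32771, 2134]})

# Way 1: format each integer with a zero-padded, width-5 format spec.
df["zip_fmt"] = df["zipcode"].map("{:05d}".format)

# Way 2: cast to str, then left-pad with "0" up to a width of 5.
df["zip_pad"] = df["zipcode"].astype(str).str.pad(5, side="left", fillchar="0")

print(df)
```

Both new columns hold strings, so the leading zeros survive later processing as long as the columns are not converted back to numbers.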
You can avoid pandas' type inference with a custom converter, passed through the converters argument of the parse call, e.g. for a column whose header is 'zipcode'.
This is arguably a bug, since the column was originally string encoded; I made an issue for it.
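A minimal sketch of the converter approach, assuming the sheet is named 'Zip Codes' as in the question and the column header is 'zipcode'. The helper names to_zip_string and load_zip_codes are hypothetical. The converter also pads numeric cells, which goes a little beyond simply keeping the string type.

```python
import pandas as pd


def to_zip_string(value):
    """Converter: return a cell value as a 5-character zip code string."""
    # Leave blank cells (NaN/None) untouched.
    if pd.isna(value):
        return value
    # Numeric cells may arrive as floats such as 501.0; drop the ".0".
    if isinstance(value, float) and value.is_integer():
        value = int(value)
    # Strings pass through unchanged; short numbers get leading zeros.
    return str(value).zfill(5)


def load_zip_codes(path, sheet="Zip Codes", column="zipcode"):
    """Parse one sheet, running `column` through the converter above."""
    xls = pd.ExcelFile(path)
    return xls.parse(sheet, converters={column: to_zip_string})
```

With the file from the question this would be called as load_zip_codes("5-Digit-Zip-Codes.xlsx"). If the cells only need to be kept as they are stored, passing str as the converter should be enough.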
As a workaround, you could convert the ints to 0-padded strings of length 5 using Series.str.zfill.
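A small sketch of the Series.str.zfill workaround, using a made-up two-row DataFrame and an assumed column name 'zipcode' rather than the original demo data:

```python
import pandas as pd

# Hypothetical frame in which the zip codes were inferred as integers.
df = pd.DataFrame({"zipcode": [501, 32771]})

# Cast to str first (the .str accessor needs strings), then zero-pad to width 5.
df["zipcode"] = df["zipcode"].astype(str).str.zfill(5)

print(df)
```

Values that already have five digits are left unchanged, because zfill only pads strings shorter than the requested width.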