How to save a CSV from dataframe, to keep zeros le

2020-07-10 11:31发布

问题:

In Python 3 and pandas I have a dataframe with a column cpf with codes

candidatos_2014.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 26245 entries, 0 to 1063
Data columns (total 7 columns):
uf                 26245 non-null object
cargo              26245 non-null object
nome_completo      26245 non-null object
cpf                26245 non-null object
nome_urna          26245 non-null object
partido_eleicao    26245 non-null object
situacao           26245 non-null object
dtypes: object(7)
memory usage: 1.6+ MB

The codes are numbers like these: "00229379273", "84274662268", "09681949153", "53135636534"...

I saved as CSV

candidatos_2014.to_csv('candidatos_2014.csv')

I use Ubuntu and LibreOffice. But when I opened the file the cpf column does not show the leading zeros:

"229379273", "9681949153"

Please, is there a way to save a CSV that keeps zeros to the left in a column that only has numbers?

回答1:

Specify dtype as string while reading the csv file as below:

# if you are reading data with leading zeros
candidatos_2014 = pd.read_csv('candidatos_2014.csv', dtype ='str')

or convert data column into string

# if data is generated in python you can convert column into string first
candidatos_2014['cpf'] = candidatos_2014['cpf'].astype('str')
candidatos_2014.to_csv('candidatos_2014.csv')


回答2:

First, make sure that output in your csv file does not have zeros. If it does, but you are opening that file in Excel or another spreadsheet, you still sometimes can see values without leading zeros. In this case, Go to Data menu, then Import form Text. Excel's import utility will give you options to define each column's data type.

I am sure that it should be similar in other apps.

Hope it helps!



回答3:

TLDR: you don't have to do anything if your pandas columns are type object

I feel like both answers here, but especially the accepted answer, are confusing. The short answer is that, if the dtype of your column is object, then pandas will write it with leading zeros. There's nothing to do.

If like me, you came here because you didn't know that for sure and when you opened the CSV, the leading zeros were gone, then follow Ivan S's advice -- take a look at the file you wrote to verify, but you should see the leading zeros there.

If you do, then both answers give guidance on how to read the data back in preserving leading zeros.

If you don't, the the datatype wasn't correct in pandas when you saved the CSV. Just changing that column using astype wouldn't restore the zeros. You'd also need to use str.zfill as described in this SO answer.