In Python 3 and pandas I have a dataframe with a column cpf with codes
candidatos_2014.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 26245 entries, 0 to 1063
Data columns (total 7 columns):
uf 26245 non-null object
cargo 26245 non-null object
nome_completo 26245 non-null object
cpf 26245 non-null object
nome_urna 26245 non-null object
partido_eleicao 26245 non-null object
situacao 26245 non-null object
dtypes: object(7)
memory usage: 1.6+ MB
The codes are numbers like these: "00229379273", "84274662268", "09681949153", "53135636534"...
I saved as CSV
candidatos_2014.to_csv('candidatos_2014.csv')
I use Ubuntu and LibreOffice. But when I opened the file the cpf column does not show the leading zeros:
"229379273", "9681949153"
Please, is there a way to save a CSV that keeps zeros to the left in a column that only has numbers?
Specify dtype as string while reading the csv file as below:
# if you are reading data with leading zeros
candidatos_2014 = pd.read_csv('candidatos_2014.csv', dtype ='str')
or convert data column into string
# if data is generated in python you can convert column into string first
candidatos_2014['cpf'] = candidatos_2014['cpf'].astype('str')
candidatos_2014.to_csv('candidatos_2014.csv')
First, make sure that output in your csv file does not have zeros.
If it does, but you are opening that file in Excel or another spreadsheet, you still sometimes can see values without leading zeros.
In this case, Go to Data menu, then Import form Text.
Excel's import utility will give you options to define each column's data type.
I am sure that it should be similar in other apps.
Hope it helps!
TLDR: you don't have to do anything if your pandas columns are type object
I feel like both answers here, but especially the accepted answer, are confusing. The short answer is that, if the dtype
of your column is object
, then pandas will write it with leading zeros. There's nothing to do.
If like me, you came here because you didn't know that for sure and when you opened the CSV, the leading zeros were gone, then follow Ivan S's advice -- take a look at the file you wrote to verify, but you should see the leading zeros there.
If you do, then both answers give guidance on how to read the data back in preserving leading zeros.
If you don't, the the datatype wasn't correct in pandas when you saved the CSV. Just changing that column using astype
wouldn't restore the zeros. You'd also need to use str.zfill
as described in this SO answer.