Appending row to Pandas DataFrame adds 0 column

I'm creating a Pandas DataFrame to store data. Unfortunately, I can't know the number of rows of data that I'll have ahead of time. So my approach has been the following.

First, I declare an empty DataFrame.

df = DataFrame(columns=['col1', 'col2'])

Then, I append a row of missing values.

df = df.append([None] * 2, ignore_index=True)

Finally, I can insert values into this DataFrame one cell at a time. (Why I have to do this one cell at a time is a long story.)

df['col1'][0] = 3.28

This approach works perfectly fine, with the exception that the append statement inserts an additional column to my DataFrame. At the end of the process the output I see when I type df looks like this (with 100 rows of data).

<class 'pandas.core.frame.DataFrame'>
Data columns (total 2 columns):
0            0  non-null values
col1         100  non-null values
col2         100  non-null values

df.head() looks like this.

      0   col1   col2
0  None   3.28      1
1  None      1      0
2  None      1      0
3  None      1      0
4  None      1      1

Any thoughts on what is causing this 0 column to appear in my DataFrame?

Tags: python pandas append dataframe

2 Answers

劳资没心，怎么记你

Reply #2 · 2019-04-11 22:30

The append does not treat your list as a row. A list without labels is converted into a single unnamed column holding two None/NaN elements, and pandas names that column 0 by default.

For the append to work as intended, the column names of the incoming data must match the column names of the existing DataFrame; otherwise new columns are created (by default).

# You need to explicitly name the columns of the incoming argument in the append statement.
# Note: DataFrame.append was removed in pandas 2.0, so this code needs an older pandas.
import numpy as np
from pandas import DataFrame, Series

df = DataFrame(columns=['col1', 'col2'])
print(df.append(Series([None]*2, index=['col1','col2']), ignore_index=True))


# As an aside

df = DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])
dfRowImproper = [1,2,3,4]
# dfRowProper = DataFrame(np.arange(4)+1, columns=['A','B','C','D'])  # will not work: a 1-D array is read as one column of 4 rows, not one row of 4 columns
dfRowProper = DataFrame([np.arange(4)+1], columns=['A','B','C','D'])  # will work


print(df.append(dfRowImproper))  # creates a column named 0 and 4 additional rows that are filled only in that column

print(df.append(dfRowProper))  # works as intended because the column names are consistent

print(df.append(DataFrame(np.random.randn(1,4))))  # adds four new columns (0 to 3) and one additional row


print(df.append(Series(dfRowImproper, index=['A','B','C','D']), ignore_index=True))  # works as intended
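
To see where the 0 column comes from on a current pandas version, here is a minimal sketch that uses pd.concat in place of append (a swapped-in call, since DataFrame.append was removed in pandas 2.0). The names as_frame, mismatched, new_row and matched are made up for illustration. Recent pandas versions may emit a FutureWarning about concatenating empty or all-NA frames.

```python
import pandas as pd

df = pd.DataFrame(columns=['col1', 'col2'])

# A bare list is interpreted as one unnamed column, which pandas labels 0.
as_frame = pd.DataFrame([None] * 2)
print(list(as_frame.columns))  # [0]
print(as_frame.shape)          # (2, 1)

# Combining frames takes the union of the column labels,
# so the result carries col1, col2 and the extra column 0.
mismatched = pd.concat([df, as_frame], ignore_index=True)
print(mismatched)

# A one-row frame whose labels match df's columns adds a row and nothing else.
new_row = pd.DataFrame([[3.28, 1]], columns=df.columns)
matched = pd.concat([df, new_row], ignore_index=True)
print(matched)
```

The same rule applies to the Series case: a Series is lined up against the frame by its index labels, so those labels have to be the frame's column names.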


时光不老，我们不散

Reply #3 · 2019-04-11 22:32

You could use a Series for row insertion:

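import pandas as pd

df = pd.DataFrame(columns=['col1', 'col2'])
# Label the Series with df's column names so the new row lines up with the existing columns
df = df.append(pd.Series([None]*2, index=df.columns), ignore_index=True)
df.loc[0, 'col1'] = 3.28  # .loc avoids chained assignment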

df looks like:

   col1 col2
0  3.28  NaN
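
If append is not available (it was removed in pandas 2.0), one possible alternative is "setting with enlargement": assigning to a row label that does not exist yet creates the row. The sketch below assumes the default integer index, so that len(df) is the next free label. It is one way to get the same result, not necessarily what the answer above had in mind.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(columns=['col1', 'col2'])

# Assigning to a row label that is not in the index yet creates that row.
# With the default integer index, len(df) is the next unused label.
df.loc[len(df)] = [np.nan, np.nan]

# Fill in values one cell at a time; .at addresses a single cell by label.
df.at[0, 'col1'] = 3.28

print(df)
```

If the values do not have to be written cell by cell, collecting the rows in a plain Python list and building the DataFrame once at the end is generally cheaper than growing it row by row.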

