I have the following indexed DataFrame with named columns and non-contiguous row numbers:
          a         b         c         d
2  0.671399  0.101208 -0.181532  0.241273
3  0.446172 -0.243316  0.051767  1.577318
5  0.614758  0.075793 -0.451460 -0.012493
I would like to add a new column, 'e', to the existing DataFrame without changing anything else in it (i.e., the new column always has the same length as the DataFrame).
0 -0.335485
1 -1.166658
2 -0.385571
dtype: float64
I tried different versions of join, append, and merge, but I did not get the result I wanted, only errors at most. How can I add column e to the above example?
I got the dreaded SettingWithCopyWarning, and it wasn't fixed by using the iloc syntax. My DataFrame was created by read_sql from an ODBC source. Using a suggestion by lowtech above, the following worked for me:
This worked fine to insert the column at the end. I don't know if it is the most efficient approach, but I don't like warning messages. I think there is a better solution, but I can't find it, and I think it depends on some aspect of the index.
Note that this only works once and will raise an error if you try to overwrite an existing column.
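The code for this answer did not survive, so here is a minimal sketch of what the description seems to match, assuming it used DataFrame.insert (an assumption based on "insert the column at the end" and the "only works once" note). The frame reuses the question's values; the name second_insert_failed is hypothetical.

```python
import pandas as pd

# Frame from the question, with its non-contiguous index.
df = pd.DataFrame(
    {
        "a": [0.671399, 0.446172, 0.614758],
        "b": [0.101208, -0.243316, 0.075793],
        "c": [-0.181532, 0.051767, -0.451460],
        "d": [0.241273, 1.577318, -0.012493],
    },
    index=[2, 3, 5],
)

# New values, re-labelled with df's index so they line up row by row.
e = pd.Series([-0.335485, -1.166658, -0.385571], index=df.index)

# Insert 'e' as the last column; insert() modifies df in place.
df.insert(len(df.columns), "e", e)

# Inserting the same column name again raises ValueError
# (unless allow_duplicates=True is passed).
try:
    df.insert(len(df.columns), "e", e)
    second_insert_failed = False
except ValueError:
    second_insert_failed = True
```

Because insert() refuses to overwrite, re-running the same cell in a notebook fails on the second run, which is probably what the note above refers to.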
Note: as above, from 0.16.0 onward assign is the best solution. See the documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.assign.html#pandas.DataFrame.assign It works well for data-flow style code where you don't overwrite your intermediate values.
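A short sketch of the assign approach on the question's data. The name df2 is only illustrative, and passing e.values (rather than the Series itself) is one way to sidestep index alignment, not necessarily what the original answer did.

```python
import pandas as pd

# Frame from the question, with its non-contiguous index.
df1 = pd.DataFrame(
    {
        "a": [0.671399, 0.446172, 0.614758],
        "b": [0.101208, -0.243316, 0.075793],
        "c": [-0.181532, 0.051767, -0.451460],
        "d": [0.241273, 1.577318, -0.012493],
    },
    index=[2, 3, 5],
)

# Series from the question; its default index is 0, 1, 2.
e = pd.Series([-0.335485, -1.166658, -0.385571])

# assign() returns a new DataFrame and leaves df1 untouched.
# e.values is a plain array, so values are placed positionally.
df2 = df1.assign(e=e.values)
```

Since assign returns a new object, it chains naturally with other method calls and never mutates the frame it was called on.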
Use the original df1 indexes to create the series:
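The original snippet is missing here, so the following is a sketch of the idea rather than the answer's exact code. It also shows why the index matters: pandas aligns a Series on index labels when assigning it as a column. The name misaligned is hypothetical.

```python
import pandas as pd

# Frame from the question, with its non-contiguous index.
df1 = pd.DataFrame(
    {
        "a": [0.671399, 0.446172, 0.614758],
        "b": [0.101208, -0.243316, 0.075793],
        "c": [-0.181532, 0.051767, -0.451460],
        "d": [0.241273, 1.577318, -0.012493],
    },
    index=[2, 3, 5],
)

# Series from the question; its default index is 0, 1, 2.
e = pd.Series([-0.335485, -1.166658, -0.385571])

# Assigning the Series directly aligns on labels: only label 2 exists in
# both indexes, so rows 3 and 5 end up as NaN.
misaligned = df1.copy()
misaligned["e"] = e

# Rebuilding the Series with df1's index makes the labels match,
# so every row receives its value in order.
df1["e"] = pd.Series(e.values, index=df1.index)
```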
Edit 2015
Some reported getting the SettingWithCopyWarning with this code. However, the code still runs perfectly with the current pandas version, 0.16.1.
The SettingWithCopyWarning aims to inform of a possibly invalid assignment on a copy of the DataFrame. It doesn't necessarily say you did it wrong (it can trigger false positives), but from 0.13.0 it lets you know there are more adequate methods for the same purpose. Then, if you get the warning, just follow its advice: Try using .loc[row_index,col_indexer] = value instead
In fact, this is currently the more efficient method, as described in the pandas docs.
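As an illustration of the .loc form the warning recommends, here is a minimal sketch on the question's data. The array name values is hypothetical; a plain NumPy array is used so the values are placed by position rather than aligned on index labels.

```python
import numpy as np
import pandas as pd

# Frame from the question, with its non-contiguous index.
df1 = pd.DataFrame(
    {
        "a": [0.671399, 0.446172, 0.614758],
        "b": [0.101208, -0.243316, 0.075793],
        "c": [-0.181532, 0.051767, -0.451460],
        "d": [0.241273, 1.577318, -0.012493],
    },
    index=[2, 3, 5],
)

# New column values as a plain array (no index, so no label alignment).
values = np.array([-0.335485, -1.166658, -0.385571])

# Select all rows and the (new) column 'e' in a single .loc call.
df1.loc[:, "e"] = values
```

If df1 is itself a slice of another frame, taking an explicit copy first (for example with .copy()) is a common way to make the intent unambiguous.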
Edit 2017
As indicated in the comments and by @Alexander, currently the best method to add the values of a Series as a new column of a DataFrame could be using assign:
Easiest ways: