I am trying to column-bind dataframes and am having an issue with pandas concat, as ignore_index=True doesn't seem to work:
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 2, 3, 4])
df2 = pd.DataFrame({'A1': ['A4', 'A5', 'A6', 'A7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D2': ['D4', 'D5', 'D6', 'D7']},
                   index=[5, 6, 7, 3])
df1
#     A   B   D
# 0  A0  B0  D0
# 2  A1  B1  D1
# 3  A2  B2  D2
# 4  A3  B3  D3
df2
#    A1   C  D2
# 5  A4  C4  D4
# 6  A5  C5  D5
# 7  A6  C6  D6
# 3  A7  C7  D7
dfs = [df1, df2]
df = pd.concat(dfs, axis=1, ignore_index=True)
print(df)
and the result is
     0    1    2    3    4    5
0   A0   B0   D0  NaN  NaN  NaN
2   A1   B1   D1  NaN  NaN  NaN
3   A2   B2   D2   A7   C7   D7
4   A3   B3   D3  NaN  NaN  NaN
5  NaN  NaN  NaN   A4   C4   D4
6  NaN  NaN  NaN   A5   C5   D5
7  NaN  NaN  NaN   A6   C6   D6
Even if I reset index using
df1.reset_index()
df2.reset_index()
and then try
pd.concat([df1,df2],axis=1)
it still produces the same result!
The ignore_index option is working in your example; you just need to know that it ignores the labels along the axis of concatenation, which in your case is the columns. (Perhaps a better name would be ignore_labels.) If you want the concatenation to ignore the index labels, then the axis argument has to be set to 0 (the default).
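To make the point above concrete, here is a minimal sketch that reuses the frames from the question and compares the two axes. The variable names `wide` and `tall` are only illustrative and do not come from the original post.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 2, 3, 4])
df2 = pd.DataFrame({'A1': ['A4', 'A5', 'A6', 'A7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D2': ['D4', 'D5', 'D6', 'D7']},
                   index=[5, 6, 7, 3])

# axis=1: the column labels are discarded and renumbered 0..5,
# but rows are still aligned on their index labels (outer join).
wide = pd.concat([df1, df2], axis=1, ignore_index=True)
print(list(wide.columns))  # column labels are now integers
print(wide.shape)          # 7 distinct row labels, 6 columns

# axis=0: the row labels are discarded and renumbered 0..7,
# while the column names are kept and unioned.
tall = pd.concat([df1, df2], axis=0, ignore_index=True)
print(list(tall.index))
print(list(tall.columns))
```

In both calls ignore_index only renumbers the labels along the axis being concatenated; the other axis is still aligned by label, which is where the NaN values in the question come from.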
I agree with the comments: it is always best to post the expected output.
Is this what you are seeking?
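One possible reading, assuming the goal is to place the rows side by side by position rather than by label, is sketched below. Note that reset_index() returns a new DataFrame instead of modifying the original, so the calls in the question had no effect on df1 and df2; the names `left` and `right` are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 2, 3, 4])
df2 = pd.DataFrame({'A1': ['A4', 'A5', 'A6', 'A7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D2': ['D4', 'D5', 'D6', 'D7']},
                   index=[5, 6, 7, 3])

# Keep the returned frames; drop=True discards the old labels
# instead of inserting them as a new 'index' column.
left = df1.reset_index(drop=True)
right = df2.reset_index(drop=True)

# Both frames now share the index 0..3, so rows line up by position.
df = pd.concat([left, right], axis=1)
print(df)
```

With matching 0..3 indexes on both sides, the result has four rows and six columns, with no NaN values.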
Thanks for asking. I had the same issue. For some reason "ignore_index=True" doesn't help in my case. I wanted to keep the index from the first dataset and ignore the second index, and this worked for me
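The code for this answer is not shown, so the following is only a sketch of one way to get that behaviour, assuming both frames have the same number of rows: overwrite the second frame's index with the first one's before concatenating. The name `right` is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 2, 3, 4])
df2 = pd.DataFrame({'A1': ['A4', 'A5', 'A6', 'A7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D2': ['D4', 'D5', 'D6', 'D7']},
                   index=[5, 6, 7, 3])

# Work on a copy so df2 itself is left untouched.
right = df2.copy()
# Assumes len(df1) == len(df2); pandas raises ValueError otherwise.
right.index = df1.index

# Rows are matched by position, and the result keeps df1's labels.
df = pd.concat([df1, right], axis=1)
print(df)
```

The result keeps the row labels 0, 2, 3, 4 from df1, and the original labels of df2 are lost.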
If I understood you correctly, this is what you would like to do.
Which gives:
Actually, I would have expected that
df = pd.concat(dfs,axis=1,ignore_index=True)
gives the same result. This is the excellent explanation from jreback: