I have a pandas data frame like this:
  Column1  Column2  Column3  Column4  Column5
0       a        1        2        3        4
1       a        3        4        5
2       b        6        7        8
3       c        7        7
What I want now is a new dataframe containing Column1 and a new column, ColumnA. ColumnA should contain all values from Column2 to Column n (where n is the number of columns from Column2 to the end of the row), like this:
  Column1  ColumnA
0       a  1,2,3,4
1       a    3,4,5
2       b    6,7,8
3       c      7,7
How could I best approach this issue? Any advice would be helpful. Thanks in advance!
I propose using .assign. It's simple, maybe long, but it worked for me.
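The code for this answer did not survive, so the following is only a guess at what an .assign-based version might look like. It assumes the blank cells in the question are NaN, that missing values only ever trail at the end of a row, and that all values are whole numbers. The helper as_text is a hypothetical name introduced for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame modelled on the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Column1": ["a", "a", "b", "c"],
    "Column2": [1, 3, 6, 7],
    "Column3": [2, 4, 7, 7],
    "Column4": [3, 5, 8, np.nan],
    "Column5": [4, np.nan, np.nan, np.nan],
})

def as_text(col):
    # Hypothetical helper: NaN becomes "", whole numbers such as 3.0 become "3".
    return col.map(lambda v: "" if pd.isna(v) else str(int(v)))

# Spell out every column explicitly (the "long" part), then strip the
# trailing commas left behind by the empty strings of missing cells.
result = df.assign(
    ColumnA=(
        as_text(df["Column2"]) + "," + as_text(df["Column3"]) + ","
        + as_text(df["Column4"]) + "," + as_text(df["Column5"])
    ).str.rstrip(",")
)[["Column1", "ColumnA"]]

print(result)
```

Listing each column by hand is what makes this approach long; it stays readable for a handful of columns but does not scale to wide frames.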
You can call apply and pass axis=1 to apply row-wise, then convert the dtype to str and join. Here I call dropna to get rid of the NaN; however, we need to cast back to int so we don't end up with floats as str.

If you have a lot of columns, say 1000 columns in the dataframe, and you want to merge a few columns based on a particular column name, e.g. Column2 in the question, plus an arbitrary number of columns after that column (here the 3 columns after Column2, plus Column2 itself, as the OP asked), we can get the position of the column using .get_loc(), as answered here. To remove NaN, use .dropna() or .fillna().
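The two ideas above, locating the start column with .get_loc() and joining row-wise with apply after dropna, can be combined in one rough sketch. The original snippets are not preserved, so this is an illustration rather than the answerers' exact code: the function name join_from and its parameter n_after are hypothetical, blank cells are assumed to be NaN, and all values are assumed to be whole numbers so the cast to int is safe.

```python
import numpy as np
import pandas as pd

# Sample frame modelled on the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Column1": ["a", "a", "b", "c"],
    "Column2": [1, 3, 6, 7],
    "Column3": [2, 4, 7, 7],
    "Column4": [3, 5, 8, np.nan],
    "Column5": [4, np.nan, np.nan, np.nan],
})

def join_from(frame, start_col, n_after):
    """Join start_col and the n_after columns after it, row by row."""
    # Integer position of the named column (labels assumed unique).
    start = frame.columns.get_loc(start_col)
    block = frame.iloc[:, start:start + n_after + 1]
    # Per row: drop NaN, cast back to int so 3.0 prints as "3", then join.
    return block.apply(
        lambda row: ",".join(row.dropna().astype(int).astype(str)),
        axis=1,
    )

# Column2 plus the 3 columns that follow it.
result = df[["Column1"]].assign(ColumnA=join_from(df, "Column2", 3))

print(result)
```

Selecting by position keeps the call short even when the frame has hundreds of columns, since only the starting name and a count need to be given.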
Hope it helps!