I have a pandas DataFrame which includes a column called "SEGMENT". I want to break the column up into three columns. Please see my desired output highlighted in yellow.
Below is the code I have tried. Unfortunately I can't even get the first replace statement to work. The : does not get replaced by -. Any help is greatly appreciated!
df_stack_ranking['CURRENT_AUM_SEGMENT'] = df_stack_ranking['CURRENT_AUM_SEGMENT'].replace(':', '-')
s = df_stack_ranking['CURRENT_AUM_SEGMENT'].str.split(' ').apply(Series, 1).stack()
s.index = s.index.droplevel(-1)
s.name = 'SEGMENT'
df_stack_ranking.join(s.apply(lambda x: Series(x.split(':'))))
Setup
Solution
Use str.split to break the column into three parts, with expand=True so that the parts come back as a new DataFrame.
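A minimal sketch of the split-and-expand idea. The sample values and the output column names (SEGMENT, LOW, HIGH) are assumptions, since the question's real data is not shown; the sketch assumes each value looks like "label: low - high".

```python
import pandas as pd

# Hypothetical sample values; the question does not show the real data.
df_stack_ranking = pd.DataFrame(
    {"CURRENT_AUM_SEGMENT": ["A: 0 - 100K", "B: 100K - 500K", "C: 500K - 1M"]}
)

# First split: label vs. range, on the first ':' only.
parts = df_stack_ranking["CURRENT_AUM_SEGMENT"].str.split(":", n=1, expand=True)
# Second split: lower vs. upper bound, on the first '-' only.
bounds = parts[1].str.split("-", n=1, expand=True)

# Assemble a new DataFrame; the column names are illustrative.
result = pd.DataFrame(
    {
        "SEGMENT": parts[0].str.strip(),
        "LOW": bounds[0].str.strip(),
        "HIGH": bounds[1].str.strip(),
    }
)
print(result)
```

With expand=True each piece lands in its own column, so no stack/join step is needed.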
Replace the colon with a hyphen, then split on the hyphen to get a list of values for the three columns. Then assign the values to each of the three columns and delete the temporary column.
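The steps above might look roughly like this. Note that Series.replace(':', '-') only replaces cells whose entire value is ':', which is why the first statement in the question appears to do nothing; .str.replace works on substrings. The sample values and the names TEMP, SEGMENT, LOW and HIGH are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample values; the question does not show the real data.
df = pd.DataFrame(
    {"CURRENT_AUM_SEGMENT": ["A: 0 - 100K", "B: 100K - 500K", "C: 500K - 1M"]}
)

# Substring replacement of ':' with '-', then split on '-' into a list per row.
df["TEMP"] = (
    df["CURRENT_AUM_SEGMENT"].str.replace(":", "-", regex=False).str.split("-")
)

# Pull each list element into its own column and trim stray whitespace.
df["SEGMENT"] = df["TEMP"].str[0].str.strip()
df["LOW"] = df["TEMP"].str[1].str.strip()
df["HIGH"] = df["TEMP"].str[2].str.strip()

# Delete the temporary column.
df = df.drop(columns="TEMP")
print(df)
```

This assumes the label itself never contains a hyphen; if it can, splitting on the colon first is safer.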
Use str.split to split by : or (|) \s*-\s* (\s* means zero or more whitespace characters):
Solution with str.extract:
Because I like naming columns from the str.extract regex
Setup
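A sketch of both ideas on a small hypothetical frame: a regex split on : or \s*-\s*, and str.extract with named groups so the regex itself names the columns. The sample values, the group names, and the exact extract pattern are assumptions, not the original answers' code.

```python
import pandas as pd

# Hypothetical sample values; the question does not show the real data.
df = pd.DataFrame(
    {"CURRENT_AUM_SEGMENT": ["A: 0 - 100K", "B: 100K - 500K", "C: 500K - 1M"]}
)

# Regex split: ':' or a hyphen with optional surrounding whitespace.
# A multi-character pattern is treated as a regular expression.
split_cols = df["CURRENT_AUM_SEGMENT"].str.split(r":|\s*-\s*", expand=True)
# The ':' branch does not consume the following space, so strip each column.
split_cols = split_cols.apply(lambda col: col.str.strip())
split_cols.columns = ["SEGMENT", "LOW", "HIGH"]

# Named groups: str.extract uses the group names as column names.
pattern = r"(?P<SEGMENT>[^:]+):\s*(?P<LOW>[^-]+?)\s*-\s*(?P<HIGH>.+)"
extracted = df["CURRENT_AUM_SEGMENT"].str.extract(pattern)

print(split_cols)
print(extracted)
```

On this sample both approaches produce the same three columns; the named-group version avoids a separate renaming step.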
I would do this with str.extract and a regex.
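One possible shape for this, joining the extracted columns back onto the original frame. The pattern, the sample values (including a deliberately malformed row), and the column names are assumptions. Rows that do not match the pattern come back as missing values rather than raising an error.

```python
import pandas as pd

# Hypothetical sample values; the last row deliberately has no delimiters.
df = pd.DataFrame(
    {"CURRENT_AUM_SEGMENT": ["A: 0 - 100K", "B: 100K - 500K", "no delimiter here"]}
)

# Three unnamed groups: label, lower bound, upper bound.
# Whitespace around the delimiters is tolerated and excluded from the groups.
pattern = r"^\s*(.+?)\s*:\s*(.+?)\s*-\s*(.+?)\s*$"
extracted = df["CURRENT_AUM_SEGMENT"].str.extract(pattern)
extracted.columns = ["SEGMENT", "LOW", "HIGH"]  # illustrative names

# Attach the new columns to the original frame by index.
result = df.join(extracted)
print(result)
```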