I have a dataset of 25 variables and over 2 million observations. One of my variables combines several "categories" in a single field, and I want to split it so that each category gets its own column (similar to what split does in Stata). For example:
# Name Age Number Events First
# Karen 24 8 Triathlon/IM,Marathon,10k,5k 0
# Kurt 39 2 Half-Marathon,10k 0
# Leah 18 0 1
And I want it to look like:
# Name Age Number Events_1 Events_2 Events_3 Events_4 First
# Karen 24 8 Triathlon/IM Marathon 10k 5k 0
# Kurt 39 2 Half-Marathon 10k NA NA 0
# Leah 18 0 NA NA NA NA 1
I have looked through Stack Overflow but have not found anything that works (everything I tried gives me an error of some sort). Any suggestions would be greatly appreciated.
Note: It may not be important, but the largest number of categories any one person has is 19, so I would need to create Events_1 through Events_19.
Comment: Previous Stack Overflow answers have suggested the separate function, but it does not seem to work with my dataset. When I run it, the program finishes but nothing is changed: there is no output and no error message. When I tried other suggestions from other threads, I received error messages. However, I finally got it to work by using the cSplit function. Thanks for the help!!!
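For anyone landing here later, this is a minimal sketch of the kind of cSplit call that handles this case. It assumes the splitstackshape package is installed. The data frame name `df` and the toy rows are made up from the example in the question and are not the real dataset.

```r
# Assumption: splitstackshape is installed (install.packages("splitstackshape")).
library(splitstackshape)

# Hypothetical toy data mirroring the example in the question.
df <- data.frame(
  Name   = c("Karen", "Kurt", "Leah"),
  Age    = c(24, 39, 18),
  Number = c(8, 2, 0),
  Events = c("Triathlon/IM,Marathon,10k,5k", "Half-Marathon,10k", ""),
  First  = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# Split the comma-separated Events column into one column per event.
# direction = "wide" is the default; the original Events column is dropped
# and replaced by Events_1, Events_2, ... up to the longest list in the data.
out <- cSplit(df, "Events", sep = ",", direction = "wide")

# The result is returned, not modified in place, so it must be assigned.
print(out)
```

Rows with fewer events than the maximum are padded with NA, and the new columns may not end up in the same position as the original Events column. As for separate from tidyr, it also returns a new data frame rather than changing the input, so one possible reason for "nothing changed" is that the result was never assigned back. That is only a guess, since the original call is not shown.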