I have a vector consisting of full names, with the last and first name separated by a comma. This is what the first few elements look like:
> head(val.vec)
[1] "Aabye, Edgar" "Aaltonen, Arvo" "Aaltonen, Paavo"
[4] "Aalvik Grimsb, Kari" "Aamodt, Kjetil Andr" "Aamodt, Ragnhild"
I am looking for a way to split them into two separate columns of first and last name. My final intention is to have both of them as part of a bigger data frame.
I tried using the strsplit function like this:
names<-unlist(strsplit(val.vec,','))
But it gave me one long vector instead of two separate sets. I know it is possible to loop over all the elements and place the first and last names in two separate vectors, but that is a little time-consuming considering that there are about 25,000 records.
I saw a few similar questions, but the discussion there was about how to do it in C++ and Java.
If you are into the dplyr way of doing things, have a look at separate from the tidyr package. A call to trimws was added to get rid of the leading whitespace.
Another option:
Which gives:
Should you want the result in a data.frame, you could wrap it in as.data.frame().
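The code for this option is not reproduced above, so the following is only a sketch of how a matrix result could be wrapped in as.data.frame(). It assumes the split is done with strsplit and do.call(rbind, ...), that every element contains exactly one comma, and it uses a three-element sample of val.vec; the column names last and first are made up for illustration.

```r
# Small sample standing in for the real 25,000-element vector
val.vec <- c("Aabye, Edgar", "Aaltonen, Arvo", "Aaltonen, Paavo")

# Split on the comma plus any following whitespace, then stack the
# pieces row by row into an n x 2 character matrix
m <- do.call(rbind, strsplit(val.vec, ",\\s*"))

# Wrap the matrix in as.data.frame() and give the columns readable names
# (the names are an assumption, not taken from the original answer)
names.df <- as.data.frame(m, stringsAsFactors = FALSE)
colnames(names.df) <- c("last", "first")
```

From there the two columns could be attached to a larger data frame with cbind(), provided the row order matches.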
Just encase your function call in a sapply call.
Using the solution you tried, we can coerce it to two columns.
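The code from these answers is not shown here, so what follows is a hedged sketch of both ideas rather than the original code. It assumes each element contains exactly one comma and uses a small sample of val.vec; the names parts and parts2 are chosen for illustration.

```r
# Small sample standing in for the real vector
val.vec <- c("Aabye, Edgar", "Aaltonen, Arvo", "Aaltonen, Paavo")

# sapply() over the split list returns a 2 x n matrix (one column per name),
# so transpose it to get one row per person; trimws() drops the leading space
parts <- t(sapply(strsplit(val.vec, ","), trimws))

# Alternatively keep unlist() as in the question and refill the long vector
# into two columns, reading it row by row
parts2 <- matrix(trimws(unlist(strsplit(val.vec, ","))), ncol = 2, byrow = TRUE)
```

The byrow = TRUE argument matters: the long vector alternates last name, first name, so filling column by column would mix the two.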
Testing it against the (very fast) solution proposed by Richard Scriven we can see yours and his are equivalent:
We can use read.csv to convert the vector into a data.frame with 2 columns.
Or if we are using strsplit, instead of unlisting (which will convert the whole list to a single vector), we can extract the first and second elements in the list separately to create two vectors ('v1' and 'v2').
Yet another option would be sub.
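As a rough sketch of these three options (the original code is not shown here, so the exact arguments, regular expressions, and column names below are assumptions), using a small sample of val.vec and assuming one comma per element:

```r
# Small sample standing in for the real vector
val.vec <- c("Aabye, Edgar", "Aaltonen, Arvo", "Aaltonen, Paavo")

# 1. read.csv: treat each element as one line of comma-separated text;
#    strip.white = TRUE removes the space after the comma
df <- read.csv(text = val.vec, header = FALSE,
               col.names = c("last", "first"),
               strip.white = TRUE, stringsAsFactors = FALSE)

# 2. strsplit: keep the list and pull out the first and second piece
#    of every element instead of unlisting
lst <- strsplit(val.vec, ",\\s*")
v1 <- sapply(lst, `[`, 1)  # last names
v2 <- sapply(lst, `[`, 2)  # first names

# 3. sub: delete everything from the comma onward, or everything up to
#    and including the comma and the whitespace after it
last  <- sub(",.*$", "", val.vec)
first <- sub("^[^,]*,\\s*", "", val.vec)
```

All three are vectorised, so no explicit loop over the records is needed.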