Given the data frame df:
x <- c("X1", "X2", "X3", "X4", "X5")
y <- c("00L0", "0", "00012L", "0123L0", "0D0")
df <- data.frame(x, y)
How can I leverage tidyr::separate to put each character of the y strings into a separate column (one column per string position)?
Desired output:
x <- c("X1", "X2", "X3", "X4", "X5")
m1 <- c(0, 0, 0, 0, 0)
m2 <- c(0, NA, 0, 1, "D")
m3 <- c("L", NA, 0, 2, 0)
mN <- c(NA, NA, NA, NA, NA)
df <- data.frame(x, m1, m2, m3, mN)
Where mN could theoretically go up to m100 (100 columns), or higher.
You can split the string in column y into individual characters using strsplit:
Starting with your data frame:
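A minimal sketch of that strsplit step, assuming the question's df is built with stringsAsFactors = FALSE; the name chars and the names() labelling are illustrative choices, not part of the original answer:

```r
# Data frame from the question; stringsAsFactors = FALSE keeps y as character
x <- c("X1", "X2", "X3", "X4", "X5")
y <- c("00L0", "0", "00012L", "0123L0", "0D0")
df <- data.frame(x, y, stringsAsFactors = FALSE)

# Splitting on the empty string yields one element per character
chars <- strsplit(df$y, "")
names(chars) <- df$x  # label each character vector with its x value

chars[["X1"]]   # "0" "0" "L" "0"
lengths(chars)  # number of characters in each string
```

The result is a list of character vectors of unequal length, so it still has to be padded or reshaped before it fits into columns.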
I solved the problem of putting these characters into columns by:
First: Use ddply to split all the strings in column y and put these in separate rows
Second: Use reshape to convert the rows with same x-value into columns
This may be over-complicating the problem, but it is the only way I could get it to work!
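The two steps above might look roughly like this. It is a sketch under assumptions, not the original code: the intermediate names long, pos and char and the final m1..mN renaming are illustrative, and plyr must be installed:

```r
library(plyr)  # provides ddply()

x <- c("X1", "X2", "X3", "X4", "X5")
y <- c("00L0", "0", "00012L", "0123L0", "0D0")
df <- data.frame(x, y, stringsAsFactors = FALSE)

# First: one row per (x, character position)
long <- ddply(df, .(x), function(d) {
  ch <- strsplit(as.character(d$y), "")[[1]]
  data.frame(pos = seq_along(ch), char = ch, stringsAsFactors = FALSE)
})

# Second: rows sharing an x value become columns; positions a string
# does not reach are filled with NA
wide <- reshape(long, idvar = "x", timevar = "pos", direction = "wide")
names(wide) <- sub("^char\\.", "m", names(wide))  # char.1 -> m1, ...
wide
```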
Here is a base R method.
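One way such a base R method could be written is sketched below. It is not necessarily the original code, and the names chars, n, mat and out are illustrative:

```r
x <- c("X1", "X2", "X3", "X4", "X5")
y <- c("00L0", "0", "00012L", "0123L0", "0D0")
df <- data.frame(x, y, stringsAsFactors = FALSE)

chars <- strsplit(as.character(df$y), "")
n <- max(lengths(chars))  # length of the longest string

# Indexing past the end of a vector returns NA, which pads the short rows
mat <- do.call(rbind, lapply(chars, `[`, seq_len(n)))
colnames(mat) <- paste0("m", seq_len(n))

out <- cbind(df["x"], as.data.frame(mat, stringsAsFactors = FALSE))
out
```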
which results in
I used stringsAsFactors = FALSE in the creation of the df:
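A short sketch of that construction, plus a guard for the case where y arrives as a factor (the default in R versions before 4.0.0). The name df_factor is hypothetical and exists only to show the contrast:

```r
x <- c("X1", "X2", "X3", "X4", "X5")
y <- c("00L0", "0", "00012L", "0123L0", "0D0")

# Character column: strsplit() can be applied directly
df <- data.frame(x, y, stringsAsFactors = FALSE)
is.character(df$y)  # TRUE

# Factor column: strsplit() rejects it, so coerce with as.character() first
df_factor <- data.frame(x, y, stringsAsFactors = TRUE)
is.factor(df_factor$y)  # TRUE
chars <- strsplit(as.character(df_factor$y), "")
```

Wrapping the column in as.character() is harmless when it is already character, so it can be left in place either way.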
But if I hadn't, this code would result in an error, as @m0h3n points out (thanks, @m0h3n). Without this alternative data.frame construction, it is necessary to wrap df$y in as.character to coerce the variable from a factor to a character.
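For the tidyr::separate route the question asks about, one possible sketch passes explicit character positions to sep. The column names m1..mN and the helper variable n are illustrative, and the sketch assumes at least one string is longer than one character:

```r
library(tidyr)

x <- c("X1", "X2", "X3", "X4", "X5")
y <- c("00L0", "0", "00012L", "0123L0", "0D0")
df <- data.frame(x, y, stringsAsFactors = FALSE)

n <- max(nchar(df$y))  # number of columns needed

# A numeric sep is read as split positions; it needs n - 1 positions
# to produce n columns
out <- separate(df, y, into = paste0("m", seq_len(n)), sep = seq_len(n - 1))

# Optional post-hoc step: turn the blank cells into NA
out_na <- out
out_na[out_na == ""] <- NA
```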
This works. It fills with blanks rather than NAs, but you can change that post-hoc if you prefer. (fill = 'right' only works when splitting on a character vector, not explicit positions.)
Here is another base R option where we create a delimiter (",") between each character of the 'y' column using gsub and then read it with read.csv.
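A sketch of how the gsub and read.csv combination could look. The lookaround pattern, the col.names argument and the variable names are assumptions rather than the original answer's code, and the approach relies on y containing no commas or quotes:

```r
x <- c("X1", "X2", "X3", "X4", "X5")
y <- c("00L0", "0", "00012L", "0123L0", "0D0")
df <- data.frame(x, y, stringsAsFactors = FALSE)

# Insert a comma at every position that has a character on both sides
with_commas <- gsub("(?<=.)(?=.)", ",", as.character(df$y), perl = TRUE)
with_commas[1]  # "0,0,L,0"

# Read the comma-separated strings back as rows. colClasses keeps every
# field as character, col.names fixes the number of columns, and
# na.strings = "" asks for empty fields to be read as NA.
n <- max(nchar(df$y))
parsed <- read.csv(text = with_commas, header = FALSE,
                   colClasses = "character", na.strings = "",
                   col.names = paste0("m", seq_len(n)))

out <- cbind(df["x"], parsed)
```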
Or use tstrsplit from data.table.
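A minimal sketch of the tstrsplit route, assuming data.table is installed; the names dt and cols are illustrative, and as.data.table is used so the original df is left unmodified:

```r
library(data.table)

x <- c("X1", "X2", "X3", "X4", "X5")
y <- c("00L0", "0", "00012L", "0123L0", "0D0")
df <- data.frame(x, y, stringsAsFactors = FALSE)

dt <- as.data.table(df)
cols <- paste0("m", seq_len(max(nchar(dt$y))))

# tstrsplit() splits like strsplit() and transposes the result into one
# vector per position; shorter strings are padded with NA (fill = NA)
dt[, (cols) := tstrsplit(y, "")]
dt[, y := NULL]  # drop the original column
dt
```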