I've got a data frame:
V1 V2 V3 V4 V5 V6 V7
a F B C D B A T
b R D C D F A T
c A C C R F A T
In every row, I want to replace the values in columns V3:V7 that match column V2 with the value in column V1. The result should look like this:
V3 V4 V5 V6 V7
a C D F A T
b C R F A T
c A R F A T
How can I do this?
This should work as long as your data are strings and not factors:
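for (i in 3:7) {
  # rows where column i holds the same value as V2
  j <- data[, 2] == data[, i]
  # overwrite those cells with the value from V1
  data[j, i] <- data[j, 1]
}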
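For a self-contained run, here is a sketch that rebuilds the example from the question and applies the loop. The construction of data with stringsAsFactors = FALSE is my assumption about how the data were created, and the as.character conversion is only a precaution in case the columns were read in as factors.

```r
# Rebuild the question's data (row names a, b, c as in the post)
data <- data.frame(
  V1 = c("F", "R", "A"), V2 = c("B", "D", "C"), V3 = c("C", "C", "C"),
  V4 = c("D", "D", "R"), V5 = c("B", "F", "F"), V6 = c("A", "A", "A"),
  V7 = c("T", "T", "T"),
  row.names = c("a", "b", "c"), stringsAsFactors = FALSE
)

# Precaution: make sure every column is character, not factor
data[] <- lapply(data, as.character)

for (i in 3:7) {
  j <- data[, 2] == data[, i]   # rows where column i matches V2
  data[j, i] <- data[j, 1]      # replace those cells with V1
}
data
```

Note that data[] <- ... keeps the data frame's shape and row names while replacing the columns.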
Using a combination of lapply and ifelse, you can do:
mydf[,3:7] <- lapply(mydf[,3:7], function(x) ifelse(x==mydf$V2, mydf$V1, x))
which gives:
> mydf
V1 V2 V3 V4 V5 V6 V7
a F B C D F A T
b R D C R F A T
c A C A R F A T
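To see what happens inside each lapply step, here is a minimal sketch on plain vectors copied from the question's V1, V2 and V5 columns (the name res is only for illustration). The comparison is done element by element, so each row is tested against its own V2 value.

```r
# Columns copied from the question, as character vectors
V1 <- c("F", "R", "A")
V2 <- c("B", "D", "C")
V5 <- c("B", "F", "F")

V5 == V2                        # TRUE FALSE FALSE: only row a matches
res <- ifelse(V5 == V2, V1, V5) # take V1 where the test is TRUE, else keep V5
res                             # "F" "F" "F"
```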
Or:
newdf <- data.frame(sapply(mydf[,3:7], function(x) ifelse(x==mydf$V2, mydf$V1, x)))
which gives:
> newdf
V3 V4 V5 V6 V7
1 C D F A T
2 C R F A T
3 A R F A T
Here is another method using lapply:
df[, 3:7] <- lapply(df[,3:7], function(i) {i[i == df$V2] <- df$V1[i == df$V2]; i})
df
V1 V2 V3 V4 V5 V6 V7
a F B C D F A T
b R D C R F A T
c A C A R F A T
For each variable, matches are substituted using subsetting.
The same method may be used with the replace function:
df[, 3:7] <- lapply(df[,3:7],
function(i) replace(i, i == df$V2, df$V1[i == df$V2]))
As with the solution of @mr-rip, these variables must be stored as character and not factor for this to work.
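As a small illustration of what replace does for a single column, here is a sketch on character vectors copied from the question's V1, V2 and V4 (hit and out are just illustrative names):

```r
# Columns copied from the question, as character vectors
V1 <- c("F", "R", "A")
V2 <- c("B", "D", "C")
V4 <- c("D", "D", "R")

hit <- V4 == V2                  # FALSE TRUE FALSE: only row b matches
out <- replace(V4, hit, V1[hit]) # put V1's value into the matching positions
out                              # "D" "R" "R"
```

replace returns a modified copy, so V4 itself is unchanged until you assign the result back.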
This also works with data.table:
library(data.table)
setDT(df)[, lapply(.SD, function(col) ifelse(col == V2, V1, col))][, V3:V7, with = FALSE]
# V3 V4 V5 V6 V7
# 1: C D F A T
# 2: C R F A T
# 3: A R F A T
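If you would rather update the columns in place than get a new table back, something along these lines should work. This is a sketch, not part of the answer above: dt and cols are illustrative names, and it assumes the columns are character.

```r
library(data.table)

# Example data from the question (data.table does not keep row names)
dt <- data.table(
  V1 = c("F", "R", "A"), V2 = c("B", "D", "C"), V3 = c("C", "C", "C"),
  V4 = c("D", "D", "R"), V5 = c("B", "F", "F"), V6 = c("A", "A", "A"),
  V7 = c("T", "T", "T")
)

cols <- paste0("V", 3:7)  # the columns to modify

# := assigns by reference; .SDcols restricts .SD to V3:V7,
# while V1 and V2 are still visible inside j
dt[, (cols) := lapply(.SD, function(col) ifelse(col == V2, V1, col)),
   .SDcols = cols]
dt
```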