Question:
I want to use the apply function on a data frame, but only apply the function to a subset of the columns (columns 4 to 9 in the code below).
B<- by(wifi,(wifi$Room),FUN=function(y){apply(y, 2, A)})
This applies A to all the columns of y
B<- by(wifi,(wifi$Room),FUN=function(y){apply(y[4:9], 2, A)})
This applies A only to columns 4-9 of y, but the total return of B strips off the first 3 columns... I still want those, I just don't want A applied to them.
wifi[,1:3]+B
also does not do what I expected/wanted.
Answer 1:
Using an example data.frame and an example function (which just adds 1 to every value):
A <- function(x) x + 1
wifi <- data.frame(replicate(9,1:4))
wifi
# X1 X2 X3 X4 X5 X6 X7 X8 X9
#1 1 1 1 1 1 1 1 1 1
#2 2 2 2 2 2 2 2 2 2
#3 3 3 3 3 3 3 3 3 3
#4 4 4 4 4 4 4 4 4 4
data.frame(wifi[1:3], apply(wifi[4:9],2, A) )
#or
cbind(wifi[1:3], apply(wifi[4:9],2, A) )
# X1 X2 X3 X4 X5 X6 X7 X8 X9
#1 1 1 1 2 2 2 2 2 2
#2 2 2 2 3 3 3 3 3 3
#3 3 3 3 4 4 4 4 4 4
#4 4 4 4 5 5 5 5 5 5
Or even:
data.frame(wifi[1:3], lapply(wifi[4:9], A) )
#or
cbind(wifi[1:3], lapply(wifi[4:9], A) )
# X1 X2 X3 X4 X5 X6 X7 X8 X9
#1 1 1 1 2 2 2 2 2 2
#2 2 2 2 3 3 3 3 3 3
#3 3 3 3 4 4 4 4 4 4
#4 4 4 4 5 5 5 5 5 5
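To tie this back to the by() call in the question, the same idea can be applied group by group. The following is only a sketch under assumptions: the Room column, the sample values, and the centring function A are made up for illustration, and split() plus do.call(rbind, ...) stands in for by().

```r
# Hypothetical A(): centre a column on its (per-group) mean
A <- function(x) x - mean(x)

# Hypothetical data: a grouping column plus 8 numeric columns (9 in total)
wifi <- data.frame(Room = c("a", "a", "b", "b"), replicate(8, 1:4))

# Split by Room, modify only columns 4:9 inside each piece, keep the rest
pieces <- lapply(split(wifi, wifi$Room), function(y) {
  y[4:9] <- lapply(y[4:9], A)
  y
})

# Stack the pieces back into a single data frame
B <- do.call(rbind, pieces)
```

Because each piece is returned whole, the first three columns survive untouched in B.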
Answer 2:
lapply is probably a better choice than apply here, as apply first coerces your data.frame to a matrix (via as.matrix), which means all the columns must have the same type. Depending on your context, this could have unintended consequences.
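A small sketch of that coercion, using a made-up data frame (the column names id, x and y are assumptions for illustration only):

```r
# Hypothetical data frame with mixed column types
df <- data.frame(id = c("a", "b", "c"), x = 1:3, y = c(0.5, 1.5, 2.5))

# apply() converts the data frame to a matrix first; because one column is
# non-numeric, every column is seen as character
via_apply <- apply(df, 2, class)

# sapply()/lapply() visit each column as it is, so x and y keep their types
via_lapply <- sapply(df, class)
```

Here via_apply reports "character" for all three columns, while via_lapply still reports "integer" for x and "numeric" for y.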
The pattern is:
df[cols] <- lapply(df[cols], FUN)
The 'cols' vector can be variable names or indices. I prefer to use names whenever possible (it's robust to column reordering). So in your case this might be:
wifi[4:9] <- lapply(wifi[4:9], A)
An example of using column names:
wifi <- data.frame(A=1:4, B=runif(4), C=5:8)
wifi[c("B", "C")] <- lapply(wifi[c("B", "C")], function(x) -1 * x)
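If the goal is "the last n columns" rather than a fixed set of names, the cols vector can be built from the data frame itself. A minimal sketch, assuming the example data and the A function from Answer 1, with n chosen arbitrarily:

```r
A <- function(x) x + 1                 # placeholder function, as in Answer 1
wifi <- data.frame(replicate(9, 1:4))  # columns X1 ... X9

n <- 5                                 # assumed number of trailing columns
cols <- tail(names(wifi), n)           # names of the last n columns

# Same pattern as above: replace only the selected columns
wifi[cols] <- lapply(wifi[cols], A)
```

This keeps working if columns are later added at the front, since the selection is recomputed from names(wifi) each time.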
Answer 3:
I think what you want is mapply. You could apply the function to all columns, and then just drop the columns you don't want. However, if you are applying different functions to different columns, it seems likely what you want is mutate, from the dplyr package.
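For the "different functions for different columns" case, one base R possibility is Map(), which is a thin wrapper around mapply that always returns a list. This is a sketch only: the funs list and the choice of columns X4 and X5 are hypothetical.

```r
wifi <- data.frame(replicate(9, 1:4))  # columns X1 ... X9

# Hypothetical per-column functions, keyed by column name
funs <- list(
  X4 = function(x) x + 1,
  X5 = function(x) x * 2
)

# Map() pairs each function with its column; the result is a named list
# that can be assigned straight back into the matching columns
wifi[names(funs)] <- Map(function(f, col) f(col), funs, wifi[names(funs)])
```

Columns not named in funs are left exactly as they were.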
Answer 4:
As mentioned, you simply want the standard R apply function applied to columns (MARGIN=2):
wifi[,4:9] <- apply(wifi[,4:9], MARGIN=2, FUN=A)
Or, for short:
wifi[,4:9] <- apply(wifi[,4:9], 2, A)
This updates columns 4:9 in place using the A() function. Now, let's assume that na.rm is an argument to A(), which it probably should be. We can pass na.rm=TRUE to remove NA values from the computation like so:
wifi[,4:9] <- apply(wifi[,4:9], MARGIN=2, FUN=A, na.rm=TRUE)
The same is true for any other arguments you want to pass to your custom function.
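As a concrete illustration, here is a sketch with a made-up A() that accepts na.rm; the function body and the sample data are assumptions, not the asker's actual code.

```r
# Hypothetical A(): centre a column on its mean, optionally ignoring NAs
A <- function(x, na.rm = FALSE) x - mean(x, na.rm = na.rm)

# Example data with a missing value in every column
wifi <- data.frame(replicate(9, c(1, 2, NA, 3)))

# Named arguments placed after FUN are forwarded to A() for each column
wifi[, 4:9] <- apply(wifi[, 4:9], MARGIN = 2, FUN = A, na.rm = TRUE)
```

Without na.rm = TRUE, mean() would return NA and every value in columns 4:9 would become NA.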