Revert transformation preprocess caret

Posted 2020-06-25 18:57

I transformed my data to meet the assumptions of a linear model (normally distributed variables):

d.reg1 = d.reg %>% preProcess("YeoJohnson") %>% predict(d.reg) 

The fitted model:

fit = lm(log10(Qmld)~log10(Peq750), data = d.reg1) # power-law regression (linear in log-log space)

Predicted data:

a=10^fit$coefficients[1]
b=fit$coefficients[2]

d.reg1$Qmld_predita=a*d.reg1$Peq750^b 

How can I back-transform d.reg1$Qmld_predita to the original scale? The model was fitted to transformed data, so the predicted values have no physical meaning for me.

2 Answers
家丑人穷心不美
#2 · 2020-06-25 19:28

Here is another addition: if you scaled the data to the 0-1 range (method = "range"), you can use this to invert the transformation. Useful for deep learning.

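library(tidyverse)  # dplyr (select, mutate_if) and purrr (map2_df)

# Invert a caret preProcess() transformation.
# range = TRUE undoes 0-1 scaling; otherwise centering and scaling are undone.
# digits = 0 rounds the result to whole numbers; raise it to keep decimals.
revPredict <- function(preproc, data, digits = 0, range = FALSE) {
  if (range) {
    data <- data %>%
      select(one_of(dimnames(preproc$ranges)[[2]])) %>%
      # multiply by (max - min), then add min
      map2_df(preproc$ranges[2, ] - preproc$ranges[1, ], ., function(min_max, dat) min_max * dat) %>%
      map2_df(preproc$ranges[1, ], ., function(min, dat) min + dat) %>%
      mutate_if(is.numeric, ~ round(., digits = digits))
    return(data)
  }
  data <- data %>%
    select(one_of(names(preproc$mean))) %>%
    # multiply by the standard deviation, then add the mean
    map2_df(preproc$std, ., function(sig, dat) dat * sig) %>%
    map2_df(preproc$mean, ., function(mu, dat) dat + mu) %>%
    mutate_if(is.numeric, ~ round(., digits = digits))
  return(data)
}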
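
The arithmetic behind the range branch can be checked without caret. The following is a minimal base R sketch, assuming the ranges matrix holds column minima in row 1 and maxima in row 2 (the layout the function above relies on for preproc$ranges). The toy data frame and all variable names are made up for illustration.

```r
# Toy data (hypothetical values, for illustration only)
df <- data.frame(a = c(2, 5, 11), b = c(-1, 0, 3))

# 2 x p matrix: row 1 = column minima, row 2 = column maxima
ranges <- sapply(df, range)
span <- ranges[2, ] - ranges[1, ]

# Forward: (x - min) / (max - min), column by column
scaled <- sweep(sweep(as.matrix(df), 2, ranges[1, ], "-"), 2, span, "/")

# Inverse: x_scaled * (max - min) + min
restored <- sweep(sweep(scaled, 2, span, "*"), 2, ranges[1, ], "+")

# The round trip should reproduce the original values
stopifnot(isTRUE(all.equal(restored, as.matrix(df))))
```

Note that no rounding is applied here; with the default digits = 0, the function above rounds the restored values to whole numbers.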
够拽才男人
#3 · 2020-06-25 19:47

Here's a template for a function that can be adapted to whichever transformations were chosen initially (here, the initial transformations were c("scale", "center")).

library(tidyverse)

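# Reverse a c("scale", "center") preProcess() transformation.
# digits is only used if the rounding step described below is added.
revPredict <- function(preproc, data, digits=0) {
  data %>%
    select(one_of(preproc$mean %>% names)) %>%
    map2_df(preproc$std, ., function(sig, dat) dat * sig) %>%  # undo scaling
    map2_df(preproc$mean, ., function(mu, dat) dat + mu)       # undo centering
}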

revPredict(preprocess_params, df_needing_reverse_transformation)
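
To see why multiplying by the standard deviation and then adding the mean undoes the transformation, here is a small base R round trip. It is only a sketch: the data frame is invented, and mu and sig stand in for what the function above reads from preproc$mean and preproc$std.

```r
# Toy data (hypothetical values, for illustration only)
df <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 40, 80))

mu  <- sapply(df, mean)  # per-column means
sig <- sapply(df, sd)    # per-column standard deviations

# Forward: z = (x - mean) / sd
z <- sweep(sweep(as.matrix(df), 2, mu, "-"), 2, sig, "/")

# Inverse, in the same order as revPredict: multiply by sd, then add mean
restored <- sweep(sweep(z, 2, sig, "*"), 2, mu, "+")

# The round trip should reproduce the original values
stopifnot(isTRUE(all.equal(restored, as.matrix(df))))
```

The order matters: adding the mean first and multiplying afterwards would not recover the original values.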

Since it's been more than 6 months since the question was asked, I assume you've found a way around this, but the answer may still be of interest, since a similar question has been asked here as well.


To round values, pipe the output of the second map2_df to this:

    mutate_if(is.numeric, ~ round(., digits = digits))
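
For the "YeoJohnson" method used in the question, the inverse is not a simple rescale, so the function above would need a different body. Below is a rough sketch of the standard Yeo-Johnson formula and its algebraic inverse for a single lambda. The names yj and yj_inverse are hypothetical helpers, not caret functions, and the sketch deliberately does not assume where the preProcess object stores the estimated lambda; inspecting it with str() should show that.

```r
# Forward Yeo-Johnson transform for one lambda (hypothetical helper)
yj <- function(y, lambda, eps = 1e-8) {
  out <- numeric(length(y))
  pos <- y >= 0
  if (abs(lambda) > eps) {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    out[pos] <- log1p(y[pos])
  }
  if (abs(lambda - 2) > eps) {
    out[!pos] <- -((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda)
  } else {
    out[!pos] <- -log1p(-y[!pos])
  }
  out
}

# Inverse transform: solves the formulas above for y (hypothetical helper)
yj_inverse <- function(z, lambda, eps = 1e-8) {
  out <- numeric(length(z))
  pos <- z >= 0  # the transform preserves sign, so this picks the right branch
  if (abs(lambda) > eps) {
    out[pos] <- (lambda * z[pos] + 1)^(1 / lambda) - 1
  } else {
    out[pos] <- expm1(z[pos])
  }
  if (abs(lambda - 2) > eps) {
    out[!pos] <- 1 - (1 - (2 - lambda) * z[!pos])^(1 / (2 - lambda))
  } else {
    out[!pos] <- 1 - exp(-z[!pos])
  }
  out
}

# Round-trip check on made-up values, covering every branch
y <- c(-3, -0.5, 0, 1.2, 40)
for (lambda in c(-1, 0, 0.5, 2, 2.5)) {
  stopifnot(isTRUE(all.equal(yj_inverse(yj(y, lambda), lambda), y)))
}
```

Under these assumptions, predictions made on the transformed scale could be passed through yj_inverse with the lambda estimated for that column (Qmld in the question). For a negative lambda the transformed values are bounded above, so a prediction beyond that bound has no inverse and comes back as NaN.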