Function to automatically create vector in a large

Posted 2019-08-23 03:13

I have a single data frame with the following structure:

A.Data is a vector of numeric data.

A.Quartile is a vector giving, for each value in A.Data, the quartile that value falls into (Q1, Q2, Q3, Q4).

I used code very similar to the following to compute the quantiles and the Q each value belongs to:

quantile(x <- rnorm(1001))
list2env(setNames(as.list(quantile(x <- rnorm(1001))),paste0("Q",1:5)),.GlobalEnv)

Now (and here is my problem) I have a .csv file that I imported into R, containing more than 400 elements, each with an XYZ.Data vector.

After importing the .csv file into my environment, I would like a function that creates all the XYZ.Quartile vectors in one go, and I don't know how to write it.

The idea is for the function to read every element of the list loaded from the .csv file and create the B.Quartile, C.Quartile, D.Quartile, ... vectors, one for each element of the list.

Can anyone help, please?

Large list with 400 elements

Thank you very much for any comment.

P.S. New code example:

quantile(x <- Orange$circumference)
Orange<- within(Orange, Quartile <- as.integer(cut(Orange$circumference, quantile(Orange$circumference, probs=0:4/4), include.lowest=TRUE)))

1 answer
爱情/是我丢掉的垃圾
Post #2 · 2019-08-23 03:38

Your example data is confusing. It's not clear what the structure of your data is, so I'm just pretending your lists are columns of a matrix/data.frame.

# proper example data
set.seed(1)
dat <- replicate(6, rnorm(20))
colnames(dat) <- LETTERS[1:6]
head(dat)
#              A           B          C           D          E           F
#[1,] -0.6264538  0.91897737 -0.1645236  2.40161776 -0.5686687 -0.62036668
#[2,]  0.1836433  0.78213630 -0.2533617 -0.03924000 -0.1351786  0.04211587
#[3,] -0.8356286  0.07456498  0.6969634  0.68973936  1.1780870 -0.91092165
#[4,]  1.5952808 -1.98935170  0.5566632  0.02800216 -1.5235668  0.15802877
#[5,]  0.3295078  0.61982575 -0.6887557 -0.74327321  0.5939462 -0.65458464
#[6,] -0.8204684 -0.05612874 -0.7074952  0.18879230  0.3329504  1.76728727

# for each column i
qdat <- apply(dat, 2, function(i){
  q <- quantile(i)
  # for each element j in column i
  sapply(i, function(j){
    paste0("Q",1:5)[sum(j > q)+1]
  })
})
head(qdat)
#     A    B    C    D    E    F   
#[1,] "Q2" "Q5" "Q3" "Q5" "Q2" "Q2"
#[2,] "Q3" "Q5" "Q3" "Q3" "Q3" "Q4"
#[3,] "Q2" "Q4" "Q5" "Q5" "Q5" "Q1"
#[4,] "Q5" "Q1" "Q4" "Q3" "Q1" "Q4"
#[5,] "Q3" "Q4" "Q2" "Q2" "Q4" "Q2"
#[6,] "Q2" "Q3" "Q2" "Q4" "Q4" "Q5"
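
Because quantile() returns five break points (0%, 25%, 50%, 75%, 100%), the labels above run from Q1 to Q5, and only the column minimum gets Q1. The inner sapply() loop could probably be replaced by a single vectorized call. The following is only a sketch: label_q is a made-up helper name, and it assumes R >= 3.3.0, where findInterval() has the left.open argument.

```r
# same example data as above
set.seed(1)
dat <- replicate(6, rnorm(20))
colnames(dat) <- LETTERS[1:6]

# label_q is a hypothetical helper name, not part of the code above
label_q <- function(i) {
  q <- quantile(i)
  # with left.open = TRUE, findInterval() returns, for each value,
  # the number of break points strictly below it, i.e. sum(j > q)
  paste0("Q", findInterval(i, q, left.open = TRUE) + 1)
}

qdat2 <- apply(dat, 2, label_q)
head(qdat2)
```

This should give the same character matrix as the nested apply()/sapply() version, without looping over every element.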

EDIT 1: See the following code:

# example data
set.seed(1)
dat <- replicate(3, rnorm(20))
colnames(dat) <- paste0(LETTERS[1:3],".Data")

replacewithQ <- function(x) {
  as.integer(cut(x, 
                 quantile(x, 
                          probs=0:4/4), 
                 include.lowest=TRUE)
  )
}

qdat <- apply(dat, 2, replacewithQ)
colnames(qdat) <- gsub("Data","Quartile",colnames(dat))
newdat <- cbind(dat, qdat)
head(newdat)
#         A.Data      B.Data     C.Data A.Quartile B.Quartile C.Quartile
#[1,] -0.6264538  0.91897737 -0.1645236          1          4          2
#[2,]  0.1836433  0.78213630 -0.2533617          2          4          2
#[3,] -0.8356286  0.07456498  0.6969634          1          3          4
#[4,]  1.5952808 -1.98935170  0.5566632          4          1          3
#[5,]  0.3295078  0.61982575 -0.6887557          2          3          1
#[6,] -0.8204684 -0.05612874 -0.7074952          1          2          1
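
If the imported data really is a list of vectors rather than a matrix, the same function can be mapped over the list with lapply(). This is a minimal sketch under assumptions: lst is a made-up stand-in for the imported list, its elements are assumed to be numeric vectors named like "A.Data", and they are assumed to contain no missing values (quantile() stops on NA unless na.rm = TRUE is passed).

```r
# hypothetical stand-in for the imported data: a named list of numeric vectors
set.seed(1)
lst <- setNames(replicate(3, rnorm(20), simplify = FALSE),
                paste0(LETTERS[1:3], ".Data"))

# same function as above
replacewithQ <- function(x) {
  as.integer(cut(x, quantile(x, probs = 0:4/4), include.lowest = TRUE))
}

# one quartile vector per list element, renamed from *.Data to *.Quartile
qlst <- lapply(lst, replacewithQ)
names(qlst) <- sub("Data$", "Quartile", names(lst))
str(qlst)
```

With 400 elements the call stays the same; c(lst, qlst) would combine the data and quartile vectors into one list if that is more convenient.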