R creating a new variable by matching a variable i

Posted 2020-03-31 08:47

I have one variable that I am trying to pare down to a more manageable number of values. I exported a list of the variable's unique values to a CSV file and assigned them more general names in an adjacent column. E.g.,

EVTYPE  new_category

- x1    x
- x2    x
- x3    x
- x4    x
- y1    y
- y2    y
- y3    y

I then loaded this back into R and am trying to create a new variable, where if old_val = x1 then new_var2 = x, and so on. There are about 1,000 unique values in the old_val variable, so nesting ifelse statements or something similar isn't really possible. Here is some code I am working on but cannot get to work yet, where dataset is the overall dataset and new_data is the dataset with the unique values. (Sorry for the poor formatting; I'm not sure how to do that correctly for the above list.)

ND_row_count <- NROW(new_data)
for (i in 1:ND_row_count){
  if (dataset$EVTYPE==new_data$EVTYPE2[i]) {
    dataset$new_category <- new_data$new_category[i]
    }
}

Tags: r
1 answer
We Are One
#2 · 2020-03-31 09:39

You can use the vectorised function, match, for this.

The following should return (and assign to dataset$new_category) a vector of new categories corresponding to your long vector of original values.

dataset$new_category <- new_data$new_category[match(dataset$EVTYPE, new_data$EVTYPE2)]

Above, match finds, for each element of dataset$EVTYPE, the position of the first matching element of new_data$EVTYPE2. We then use that vector of indices to subset new_data$new_category.
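To see the lookup end to end, here is a minimal sketch on made-up data. The two small data frames are hypothetical stand-ins for your dataset and new_data (only the column names come from the question), and the value "z9" is invented to show what happens when something is missing from the lookup table. The helper names idx and unmatched are also just for illustration.

```r
# Hypothetical lookup table, standing in for new_data read back from the CSV.
new_data <- data.frame(
  EVTYPE2      = c("x1", "x2", "x3", "x4", "y1", "y2", "y3"),
  new_category = c("x", "x", "x", "x", "y", "y", "y"),
  stringsAsFactors = FALSE
)

# Hypothetical main dataset; "z9" deliberately has no row in the lookup table.
dataset <- data.frame(
  EVTYPE = c("y2", "x1", "x4", "z9", "y1", "x1"),
  stringsAsFactors = FALSE
)

# For each EVTYPE, the row number of its first match in new_data$EVTYPE2,
# or NA when there is no match.
idx <- match(dataset$EVTYPE, new_data$EVTYPE2)

# Index the category column by those row numbers; NA positions stay NA.
dataset$new_category <- new_data$new_category[idx]

# Original values that were not found in the lookup table.
unmatched <- unique(dataset$EVTYPE[is.na(idx)])
```

Any value missing from the lookup table ends up as NA in new_category, so checking unmatched is a quick way to spot gaps in the CSV. As for the loop in the question, dataset$EVTYPE == new_data$EVTYPE2[i] produces a whole logical vector while if expects a single TRUE or FALSE, and the assignment inside it overwrites the entire new_category column rather than only the matching rows.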
