Using substring on a column in R

Posted 2020-02-12 17:26

How would I use substring to keep only the first 3 characters of the postal code in this data sheet?

YEAR    PERSON    POSTALCODE   STORE_ID
2012    245345    M2H 2I4       20001319
2012    234324    L6N 3R5       20001319
2012    556464    L6N 4T5       20001319

This is a piece of code I tried, however my data sheet appeared with 0 objects after I added the substring part of the code (I'm guessing I made an extremely dumb mistake):

combined <- merge(df1, df2, by.y="PERSON")
store1  <- combined[combined$STORE_ID == 20001319 && substr(combined$POSTALCODE, 1, 3), ]  

Tags: r substring
1 answer
We Are One
#2 · 2020-02-12 17:56

substr(combined$POSTALCODE, 1, 3) gives you

# [1] "M2H" "L6N" "L6N"

So one possible selection could be

combined[combined$STORE_ID == 20001319 & substr(combined$POSTALCODE, 1, 3) == "M2H", ]

which gives you the subset

#   YEAR PERSON POSTALCODE STORE_ID
# 1 2012 245345    M2H 2I4 20001319
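
As for why the original attempt came back empty, two things look off. First, `&&` is meant for single values, not for whole columns: older versions of R only looked at the first element, and as far as I know R 4.3.0 and later raise an error for longer vectors. Row filters need the element-wise `&`. Second, `substr(...)` on its own returns text, not TRUE/FALSE, so it has to be compared to something. Here is a minimal sketch that rebuilds the three sample rows by hand (the real `combined` comes from your `merge()`); the column name `PREFIX` is made up for illustration.

```r
# Sample rows copied from the question; a stand-in for the merged data.
combined <- data.frame(
  YEAR       = c(2012, 2012, 2012),
  PERSON     = c(245345, 234324, 556464),
  POSTALCODE = c("M2H 2I4", "L6N 3R5", "L6N 4T5"),
  STORE_ID   = c(20001319, 20001319, 20001319),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: the first three characters of each postal code.
combined$PREFIX <- substr(combined$POSTALCODE, 1, 3)

# `&` compares element by element, giving one TRUE/FALSE per row.
keep <- combined$STORE_ID == 20001319 & combined$PREFIX == "M2H"

# Keep the rows where both conditions hold.
store1 <- combined[keep, ]
print(store1)
```

If you only want the shortened code and no filtering on it, the `PREFIX` column is enough and you can filter on `STORE_ID` alone.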