Subset row in dataframe by values

Posted 2019-08-14 12:26

 STEST                           TEST  CRESULTC CRESULTS CUNIT SIRESULC SIRESULT SIUNIT VWEEK TYPE    WKSLAB SILO SIHI CNLO
1   TALT                     ALT (SGPT)               85.0  IU/L                85   IU/L    -1    1 -2.142857    0   55  0.0
2   TAST                     AST (SGOT)               74.0  IU/L                74   IU/L    -1    1 -2.142857    0   40  0.0
3   TALB                        Albumin                4.3  g/dL                43    g/L    -1    1 -2.142857   36   48  3.6
4   TALP           Alkaline Phosphatase               45.0  IU/L                45   IU/L    -1    1 -2.142857   25  160 25.0
5   AMMB               Ammonium Biurate None Seen      NaN       NoneSeen      NaN           -1    1 -2.142857  NaN  NaN  NaN
6 AMURPH Amorphous Urates or Phosphates None Seen      NaN       NoneSeen      NaN           -1    1 -2.142857  NaN  NaN  NaN

Let's say that I have this dataframe, and it's named labs. I want to subset it by multiple row values. For example, I need to extract only the rows where the value of TEST is equal to Albumin or Ammonium Biurate.

D1 = subset(labs, labs$TEST == 'Albumin' & labs$TEST == 'Ammonium Biurate')

Yet after running this code, I get a dataframe with 0 objects? How do I subset by multiple row conditions in R properly?

D1 = subset(labs, labs$TEST == 'Ammonium Biurate' | labs$TEST == 'Albumin')

D1 = subset(labs, labs$TEST %in% c('Ammonium Biurate', 'Albumin'))

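Edit: Thanks for the suggestion to use %in%. Comparing TEST to a vector with == works element by element, recycling the vector along the column, so it only finds values that happen to line up with the vector's order; %in% instead checks each value for membership in the vector.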
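To make the difference between == and %in% concrete, here is a minimal sketch. The `test` vector is a hypothetical stand-in for the TEST column, built from the six values shown in the table above; `wanted` is a name introduced only for illustration.

```r
# Hypothetical stand-in for labs$TEST, using the six values from the table
test <- c("ALT (SGPT)", "AST (SGOT)", "Albumin", "Alkaline Phosphatase",
          "Ammonium Biurate", "Amorphous Urates or Phosphates")

wanted <- c("Ammonium Biurate", "Albumin")

# == recycles `wanted` along `test` and compares position by position,
# so a value only matches if it lines up with the recycled vector
eq_hits <- test == wanted

# %in% asks, for every element of `test`, whether it occurs anywhere in `wanted`
in_hits <- test %in% wanted

which(eq_hits)  # 5
which(in_hits)  # 3 5
```

With this ordering, == finds only Ammonium Biurate and silently misses Albumin, which is why %in% is the safer choice when matching against a set of values.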

Tags: r
1 answer
老娘就宠你
#2 · 2019-08-14 12:50

As akrun alluded to above, your subset statement does not match the criterion you mention.

Instead of writing

D1 = subset(labs, labs$TEST == 'Albumin' & labs$TEST == 'Ammonium Biurate')

write

D1 = subset(labs, labs$TEST == 'Albumin' | labs$TEST == 'Ammonium Biurate')

Your version is a logical AND, which is never true in your case, since the TEST value of a single row is never both at the same time. A logical OR is closer to what you were looking for, i.e. a row is kept if its value is either Albumin or Ammonium Biurate.
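As a rough illustration of the point above, the following sketch rebuilds a cut-down version of the data frame (only the STEST and TEST columns, which is an assumption made to keep the example short) and compares the three filters. The names `wanted`, `and_rows`, `or_rows` and `in_rows` are introduced here for illustration only.

```r
# Hypothetical miniature of `labs`, keeping only the columns needed here
labs <- data.frame(
  STEST = c("TALT", "TAST", "TALB", "TALP", "AMMB", "AMURPH"),
  TEST  = c("ALT (SGPT)", "AST (SGOT)", "Albumin", "Alkaline Phosphatase",
            "Ammonium Biurate", "Amorphous Urates or Phosphates"),
  stringsAsFactors = FALSE
)

wanted <- c("Albumin", "Ammonium Biurate")

# AND: a single row would have to hold both values at once, so nothing matches
and_rows <- subset(labs, TEST == "Albumin" & TEST == "Ammonium Biurate")

# OR: a row matches if it holds either value
or_rows <- subset(labs, TEST == "Albumin" | TEST == "Ammonium Biurate")

# %in%: same result as the OR, and easier to extend to more values
in_rows <- subset(labs, TEST %in% wanted)

nrow(and_rows)  # 0
nrow(or_rows)   # 2
nrow(in_rows)   # 2
```

Inside subset() the column can be referred to as TEST directly; writing labs$TEST, as in the question, also works.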

PS: Try to provide an easily reproducible example next time. That makes it simpler to test an idea on your problem right away.
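One common way to share a reproducible example in R is to paste the output of dput() for a small slice of the data. The sketch below assumes a hypothetical two-row, two-column extract called `labs_small`; the real data frame has more columns.

```r
# Hypothetical small extract of the data, just enough to show the problem
labs_small <- data.frame(
  STEST = c("TALB", "AMMB"),
  TEST  = c("Albumin", "Ammonium Biurate"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; paste that text into the question
dput(labs_small)
```

Anyone reading the question can then assign the pasted text to a variable and run the subset() call against the same data.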
