This seems rather easy, but it has kept me busy for a while.
I have a dataframe (df) with n columns and a vector with the same number (n) of values.
The values in the vector are thresholds for the observations in the columns of the dataframe. So the question is: how do I tell R to use a different threshold for each column?
I want to keep all the observations in the dataframe which fulfill the threshold for their column (above or below, it doesn't matter in the example). The observations which do not fulfill the threshold criterion should be set to 0.
I don't want a subset of the dataframe.
Can anyone help? Thanks a lot in advance.
Given some example data and thresholds, we can use the mapply() function to work out which observations in each column (in this example) are greater than or equal to the threshold for that column. Using those indices, we can replace the corresponding values with 0.
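As a minimal sketch of the approach described above: the data frame df, the vector thresh, the seed and the helper names orig and hit are made-up illustration values, not taken from the original answer.

```r
# Hypothetical example data: 5 observations in 3 columns
set.seed(1)
df <- data.frame(a = runif(5), b = runif(5), c = runif(5))

# One threshold per column (assumed values, same length as ncol(df))
thresh <- c(0.3, 0.5, 0.7)

# Keep an untouched copy so the result can be compared with the input
orig <- df

# mapply() pairs column i of df with thresh[i] and returns a logical matrix;
# TRUE marks values that are >= the threshold for their column
hit <- mapply(">=", df, thresh)

# Replace the flagged values with 0; the shape of df is unchanged
df[hit] <- 0
df
```

To zero the values that fail the criterion instead, negate the mask with df[!hit] <- 0.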
It is instructive to notice what mapply() returns in this case: a matrix of logical values, and it is those logical values that are used to select the observations that meet the threshold. You can use a different binary operator to the one I used; see ?">" for the various options. When writing the mapply() call, think of it in terms of the left-hand side and right-hand side of the binary operator: the first data argument goes on the left and the second on the right, column by column.
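A short sketch of what that looks like in practice, again on made-up data (df, thresh, gte and lt are illustrative names, not from the original answer):

```r
set.seed(1)
df <- data.frame(a = runif(5), b = runif(5), c = runif(5))
thresh <- c(0.3, 0.5, 0.7)

# Logical matrix with the same dimensions as df
gte <- mapply(">=", df, thresh)
str(gte)

# Swapping the operator: values strictly below the threshold
lt <- mapply("<", df, thresh)

# The first data argument is the left-hand side and the second the
# right-hand side, so column "a" of gte corresponds to df$a >= thresh[1]
identical(unname(gte[, "a"]), df$a >= thresh[1])
```

Any of the comparison operators documented under ?">" can be passed as the first argument in the same way.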
Update: As @DWin has answered the comment about two thresholds, I will update my answer to match.
We can see which elements match both constraints, and the same construct can be used to select those elements that match.
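One way this could look, as a sketch under assumptions: the vectors lower and upper and their values are hypothetical, and combining two mapply() calls with & is one plausible reading of "both constraints", not necessarily the exact code of the answer.

```r
set.seed(1)
df <- data.frame(a = runif(5), b = runif(5), c = runif(5))
lower <- c(0.2, 0.3, 0.4)  # hypothetical lower bounds, one per column
upper <- c(0.8, 0.7, 0.9)  # hypothetical upper bounds, one per column

# TRUE where a value lies strictly between the bounds for its column
both <- mapply(">", df, lower) & mapply("<", df, upper)
both

# Used as an index, the same matrix selects the matching elements
df[both]

# Keep the matching values and set everything else to 0
df2 <- df
df2[!both] <- 0
df2
```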
Not sure how it's going to work with data frames, but the following worked with matrices:
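A minimal sketch of a matrix version, assuming sweep() for the per-column comparison (the answer may well have used a different construction); m, thresh, mask and idx are illustrative names on made-up data.

```r
set.seed(1)
m <- matrix(runif(15), nrow = 5, ncol = 3)
thresh <- c(0.3, 0.5, 0.7)  # assumed thresholds, one per column

# sweep() applies ">=" between each column of m and its threshold
mask <- sweep(m, 2, thresh, ">=")

# Option 1: index with the logical matrix
m1 <- m
m1[mask] <- 0

# Option 2: index with the positions of the matching elements
idx <- which(mask)
m2 <- m
m2[idx] <- 0

# Both options give the same result
identical(m1, m2)
```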
You can get a boolean representation of df under the given condition and then use it to index df and set the values. Alternatively, you can get a vector with the indexes of the matching fields and use it as an index vector to set the values. Hope that helps.
I like Gavin's answer better than mine, but here's a slightly different application of mapply using his data. In light of your second comment: my construction might be more generalizable than Gavin's, for instance to two threshold vectors.
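A sketch of how a function-based mapply() call generalizes to two threshold vectors. The helper zero_outside, the vectors lower and upper, and the data are all hypothetical; this illustrates the idea rather than reproducing the answer's code.

```r
set.seed(1)
df <- data.frame(a = runif(5), b = runif(5), c = runif(5))
lower <- c(0.2, 0.3, 0.4)  # hypothetical lower thresholds
upper <- c(0.8, 0.7, 0.9)  # hypothetical upper thresholds

# Hypothetical helper: set values outside [lo, hi] to 0 for one column
zero_outside <- function(x, lo, hi) {
  x[x < lo | x > hi] <- 0
  x
}

# mapply() passes column i together with lower[i] and upper[i];
# assigning into res[] keeps the data frame structure and column names
res <- df
res[] <- mapply(zero_outside, df, lower, upper, SIMPLIFY = FALSE)
res
```

Because the per-column logic lives in an ordinary function, adding a third vector or a different rule only means changing the helper and passing one more argument to mapply().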