I have a data frame with 10 rows and 3 columns:
    a   b c
1   1 201 1
2   2 202 1
3   3 203 1
4   4 204 1
5   5 205 4
6   6 206 5
7   7 207 4
8   8 208 4
9   9 209 8
10 10 210 5
I want to delete all rows where the value in column "c" is repeated fewer than 3 times. In this example, I want to remove rows 6, 9 and 10. (My real data.frame has 5000 rows and 25 columns.) I tried to do it using the function rle, but I keep getting the wrong result. Any help? Thanks!
Building on Joshua's answer:
Here is a solution using ave, or using ave with subset.
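The two ave variants might look roughly like the sketch below. The data frame name df and the way it is built are assumptions made for illustration; only the column values come from the question.

```r
# Example data from the question (the name df is an assumption).
df <- data.frame(a = 1:10, b = 201:210, c = c(1, 1, 1, 1, 4, 5, 4, 4, 8, 5))

# ave() applies FUN within each group and returns one value per row,
# so with FUN = length every row gets the size of its own group in column c.
n_per_row <- ave(df$c, df$c, FUN = length)

# Variant 1: plain logical indexing.
kept_index <- df[n_per_row >= 3, ]

# Variant 2: the same test inside subset(); here `c` refers to the column c.
kept_subset <- subset(df, ave(c, c, FUN = length) >= 3)

print(kept_index)  # keeps rows 1, 2, 3, 4, 5, 7, 8
```

Both variants keep the original row order and row names, which makes it easy to check that rows 6, 9 and 10 are the ones dropped.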
Correct me if I'm wrong, but it seems like you want all the rows where the value in column c occurs more than twice. "Repeated" makes me think that they need to occur consecutively, which is what rle is for, but you would only want rows 1-4 if that was what you were trying to do.

That said, the code below finds the rows where the value in column c occurs more than 2 times. I'm sure this can be done more elegantly, but it works.
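A minimal sketch of that distinction follows, assuming the data frame is called df (a name not given in the question). It is one possible way to do the counting, not necessarily the code originally posted here.

```r
# Example data from the question (the name df is an assumption).
df <- data.frame(a = 1:10, b = 201:210, c = c(1, 1, 1, 1, 4, 5, 4, 4, 8, 5))

# Total occurrences of each distinct value in column c, wherever they appear.
counts <- table(df$c)
frequent <- names(counts)[counts > 2]          # values seen more than twice
kept <- df[as.character(df$c) %in% frequent, ]

# For contrast: rle() only measures consecutive runs.
runs <- rle(df$c)
run_len_per_row <- rep(runs$lengths, runs$lengths)  # run length for each row
consecutive <- df[run_len_per_row > 2, ]

print(kept)         # rows 1, 2, 3, 4, 5, 7, 8
print(consecutive)  # rows 1, 2, 3, 4
```

The value 4 appears three times in total but never three times in a row, which is why the rle-based filter drops rows 5, 7 and 8 while the count-based one keeps them.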
Using unsplit is probably the easiest way to project a grouped aggregate (in this case using table to get counts, but see tapply for the general case) out to the original data.
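As a sketch of that idea, assuming the data frame is named df (the name is not given in the question), unsplit can spread the per-value counts back over the rows:

```r
# Example data from the question (the name df is an assumption).
df <- data.frame(a = 1:10, b = 201:210, c = c(1, 1, 1, 1, 4, 5, 4, 4, 8, 5))

# table() yields one count per distinct value of c; unsplit() then repeats
# each count at the positions of the rows that belong to that value.
n_table <- unsplit(table(df$c), df$c)

# General case: any per-group summary computed with tapply() can be
# projected back onto the rows in the same way.
n_tapply <- unsplit(tapply(df$c, df$c, length), df$c)

kept <- df[n_table >= 3, ]
print(kept)  # keeps rows 1, 2, 3, 4, 5, 7, 8
```

This relies on table and tapply ordering their groups the same way split orders the levels of df$c, which holds here because both derive the groups from the same vector.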
Equivalently and more similar to Erik's:
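One way such an equivalent form could be written is to look up each row's count in the table by name. This is only a sketch, and the data frame name df is an assumption:

```r
# Example data from the question (the name df is an assumption).
df <- data.frame(a = 1:10, b = 201:210, c = c(1, 1, 1, 1, 4, 5, 4, 4, 8, 5))

counts <- table(df$c)

# Index the table by the character form of each row's value, so every row
# picks up the count for its own value of c; as.vector() drops the names.
n_per_row <- as.vector(counts[as.character(df$c)])

kept <- df[n_per_row >= 3, ]
print(kept)  # keeps rows 1, 2, 3, 4, 5, 7, 8
```

The lookup by name avoids depending on the order of the groups, at the cost of converting the values to character first.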