I tried using the code presented here to find ALL duplicated elements with dplyr like this:
library(dplyr)
mtcars %>%
  mutate(cyl.dup = cyl[duplicated(cyl) | duplicated(cyl, from.last = TRUE)])
How can I convert the code presented here to find ALL duplicated elements with dplyr? My code above just throws an error. Or even better, is there another function that achieves this more succinctly than the convoluted x[duplicated(x) | duplicated(x, fromLast = TRUE)] approach?
We can find duplicated elements with dplyr as follows.
The original post makes an error in applying the solution from the related answer. Inside mutate, that expression subsets the cyl vector, so the result is generally not the same length as the mtcars data frame, whereas mutate needs a column with one value per row (or a single value). Note also that the base R argument is spelled fromLast, not from.last.
Instead, use filter to return all rows with duplicated values, or mutate with ifelse to create a dummy variable that can be filtered on later:
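The two options just described might look roughly like the sketch below. It uses 'carb' rather than 'cyl' so that the filter actually drops some rows; the helper name is_dup and the column name carb.dup are made up for illustration and are not part of dplyr.

```r
library(dplyr)

# Hypothetical helper: TRUE for every element whose value occurs more than once,
# including the first occurrence (note the spelling fromLast)
is_dup <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

# Option 1: keep only the rows whose carb value appears more than once
dup_rows <- mtcars %>%
  filter(is_dup(carb))

# Option 2: flag those rows with a dummy variable and filter on it later
flagged <- mtcars %>%
  mutate(carb.dup = ifelse(is_dup(carb), 1, 0))

nrow(dup_rows)
table(flagged$carb.dup)
```

Because is_dup() returns a logical vector with one entry per row, both filter and mutate receive something of the right length, which is what the original attempt lacked.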
I guess you could use filter for this purpose:
Small example (note that I added summarize() to show that the resulting data set keeps only 'carb' values that occur more than once. I used 'carb' instead of 'cyl' because 'carb' has unique values whereas 'cyl' does not):
Another solution is to use the janitor package:
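As a hedged sketch of the last two suggestions: the code blocks from these answers are not shown here, so what follows is one plausible reading, not necessarily what the authors wrote. It assumes the grouped filter was meant as group_by() plus filter(n() > 1), and that the janitor function intended is get_dupes(). The janitor part only runs if that package is installed.

```r
library(dplyr)

# Grouped filter: keep only carb groups with more than one row,
# then summarize() to count the rows remaining in each group
dup_counts <- mtcars %>%
  group_by(carb) %>%
  filter(n() > 1) %>%
  summarize(n = n())

print(dup_counts)

# janitor::get_dupes() returns the rows duplicated on the given column(s)
# and adds a dupe_count column (assumes janitor is installed)
if (requireNamespace("janitor", quietly = TRUE)) {
  carb_dupes <- janitor::get_dupes(mtcars, carb)
  print(head(carb_dupes))
}
```

The summarized output should list only 'carb' values whose count exceeds one, which is the check the summarize() step was added for.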