I have a data frame in R which looks like:
| RIC | Date | Open |
|--------|---------------------|--------|
| S1A.PA | 2011-06-30 20:00:00 | 23.7 |
| ABC.PA | 2011-07-03 20:00:00 | 24.31 |
| EFG.PA | 2011-07-04 20:00:00 | 24.495 |
| S1A.PA | 2011-07-05 20:00:00 | 24.23 |
I want to know whether there are any duplicates with respect to the combination of RIC and Date. Is there a function for that in R?
If you want to remove duplicate records based on the values of the columns Date and State in a dataset called data.frame:
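A minimal sketch of that idea with base R's `duplicated()`. The question's data has no State column, so this example keys on RIC and Date instead; the data frame name `df` and the extra fifth row are assumptions made purely for illustration, so substitute your own names.

```r
# Hypothetical sample modelled on the question; row 5 repeats the RIC/Date pair of row 1.
df <- data.frame(
  RIC  = c("S1A.PA", "ABC.PA", "EFG.PA", "S1A.PA", "S1A.PA"),
  Date = c("2011-06-30 20:00:00", "2011-07-03 20:00:00", "2011-07-04 20:00:00",
           "2011-07-05 20:00:00", "2011-06-30 20:00:00"),
  Open = c(23.7, 24.31, 24.495, 24.23, 23.9),
  stringsAsFactors = FALSE
)

# duplicated() flags each row whose key columns repeat an earlier row;
# negating it keeps only the first occurrence of every RIC/Date pair.
deduped <- df[!duplicated(df[c("RIC", "Date")]), ]
```

Note that this keeps the first occurrence of each pair and silently drops the later ones, so it is worth checking that the dropped rows really are redundant.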
dplyr is so much nicer for this sort of thing:
(The `.keep_all` argument is optional. If it is not used, only the two deduplicated columns are returned; when it is used, the whole deduplicated data frame is returned.)
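A short sketch of the dplyr approach using `distinct()`, assuming the dplyr package is installed. The data frame name `df` and the repeated fifth row are hypothetical and only serve to show the effect of `.keep_all`.

```r
library(dplyr)

# Hypothetical sample modelled on the question; row 5 repeats the RIC/Date pair of row 1.
df <- data.frame(
  RIC  = c("S1A.PA", "ABC.PA", "EFG.PA", "S1A.PA", "S1A.PA"),
  Date = c("2011-06-30 20:00:00", "2011-07-03 20:00:00", "2011-07-04 20:00:00",
           "2011-07-05 20:00:00", "2011-06-30 20:00:00"),
  Open = c(23.7, 24.31, 24.495, 24.23, 23.9),
  stringsAsFactors = FALSE
)

# Keep the first row for each RIC/Date pair and retain every column.
deduped <- df %>% distinct(RIC, Date, .keep_all = TRUE)

# Without .keep_all, only the two key columns are returned.
keys_only <- df %>% distinct(RIC, Date)
```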
I think what you're looking for is a way to return a data frame of the duplicated rows in the same format as your original data. There is probably a more elegant way to do this, but this works:
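One way this could be done in base R is sketched below. It assumes you want every row that takes part in a duplicated RIC/Date pair, including the first occurrence, which is why `duplicated()` is run from both ends; the names `df`, `key`, and `dups` and the sample data are hypothetical.

```r
# Hypothetical sample modelled on the question; row 5 repeats the RIC/Date pair of row 1.
df <- data.frame(
  RIC  = c("S1A.PA", "ABC.PA", "EFG.PA", "S1A.PA", "S1A.PA"),
  Date = c("2011-06-30 20:00:00", "2011-07-03 20:00:00", "2011-07-04 20:00:00",
           "2011-07-05 20:00:00", "2011-06-30 20:00:00"),
  Open = c(23.7, 24.31, 24.495, 24.23, 23.9),
  stringsAsFactors = FALSE
)

key <- df[c("RIC", "Date")]

# duplicated() on its own skips the first occurrence of each pair,
# so combine a forward scan with a backward scan (fromLast = TRUE).
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Same columns as the original data, restricted to the rows involved in a duplicate.
dups <- df[is_dup, ]
```

If only the later repeats are wanted, dropping the `fromLast = TRUE` term is enough.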
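You can always try simply passing those first two columns to the function `duplicated`, assuming your data frame is called `dat`. For more information, we can consult the help file for the `duplicated` function by typing `?duplicated` at the console. It explains that the function determines which elements of a vector or data frame are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates. So `duplicated` returns a logical vector, which we can then use to extract a subset of `dat`, either by storing the vector first or by skipping the separate assignment step and using it directly in the subsetting call.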
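The steps described in this answer can be sketched as follows. The sample `dat` is hypothetical (the question's four rows plus one repeated RIC/Date pair), and `ind`, `has_dups`, and `dup_rows` are illustrative names.

```r
# Hypothetical sample modelled on the question; row 5 repeats the RIC/Date pair of row 1.
dat <- data.frame(
  RIC  = c("S1A.PA", "ABC.PA", "EFG.PA", "S1A.PA", "S1A.PA"),
  Date = c("2011-06-30 20:00:00", "2011-07-03 20:00:00", "2011-07-04 20:00:00",
           "2011-07-05 20:00:00", "2011-06-30 20:00:00"),
  Open = c(23.7, 24.31, 24.495, 24.23, 23.9),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE where the first two columns repeat an earlier row.
ind <- duplicated(dat[, 1:2])

# Answers the original yes/no question: is any RIC/Date pair repeated?
has_dups <- any(ind)

# Extract the repeated rows using the stored vector ...
dup_rows <- dat[ind, ]

# ... or skip the separate assignment and subset in one step.
dup_rows_direct <- dat[duplicated(dat[, 1:2]), ]
```

Selecting the columns by name, for example `dat[, c("RIC", "Date")]`, is a little more robust than `1:2` if the column order might change.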