I get a strange result with data.table v1.9.2:
DT
timestamp
1: 2013-01-01 17:51:00.707
2: 2013-01-01 17:51:59.996
3: 2013-01-01 17:52:00.059
4: 2013-01-01 17:54:23.901
5: 2013-01-01 17:54:23.914
str(DT)
Classes ‘data.table’ and 'data.frame': 5 obs. of 1 variable:
$ timestamp: POSIXct, format: "2013-01-01 17:51:00.707" "2013-01-01 17:51:59.996" "2013-01-01 17:52:00.059" "2013-01-01 17:54:23.901" ...
- attr(*, "sorted")= chr "timestamp"
- attr(*, ".internal.selfref")=<externalptr>
When I apply the duplicated() function, I get the following result:
duplicated(DT)
[1] FALSE FALSE FALSE FALSE TRUE
It is odd that the 5th row is flagged as a duplicate of the 4th, since the two timestamps differ by 13 ms. This also prevents me from joining tables in R. Does it have something to do with the POSIXct type?
Thanks.
Yes, I reproduced your result with v1.9.2.
Update from v1.9.3 from Matt
There was a change to numeric rounding in v1.9.2 which affected the milliseconds of POSIXct values. More info here:
Grouping very small numbers (e.g. 1e-28) and 0.0 in data.table v1.8.10 vs v1.9.2
Large integers in data.table. Grouping results different in 1.9.2 compared to 1.8.10
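To see why millisecond timestamps are affected, it may help to look at the spacing of doubles near these values. The sketch below uses base R only. It assumes, going by the setNumericRounding() documentation, that rounding level 2 discards the last 2 bytes of the significand and level 1 the last byte. The UTC time zone is also an assumption, since the question does not state one.

```r
# Rows 4 and 5 from the question, stored as seconds since the epoch (double).
ts <- as.POSIXct(c("2013-01-01 17:54:23.901",
                   "2013-01-01 17:54:23.914"), tz = "UTC")
x <- as.numeric(ts)

# A double has 52 explicit significand bits, so the spacing between
# neighbouring doubles near x is 2^(exponent - 52).
exponent <- floor(log2(x[1]))
ulp <- 2^(exponent - 52)

# Spacing left after discarding the last 1 or 2 bytes (8 or 16 bits).
res1 <- ulp * 2^8    # assumed to correspond to setNumericRounding(1)
res2 <- ulp * 2^16   # assumed to correspond to setNumericRounding(2)

gap <- diff(x)       # distance between the two timestamps, in seconds

print(c(gap = gap, res1 = res1, res2 = res2))
```

Under these assumptions the 13 ms gap is smaller than the spacing at level 2 but far larger than the spacing at level 1, which would be consistent with the two rows being treated as equal in v1.9.2.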
So, the workaround now available in v1.9.3 is to reduce the rounding with setNumericRounding() (to 1 or 0) before calling duplicated().
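A minimal sketch of this workaround, assuming data.table v1.9.3 or later is installed. The DT below is rebuilt from the values printed in the question, and the UTC time zone is an assumption.

```r
library(data.table)

# Rebuild the question's table (time zone assumed to be UTC).
DT <- data.table(timestamp = as.POSIXct(c(
  "2013-01-01 17:51:00.707",
  "2013-01-01 17:51:59.996",
  "2013-01-01 17:52:00.059",
  "2013-01-01 17:54:23.901",
  "2013-01-01 17:54:23.914"), tz = "UTC"))

# Remember the current setting so it can be restored afterwards.
old <- getNumericRounding()

# Round off only the last byte instead of the last two.
setNumericRounding(1)
dup_after <- duplicated(DT)
print(dup_after)

# Restore whatever rounding level was in effect before.
setNumericRounding(old)
```

With the rounding reduced, timestamps that are 13 ms apart remain distinct, so no row should be flagged as a duplicate.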
Hope you understand why the change was made and agree that we're going in the right direction.
Of course, you shouldn't have to call setNumericRounding(); that's just a workaround. I've filed a new item on the tracker:
#5445 numeric rounding should be 0 or 1 automatically for POSIXct