I have the following db scheme.
CREATE TABLE Twix.dbo.Sensors
(Id int PRIMARY KEY NOT NULL IDENTITY(1,1),
Location nvarchar(260) NOT NULL);
CREATE TABLE Twix.dbo.Visitors
(Id int PRIMARY KEY NOT NULL IDENTITY(1,1),
Name nvarchar(260) NOT NULL);
CREATE TABLE Twix.dbo.Visits(Id int PRIMARY KEY NOT NULL IDENTITY(1,1),
SensorId int CONSTRAINT FK__Visits__Sensor__239E FOREIGN KEY REFERENCES Sensors(Id),
VisitorId int CONSTRAINT FK__Visits__Visitors__4DCF FOREIGN KEY REFERENCES Visitors(Id),
InTime datetime NOT NULL,
OutTime datetime NOT NULL);
InTime-OutTime periods in Visits table could be intersected. I.e. parts of real visits. For example,
SELECT VisitorId, InTime, OutTime FROM Visits ORDER BY VisitorId, InTime
1 2011-02-09 15:26:59.173 2011-02-09 15:29:22.097
1 2011-02-09 15:28:40.530 2011-02-09 15:29:52.737
1 2011-02-09 15:30:10.577 2011-02-09 16:17:58.967
1 2011-02-09 19:09:23.057 2011-02-09 19:10:57.493
1 2011-02-09 19:12:16.087 2011-02-09 19:13:27.493
1 2011-02-14 15:08:46.333 2011-02-14 15:26:42.433
2 2011-02-09 17:55:42.390 2011-02-09 18:52:03.780
2 2011-02-09 18:10:56.727 2011-02-09 18:11:57.493
2 2011-02-09 21:47:15.650 2011-02-09 21:48:38.783
2 2011-02-09 21:50:18.337 2011-02-09 21:55:26.777
3 2011-02-09 17:12:58.103 2011-02-09 19:51:59.697
3 2011-02-09 22:27:52.073 2011-02-09 23:03:24.753
3 2011-02-09 23:02:51.177 2011-02-10 09:51:14.890
3 2011-02-14 15:27:42.270 2011-02-14 15:42:31.107
3 2011-02-14 15:43:37.320 2011-02-14 18:45:26.163
4 2011-02-09 21:07:51.030 2011-02-09 21:51:02.880
4 2011-02-09 22:42:52.660 2011-02-09 23:21:13.830
4 2011-02-09 23:23:08.563 2011-02-09 23:35:12.847
4 2011-02-09 23:36:05.120 2011-02-09 23:59:02.813
4 2011-02-10 05:58:44.103 2011-02-10 05:59:55.867
4 2011-02-12 08:29:36.620 2011-02-12 09:51:18.510
4 2011-02-12 13:13:42.650 2011-02-12 14:06:01.473
5 2011-02-10 06:48:52.717 2011-02-10 07:37:04.870
5 2011-02-10 06:50:31.067 2011-02-10 06:52:20.877
5 2011-02-10 06:52:36.273 2011-02-10 06:53:36.523
5 2011-02-10 06:59:11.790 2011-02-10 07:00:34.867
5 2011-02-10 08:36:39.563 2011-02-10 08:46:14.760
5 2011-02-10 12:47:05.567 2011-02-10 12:48:05.860
5 2011-02-10 12:49:19.590 2011-02-10 13:09:27.880
5 2011-02-10 12:49:25.733 2011-02-10 12:59:59.883
5 2011-02-10 12:55:23.460 2011-02-10 12:56:23.507
I need to get "completed visits", i.e. merge all visits for concrete vistor, if visits are intersecting, or time difference between them less than 10 min (i.e. "InTime of visit2" - "OutTime of visit1" < 10 min).
In C# it looks like:
private IEnumerable Merge(IEnumerable visits, uint holeInterval)
{
var vlist = new LinkedList<Visit>(visits.OrderBy(o => o.InTime));
var result = new List<Visit>();
while (vlist.Count > 1)
{
Visit a = vlist.First.Value; vlist.RemoveFirst();
Visit b = vlist.First.Value; vlist.RemoveFirst();
var r = Visit.Merge(a, b, holeInterval); // Merges two visits
if (r != null) { vlist.AddFirst(r); }
else { result.Add(a); vlist.AddFirst(b); }
}
result.Add(vlist.First.Value);
return result;
}
public IEnumerable<Visit> ListCompleteVisits()
{
var result = new List<Visit>();
var queryResult = from visits in this.repository.ListVisits()
group visits by visits.Visitor.Id into vgroup
select Merge(vgroup, this.holeInterval);
foreach (var v in queryResult)
{
result.AddRange(v);
}
return result;
}
I tried, and get the following:
DECLARE @holeInterval int
SET @holeInterval = 10
SELECT t.RowNumber, t.VisitorId, t.InTime, t.OutTime, t.BInMinusAOut,
(SELECT MIN(InTime) FROM Visits AS D
WHERE VisitorId NOT IN (select VisitorId from Visits group by VisitorId having COUNT(*) = 1)
AND D.InTime <= t.InTime and D.VisitorId = t.VisitorId
AND t.RowNumber > /*here should be max rownumber greater than holeinterval*/
) AS MinInTime
FROM
(SELECT
ROW_NUMBER() OVER(ORDER BY VisitorId, InTime ASC) AS RowNumber,
VisitorId, InTime, OutTime,
DATEDIFF(MI,InTime,
(SELECT MIN(InTime) FROM Visits AS B
WHERE B.InTime > A.InTime and A.VisitorId = B. VisitorId
)) AS BInMinusAIn,
DATEDIFF(MI,OutTime,
(SELECT MIN(InTime) FROM Visits AS B
WHERE B.InTime > A.InTime and A.VisitorId = B. VisitorId
))) AS BInMinusAOut
FROM Visits AS A
WHERE VisitorId NOT IN (select VisitorId from Visits group by VisitorId having COUNT(*) = 1)
) t
/*WHERE t.BInMinusAOut > @holeInterval OR t.BInMinusAOut IS NULL*/
ORDER BY VisitorId, InTime
1 1 2011-02-09 15:26:59.173 2011-02-09 15:28:22.097 0
2 1 2011-02-09 15:28:40.530 2011-02-09 15:29:52.737 1
3 1 2011-02-09 15:30:10.577 2011-02-09 16:17:58.967 9
4 1 2011-02-09 16:26:44.810 2011-02-09 16:51:46.423 20
5 1 2011-02-09 17:11:57.633 2011-02-09 17:13:20.680 2
6 1 2011-02-09 17:15:35.727 2011-02-09 17:18:48.493 -2
7 1 2011-02-09 17:16:12.230 2011-02-09 17:42:47.867 3
8 1 2011-02-09 17:45:43.793 2011-02-09 17:52:10.860 3
9 1 2011-02-09 17:55:31.127 2011-02-09 20:13:22.743 -109
10 1 2011-02-09 18:24:00.427 2011-02-09 18:32:12.033 2
11 1 2011-02-09 18:34:15.877 2011-02-09 18:37:19.770 2
12 1 2011-02-09 18:39:46.440 2011-02-09 18:48:16.800 2
13 1 2011-02-09 18:50:59.270 2011-02-09 20:03:47.550 -54
14 1 2011-02-09 19:09:23.057 2011-02-09 19:10:57.493 2
15 1 2011-02-09 19:12:16.087 2011-02-09 19:13:27.493 48
Now I need to "merge" 1st-4th into 2011-02-09 15:26:59.173 - 2011-02-09 16:51:46.423 5th-15h into 2011-02-09 17:11:57.633 - 2011-02-09 20:13:22.743 This means, that I have to take min InTime in rows, which is between current and the last, where MinInTime > @holeInterval, and max OutTime for those range too.
I.e. result:
1 2011-02-09 15:26:59.173 2011-02-09 16:51:46.423
1 2011-02-09 17:11:57.633 2011-02-09 20:13:22.743
Thank's.