Question:
I am looking for a library or sample code in VB.NET or C# that implements DBSCAN for density-based clustering. I have GPS data and want to find stay points using the DBSCAN algorithm, but I do not understand much of the technical side of the algorithm.
Answer 1:
I'm not sure this is what you're looking for, because the algorithm is very well explained on Wikipedia. Do you want an explanation of the algorithm, or a translation of it into C# (or a good library)?
You can also have a look at general clustering algorithms.
Algorithm
Let's say you have chosen epsilon, and the number of points needed to start a cluster (MinPts) is 4.
You need to define a distance function, a DBSCAN function and an expand-cluster function.
From Wikipedia:
DBSCAN(D, eps, MinPts)
   C = 0
   for each unvisited point P in dataset D
      mark P as visited
      N = getNeighbors(P, eps)
      if sizeof(N) < MinPts
         mark P as NOISE
      else
         C = next cluster
         expandCluster(P, N, C, eps, MinPts)

expandCluster(P, N, C, eps, MinPts)
   add P to cluster C
   for each point P' in N
      if P' is not visited
         mark P' as visited
         N' = getNeighbors(P', eps)
         if sizeof(N') >= MinPts
            N = N joined with N'
      if P' is not yet member of any cluster
         add P' to cluster C
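The pseudocode above can be sketched in runnable form roughly as follows. This is a minimal illustration in Python, not a tuned library implementation, and the same structure should carry over to C# or VB.NET. The names `dbscan`, `get_neighbors`, `euclidean`, `NOISE` and `UNVISITED` are my own choices, and the sample points are made up. It folds "visited" and "cluster membership" into a single label per point, which is a common variant of the pseudocode rather than a literal transcription. The neighbor search is a brute-force scan, so it is only suitable for small data sets.

```python
from math import hypot

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points that have not been processed yet


def get_neighbors(points, i, eps, dist):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts, dist):
    """Return one label per point: 0, 1, 2, ... for clusters, NOISE otherwise."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = get_neighbors(points, i, eps, dist)
        if len(neighbors) < min_pts:
            labels[i] = NOISE           # may be relabeled later as a border point
            continue
        cluster += 1                    # points[i] is a core point: new cluster
        labels[i] = cluster
        queue = list(neighbors)         # this loop plays the role of expandCluster
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster     # former noise becomes a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = get_neighbors(points, j, eps, dist)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels


def euclidean(a, b):
    """Plain 2-D distance; for GPS data a geographic distance would be used."""
    return hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical sample: two tight squares of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=4, dist=euclidean)
# labels == [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Passing the distance function as a parameter keeps the clustering logic independent of whether the coordinates are planar or latitude/longitude pairs.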
You have a list of points.
First, select a point at random.
Check whether at least 4 points lie within epsilon of it (epsilon is the radius of the circles). If so, start a cluster (green); otherwise mark the point as noise (red). This is the DBSCAN function, which runs for each unvisited point. The arrows show all the points you have visited.
Second, expand the cluster: once you have found a cluster, mark all of its points green and check their neighborhoods for more points belonging to this cluster.
NOTE: a point previously marked as noise can turn green if it turns out to lie in a cluster. Here, the 2 red points are actually in a cluster.
Once you have gone through all the points, you stop.
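To make the note about noise points more concrete, here is a small sketch that classifies each point as core, border or noise. It assumes 1-D integer positions purely to keep the arithmetic easy to check by hand; the function name `classify` and the sample values are hypothetical. A border point has too few neighbors to start a cluster itself but lies within epsilon of a core point, which is why a point first marked as noise can end up in a cluster.

```python
def classify(points, eps, min_pts):
    """Label each 1-D point as 'core', 'border' or 'noise'."""
    def neighbors(i):
        # All indices within eps of points[i], including i itself.
        return [j for j in range(len(points)) if abs(points[i] - points[j]) <= eps]

    # Core points have at least min_pts points in their eps-neighborhood.
    core = {i for i in range(len(points)) if len(neighbors(i)) >= min_pts}

    kinds = []
    for i in range(len(points)):
        if i in core:
            kinds.append("core")
        elif any(j in core for j in neighbors(i)):
            kinds.append("border")   # not dense itself, but reachable from a core point
        else:
            kinds.append("noise")
    return kinds


# Hypothetical positions along a line.
points = [0, 1, 2, 3, 5, 20]
kinds = classify(points, eps=2, min_pts=4)
# kinds == ['border', 'core', 'core', 'core', 'border', 'noise']
```

If the scan happens to start at position 0, that point has only three neighbors and is marked as noise at first; it joins the cluster later, when the cluster grows from position 1.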
Answer 2:
There is also OPTICS, an extension of DBSCAN that does away with the sometimes hard-to-choose epsilon parameter (though choosing it may actually be rather easy for you, as you have geo data: just set it to 1 km or whatever you consider sensible).
It's a fairly nice and powerful extension of DBSCAN, but unfortunately also a bit harder to implement.
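An epsilon such as 1 km is only meaningful if the distance function returns kilometers. One common choice for latitude/longitude pairs is the haversine great-circle distance, sketched below. The names `haversine_km` and `within_eps` are hypothetical, the coordinates are made up, and the mean Earth radius of 6371 km is an approximation that should be good enough at stay-point scale.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def within_eps(a, b, eps_km=1.0):
    """True if the two GPS fixes are at most eps_km apart."""
    return haversine_km(a, b) <= eps_km


# Two made-up fixes roughly 100 m apart, and two roughly 111 km apart.
near = within_eps((48.8566, 2.3522), (48.8575, 2.3522))
far = within_eps((0.0, 0.0), (0.0, 1.0))
```

A function like this can serve as the distance function of a DBSCAN implementation, with epsilon then expressed directly in kilometers.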