DBSCAN code in C# or VB.NET for cluster analysis

Posted 2020-05-28 18:34

Question:

Could you recommend a library or some code in VB.NET or C# that applies DBSCAN to build density-based clusters of data? I have GPS data and I want to find stay points using the DBSCAN algorithm, but I do not understand much of the technical side of the algorithm.

Answer 1:

I'm not sure what you're looking for, because the algorithm is explained very well on Wikipedia. Do you want an explanation of the algorithm, or a translation of it (or a good library) in C#?

You can also have a look at general clustering algorithms.

Algorithm

Let's say you have chosen epsilon, and the number of elements needed to start a cluster (MinPts) is 4.

You need to define a distance function, a DBSCAN function and an expand cluster function:

From Wikipedia:

DBSCAN(D, eps, MinPts)
   C = 0
   for each unvisited point P in dataset D
      mark P as visited
      N = getNeighbors (P, eps)
      if sizeof(N) < MinPts
         mark P as NOISE
      else
         C = next cluster
         expandCluster(P, N, C, eps, MinPts)

expandCluster(P, N, C, eps, MinPts)
   add P to cluster C
   for each point P' in N 
      if P' is not visited
         mark P' as visited
         N' = getNeighbors(P', eps)
         if sizeof(N') >= MinPts
            N = N joined with N'
      if P' is not yet member of any cluster
         add P' to cluster C
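
For what it's worth, here is a minimal sketch of the pseudocode above in Python (chosen only for brevity; the lists and loops should port almost line for line to C# or VB.NET). The names `dbscan`, `expand_cluster`, `get_neighbors` and the label constants are my own and do not come from any library. The Euclidean distance is an assumption that you would swap out for real GPS coordinates, and the neighbor search is a brute-force scan over all points, so this is only reasonable for small data sets.

```python
import math

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not processed yet


def euclidean(a, b):
    """Straight-line distance between two 2-D points (assumed metric)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def get_neighbors(points, i, eps, dist):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def expand_cluster(points, labels, seeds, cluster, eps, min_pts, dist):
    """Walk the seed list, which may grow while we iterate over it."""
    k = 0
    while k < len(seeds):
        j = seeds[k]
        k += 1
        if labels[j] == NOISE:
            labels[j] = cluster      # former noise becomes a border point
        if labels[j] is not UNVISITED:
            continue                 # already handled
        labels[j] = cluster
        more = get_neighbors(points, j, eps, dist)
        if len(more) >= min_pts:     # j is a core point: N = N joined with N'
            seeds.extend(more)


def dbscan(points, eps, min_pts, dist=euclidean):
    """Return one label per point: 0, 1, 2, ... for clusters, -1 for noise."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = get_neighbors(points, i, eps, dist)
        if len(neighbors) < min_pts:
            labels[i] = NOISE        # may still be absorbed by a cluster later
            continue
        cluster += 1                 # C = next cluster
        labels[i] = cluster
        expand_cluster(points, labels, neighbors, cluster, eps, min_pts, dist)
    return labels


# Two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=4))
# prints [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

In this sketch "visited" and "labelled" are the same thing: a visited point is either noise or already in a cluster, so a separate visited flag is not needed.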

You have a list of points:

First, select a point at random:

Check whether the circle of radius epsilon around that point contains at least 4 points. If it does, start a cluster (green); otherwise mark the point as noise (red). This is the DBSCAN function, applied to each unvisited point. The arrows show all the points you have visited.
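
To make that test concrete, here is a small Python sketch of just this one step. The helper name `count_in_radius` and the sample coordinates are made up for illustration. It assumes plain Euclidean distance and counts the point itself as part of its own neighborhood, which is how the pseudocode's getNeighbors is usually read.

```python
import math


def count_in_radius(points, center, eps):
    """Number of points whose distance to center is at most eps."""
    return sum(
        1 for p in points
        if math.hypot(p[0] - center[0], p[1] - center[1]) <= eps
    )


# Four points close together and one far away (made-up data).
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.4, 0.4), (5.0, 5.0)]
eps, min_pts = 1.0, 4

for p in points:
    n = count_in_radius(points, p, eps)
    status = "can start a cluster" if n >= min_pts else "noise for now"
    print(p, n, status)
```

With these values the four nearby points each have 4 points in their circle, while (5.0, 5.0) has only itself and is marked as noise.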

Second, expand the cluster: once you have found a cluster, mark all of its points green and check for more points that belong to it.

NOTE: a point that was previously marked as noise can be changed to green if it turns out to be in a cluster.

The 2 red points are actually in a cluster ...

Once you have gone through all the points, you stop.



Answer 2:

There is also OPTICS, an extension of DBSCAN that does away with the sometimes hard-to-choose epsilon parameter (although choosing it may actually be rather easy for you, since you have geo data: just set it to 1 km or whatever you consider sensible).

It's a fairly nice and powerful extension of DBSCAN, but unfortunately also a bit harder to implement.
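
If you do go with a fixed epsilon in kilometres, the distance function has to work in kilometres too. Below is a rough Python sketch using the haversine formula; the function names `haversine_km` and `within_eps` are hypothetical, and the 6371 km Earth radius is the usual spherical approximation, so treat the results as approximate. The same few lines of math translate directly to C# with `Math.Sin`, `Math.Cos` and `Math.Asin`.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def within_eps(a, b, eps_km=1.0):
    """True if two GPS fixes are at most eps_km apart (1 km by default)."""
    return haversine_km(a, b) <= eps_km


# Two fixes about 0.56 km apart, and one roughly 111 km away.
print(within_eps((48.8566, 2.3522), (48.8616, 2.3522)))  # prints True
print(within_eps((48.8566, 2.3522), (49.8566, 2.3522)))  # prints False
```

For stay-point detection a much smaller epsilon (tens or hundreds of metres) may be more sensible than 1 km, but that depends on your data.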