Algorithm to cover maximal number of points with o

2020-02-16 11:55发布

Let's imagine we have a plane with some points on it. We also have a circle of given radius.

I need an algorithm that determines such position of the circle that it covers maximal possible number of points. Of course, there are many such positions, so the algorithm should return one of them.
Precision is not important and the algorithm may do small mistakes.

Here is an example picture:
Example

Input:
  int n (n<=50) – number of points;
  float x[n] and float y[n] – arrays with points' X and Y coordinates;
  float r – radius of the circle.

Output:
  float cx and float cy – coordinates of the circle's center

Lightning speed of the algorithm is not required, but it shouldn't be too slow (because I know a few slow solutions for this situation).

C++ code is preferred, but not obligatory.

10条回答
聊天终结者
2楼-- · 2020-02-16 12:28

Might I suggest a density map? Find the min and max bounds for the x and y. Divide the range of the x and y bounds into bins having widths equal to the diameter of your circle. Count the number of points in each bin for the x and y separately. Now find the intersection on your density map between the highest ranked x bin with the highest ranked y bin.

This is a VERY fast algorithm to quickly generalize large data sets but it is not always accurate, to improve accuracy you can slice the bins into smaller and smaller pieces or shift the bin positions to the left or right n times and use a voting system to select the answer that occurs most often between trials.

查看更多
Summer. ? 凉城
3楼-- · 2020-02-16 12:30

The problem reverts back to finding the global optimum of a function f :R x R -> N. The input for f is the center point of the circle, and the value, of course, is the number of included points from the set. The graph of the function will be discontinuous, staircase-like.

You could start by testing the function in each point corresponding to a point in the set, order the points by decreasing values of f, then intensify search around those points (for example moving out along a spiral).

Another option is to consider all intersection points of segments connecting any pairs of points in the set. Your optimal point will be at one of these intersections I think, but their number is probably too big to consider.

You may also hybridise options 1 and 2, and consider intersections of the segments generated from points around 'good set points'.

Anyhow, unless the number of set points is low (allowing you to check all intersections), I don't think you can guarantee to find the optimum (just a good solution).

查看更多
劳资没心,怎么记你
4楼-- · 2020-02-16 12:32

At a first glance, I would say a quad tree solution.

Also, there is an information visualization/Data mining method called K-means which makes clusters of given data. It can be used with added functionality in the end to fit your purpose.

The basic algorithm for K-Means is:

  1. Place K points into the space represented by the items that are being clustered - These points represent initial group centroids
  2. Assign each data item to the group that has the closest centroid
  3. When all items have been assigned, recalculate the positions of the K centroids by calculating the mean position of your dots
  4. Repeat Steps 2 and 3 until the centroids no longer move, or move very little

The added functionality would then be:

  1. Calculate number of points within your circle positioned at the K centroids
  2. Choose the one that suits you the best ;)

Sources:
K-means algorithm - Linköping University
Quadtree - en.wikipedia.org/wiki/Quadtree

A quick search on wikipedia and you find source code: en.wikipedia.org/wiki/K-means_clustering

查看更多
够拽才男人
5楼-- · 2020-02-16 12:33

If it is true that precision is not important and algorithm may do small mistakes then I think the following.

Let f(x,y) is a function which has a maximum at the point (0,0) and is only significant at the points inside of circle of radius R. For example, f(x,y) = e^{(x^2 + y^2)/ (2 * R^2)}.

Let (x_i,y_i) are points and E_i(x,y) = f(x - x_i, y - y_i).

Your problem is to find the maximum of \sum_i E_i(x,y) alt text.

You can use a gradient descent starting it from each point.

查看更多
来,给爷笑一个
6楼-- · 2020-02-16 12:39

Edited to better wording, as suggested :

Basic observations :

  • I assume the radius is one, since it doesn't change anything.
  • given any two points, there exists at most two unit circles on which they lie.
  • given a solution circle to your problem, you can move it until it contains two points of your set while keeping the same number of points of your set inside it.

The algorithm is then:

  • For each pair of points, if their distance is < 2, compute the two unit circles C1 and C2 that pass through them.
  • Compute the number of points of your set inside C1 and C2
  • Take the max.
查看更多
成全新的幸福
7楼-- · 2020-02-16 12:41

Here are two ideas: an O(n) approximation algorithm and an O(n^2 log n) exact (non-approximate) algorithm:

Fast approximation

Use locality-sensitive hashing. Basically, hash each point to a hash bucket that contains all nearby points. The buckets are set up so that collisions only happen between nearby points - unlike similarly-named hash tables, collisions are useful here. Keep a running count of the number of collisions in a bucket, and then use the center of that bucket as the center of your circle.

I admit that this is a fast explanation of a concept that is not super-obvious the first time you hear of it. An analogy would be to ask a bunch of people what their zip code is, and use the most common zip code to determine the most populous circle. It's not perfect, but a good fast heuristic.

It's basically linear time in terms of the number of points, and you can update your data set on the fly to incrementally get new answers in constant time per point (constant with respect to n = number of points).

More about locality-sensitive hashes in general or about my personal favorite version that would work in this case.

Better-than-brute-force deterministic approach

The idea is: for each point, place the edge of a circle on that point and sweep the circle around to see which direction contains the most points. Do this for all points and we find the global max.

The sweep around a point p can be done in n log n time by: (a) finding the angle interval per other point q such that when we place the circle at angle theta, then it contains q; and (b) sorting the intervals so that we can march/sweep around p in linear time.

So it takes O(n log n) time to find the best circle touching a fixed point p, and then multiply that by O(n) to perform the check for all points. The total time is O(n^2 log n).

查看更多
登录 后发表回答