Closest distance between two points(disjoint set)

2019-03-09 04:08发布

问题:

This problem is a kind of closest pair between two disjoint set. Upperside picture is expressed this problem. there is two kind of disjoint set, blue dots in -x plane, red dots in +x plane.

I want to calculate minimum distance(distance is |y2-y1| + |x2 - x1|) between one blue dot and one red dot, and I think use binary search for finding distance. How to use binary search this kind of problem? I struggle on only expressing binary search two disjoint sets. I have already know for one set, but I don't know in case two disjoint sets.

++ ) it is can in linear time using Delaunay triangulation? (ah, it is only my curiosity, I want to use binary search)

below code which I had already coding one set case(using problem-solving technique, divide and qonquer) and coverting to two disjoint set. I don't understand how to do in two sets. Example, Hint. okay.. please someone help me?

#include <iostream>
#include <algorithm>
#include <iomanip>
#include <cmath>

/**
test input
10
-16 -4 
-1 -3 
-9 -1 
-4 -10 
-11 -6 
-20 4 
-13 6 
-3 -10 
-19 -1 
-12 -4
10
8 2
10 3 
10 10 
20 -3 
20 3 
16 2
3 -5 
14 -10
8 -2 
14 0

10
-3 39
-2 -28
-1 20
-3 11
-3 45
-2 -44
-1 -47
-5 -35
-5 -19
-5 -45
10
27 5
28 0
28 5
21 5
2 3
13 -1
16 -2
20 -2
33 -3
27 1
 **/


using namespace std;

const int MAX = 10001;

struct point{
    int x,y;
};

bool xCompare(struct point, struct point);
bool yCompare(struct point, struct point);
int dis(struct point, struct point);

int absd(int);
int trace(int,int,int,int);

point p[MAX], q[MAX], tmp[MAX];

int main(){

    int left;
    int right;

    scanf("%d\n", &left);
    memset(p,0,sizeof(p));
    memset(q,0,sizeof(q));
    memset(tmp,0,sizeof(tmp));


    for(int i=0; i<left; i++){
        cin >> p[i].x >> p[i].y;
    }

    scanf("%d\n", &right);

    for(int j=0; j<right; j++){
        cin >> q[j].x >> q[j].y;
    }

    sort(p, p+left, xCompare);
    sort(q, q+right, xCompare);

    int min = trace(0,0, left-1, right-1);

    printf("%d\n", min);


    /** this is one set case.
    while(true){
        cin >> n;

        if(n == 0)  break;

        memset(p,0,sizeof(p));
        memset(tmp,0,sizeof(tmp));

        for(int i= 0;i<n;i++)
            cin >> p[i].x >> p[i].y;

        sort(p,p+n,xCompare);

        int min = trace(0,n-1);

        if(min < 10000 && n > 1){
            cout << fixed;
            cout << setprecision(4) << min << endl;
        }
        else
            cout << "INFINITY" << endl;
    }
     **/

    return 0;
}

int trace(int low1, int low2, int high1, int high2){

    if(high1 - low1 < 3){
        int value = dis(p[low1],q[low2+1]);
        int nextValue;

        if(high1 - low1 == 2){  
            nextValue = dis(p[low1],q[low2+2]);

            if(value > nextValue)
                value = nextValue;

            nextValue = dis(p[low1+1],q[low2+2]);

            if(value > nextValue)
                value = nextValue;
        }
        return value;
    }
    else{

        /* DIVIDE & QONQUER */

        int mid1 = (low1 + high1) >> 1;
        int mid2 = (low2 + high2) >> 1;
        int cnt = 0;

        int leftValue = trace(low1,low2,mid1,mid2);     // left trace
        int rightValue = trace(mid1+1,mid2+1,high1,high2);  // right trace

        // min value find
        int value = leftValue < rightValue ? leftValue : rightValue;

        /* Middle Condition Check : Y Line */

        // saving left
        for(int i = low1;i<=mid1;i++){
            if(abs(p[i].x - q[mid2].x) <= value)
                tmp[cnt++] = p[i];
        }

        // saving right
        for(int i = mid1+1;i<=high1;i++){
            if(absd(p[i].x - q[mid2+1].x) <= value)
                tmp[cnt++] = p[i];
        }

        sort(tmp,tmp+cnt,yCompare);

        for(int i = 0;i<cnt;i++){
            int count = 0;

            for(int j = i-3;count < 6 && j < cnt;j++){
                if(j >= 0 && i != j){
                    int distance = dis(tmp[i],tmp[j]);

                    if(value > distance)
                        value = distance;

                    count++;
                }
            }
        }
        return value;
    }
}

int absd(int x){
    if( x < 0)
        return -x;
    return x;
}

int dis(struct point a, struct point b){
    return (abs(a.x-b.x) + abs(a.y-b.y));
}

bool xCompare(struct point a, struct point b){
    return a.x < b.x;
}

bool yCompare(struct point a, struct point b){
    return a.y < b.y;
}

回答1:

This problem is usually called the closest bichromatic pair problem. Here are a couple approaches.

  1. Delaunay triangulation. (This certainly works with L2 (= Euclidean) distances; I think the steps generalize to L1.) For every Delaunay triangulation (there can be more than one in degenerate cases), there exists a minimum spanning tree whose edges all belong to the triangulation. In turn, this minimum spanning tree contains a shortest edge crossing the cut between the color classes.

  2. Nearest neighbor data structures.

  3. If it is given that the red points are separated in x from the blue points, then you may be able to adapt the O(n) merge step of the Shamos–Hoey divide-and-conquer algorithm for the closest (monochromatic) pair problem, described here.



回答2:

If you want to do binary search on spatial data, you could use a spatial data structure, such as a quadtree or a k-d tree.



回答3:

here is one more solution. What it basically does is loads both of the set of points into two kd tree instances (which have built mechanism for slicing the tree on x and y axis) and then navigates through both of the trees by checking each node if the distances between the two less than the distance between the prior nodes, if yes then continue until no two nodes can be found with the mutual distance less than any other.

the below code prints the distances found while navigating between the nodes and prints them. The both set of points are also be visualized to see the correctness of the algorithem.

This code can correctly find the nearest nodes regardless whether one set is nested, on right, left, up or downward of the other,

 #include <iostream>

using namespace std;

int const k=2; // the number of dimensions
double min_distance = 10000; // set a large default value, in this example all distance will be shorter than this. 

double distance(int arr[], int arr2[])
{
 return sqrt(pow(arr2[0] - arr[0], 2) + pow(arr2[1] - arr[1], 2));

}

struct Node {
 int point[k];
 Node *left, *right;
 Node()
 {
  left = right = NULL;  

 }
};

// A method to create a node of K D tree
struct Node* newNode(int arr[])
{
 struct Node* temp = new Node;

 for (int i = 0; i<k; i++) temp->point[i] = arr[i];

 return temp;
}

Node * insertNode(Node * node, int arr[], int d)
{
 if (node == NULL)
  return newNode(arr);

 int dim = d%k;


 if (node->point[dim] > arr[dim])
 {


  node->left = insertNode(node->left, arr, dim + 1);
 }
 else
 {

  node->right = insertNode(node->right, arr, dim + 1);
 }

 return node;
}
Node * Nearest=NULL;

Node * FindnearestNode(Node * head1, int arr[], int d)
{
 // if empty tree, return
 if (head1 == NULL)
  return NULL;

 // check for each tree. 
   if (min_distance > distance(head1->point, arr))
 {
  min_distance = distance(head1->point, arr);
  Nearest = head1;
 }

 if (head1->left == NULL && head1->right == NULL)
  return head1;

 // findout current dimension, in this case it either x or y i.e. 0 or 1
 int dim = d%k;

 // navigate through the tree as if inserting to a new member (to remain to the nearest member in closeness). in the path for insert it will find the nearest member. 
 if (head1->right && head1->point[dim] < arr[dim]) return FindnearestNode(head1->right, arr, d+1);
 else if(head1->left && head1->point[dim] > arr[dim] )
  return FindnearestNode(head1->left, arr, d+1);

 return Nearest;
}


int main()
{
 int const an = 10;
 int const bn = 10;

 int ax[an] = { 34,55,11,79,77,65,3,9,5,66 };
 int ay[an] = { 5, 6, 7, 9, 32,3,15,7,10,35 };

 int bx[bn] = { 5,35,4,41,32,64,41,54,87,3 };
 int by[bn] = { 23,33,17,15,32,22,33,23,21,32 };



 Node * head1=NULL;
 Node * head2 = NULL;



 double Final_Min_Distance = min_distance;

 // fill the K-D trees with the two dimensional data in two trees.
 for (int i = 0; i < an; i++)
 {
  int temp[k];
  temp[0] = ax[i];
  temp[1] = ay[i];

  head1=insertNode(head1, temp, 0);
  temp[0] = bx[i];
  temp[1] = by[i];

  head2=insertNode(head2, temp, 0);


 }
 Node * AnearB=NULL;

 Node * BnearA = NULL;



 min_distance = 1000;
   Final_Min_Distance = min_distance;
 for (int i = 0; i < an; i++) { int temp[k]; temp[0] = bx[i]; temp[1] = by[i]; Node * Nearer2 = FindnearestNode(head1, temp, 0); if (Final_Min_Distance > min_distance)
  {
   BnearA = Nearer2;
   Final_Min_Distance = min_distance;
  }
  cout << " distance of B (" << temp[0] << "," << temp[1] << ") to nearest A (" << BnearA->point[0] << "," << BnearA->point[1] << ") distance:" << Final_Min_Distance << endl;
  min_distance = 1000;


 }
 cout << "Minimum Distance is " << Final_Min_Distance<<endl<<endl;


 min_distance = 1000;
 Final_Min_Distance = min_distance;
 for (int i = 0; i < an; i++) { int temp[k]; temp[0] = ax[i]; temp[1] = ay[i]; Node * Nearer2 = FindnearestNode(head2, temp, 0); if (Final_Min_Distance > min_distance)
  {
   AnearB = Nearer2;
   Final_Min_Distance = min_distance;
  }
  cout << " distance of A (" << temp[0] << "," << temp[1] << ") to nearest B (" << AnearB->point[0] << "," << AnearB->point[1] << ") distance:" << Final_Min_Distance << endl;
  min_distance = 1000;


 }
 cout << "Minimum Distance is " << Final_Min_Distance;



 system("pause");

}

https://tahirnaeem.net/finding-closest-pair-of-points-in-two-disjoint-sets-of-points



回答4:

I worked on a similar problem where I had to find a nearest member to identify if a member belong to a cluster within a cluster. I was trying to identify clusters within clusters. Here is the code, This might help you get start.

/**
 * Find the nearest neighbor based on the distance threshold.
 * TODO:
 * @param currentPoint current point in the memory.
 * @param threshold dynamic distance threshold.
 * @return return the neighbor.
 */

private double nearestNeighbor(double currentPoint) {

    HashMap<Double, Double> unsorted = new HashMap<Double, Double>();
    TreeMap<Double, Double> sorted = null; 
    double foundNeighbor = 0.0;

    for (int i = 0; i < bigCluster.length; i++) {
        if (bigCluster[i] != 0.0 && bigCluster[i] != currentPoint) {
            double shortestDistance = Math.abs(currentPoint - bigCluster[i]);
            if (shortestDistance <= this.getDistanceThreshold())
                unsorted.put(shortestDistance, bigCluster[i]);
        }
    }
    if (!unsorted.isEmpty()) {
        sorted = new TreeMap<Double, Double>(unsorted);
        this.setDistanceThreshold(avgDistanceInCluster());
        foundNeighbor = sorted.firstEntry().getValue();
        return foundNeighbor;
    } else {
        return 0.0;
    }
} 


/**
 * Method will check if a point belongs to a cluster based on the dynamic 
 * threshold.
 */
public void isBelongToCluster() {


        for (int i=0; i < tempList.size(); i++) {

            double aPointInCluster = tempList.get(i);

            cluster.add(aPointInCluster);
            double newNeighbor = nearestNeighbor(aPointInCluster);
            if ( newNeighbor != 0.0) {
                cluster.add(newNeighbor);
                if (i + 1 > tempList.size() && (visited[i] != true)) {
                    isBelongToCluster();
                }
            }

        }

    for (int i=0; i < cluster.size(); i++) {
        if (cluster.get(i) != 0.0)
            System.out.println("whats in the cluster -> " + cluster.get(i)); 
    } 
}