In my program, I have a set of points. For purposes of rescaling, I am searching for the two nodes that are furthest apart, and then computing a factor by which to multiply all coordinates so that the maximum distance equals a predefined value.
The algorithm I am using to find the two points furthest apart is, however, problematic for large sets of points, as it is O(n^2); pseudocode (pairs that were already compared are skipped):
max = 0
for each point in points:
    for each later point in points:
        if distance between point and later point > max:
            max = distance between point and later point
Is there something faster?
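For reference, here is the whole thing as a runnable Python sketch, including the rescaling step (the names `rescale_to_max_distance` and `target_max` are just mine):

```python
import math

def rescale_to_max_distance(points, target_max):
    """Scale all coordinates so the largest pairwise distance equals target_max."""
    # O(n^2) farthest-pair search; each unordered pair is visited exactly once.
    max_dist = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d > max_dist:
                max_dist = d
    factor = target_max / max_dist
    return [tuple(c * factor for c in p) for p in points]

pts = rescale_to_max_distance([(0, 0), (3, 4), (1, 1)], 10.0)
# the farthest pair (0,0)-(3,4) has distance 5, so every coordinate is doubled
```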
Not really - a common approach is to group points into clusters and then store the distances between clusters.
That way you don't need to check whether a certain house in New York is furthest from Paris once you have already determined that Australia is further away.
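A rough sketch of that pruning idea in Python, assuming a simple uniform grid as the clustering (all names here are mine). Note this is still O(n^2) in the worst case; it only skips cluster pairs that provably cannot contain the farthest pair:

```python
import math
from collections import defaultdict
from itertools import combinations

def diameter_with_grid_pruning(points, cell_size):
    """Return the largest pairwise distance, skipping hopeless cell pairs."""
    # Bucket points into grid cells (the "clusters").
    cells = defaultdict(list)
    for p in points:
        cells[(int(p[0] // cell_size), int(p[1] // cell_size))].append(p)

    best = 0.0
    keys = list(cells)
    for a, b in combinations(keys, 2):
        center_gap = math.dist(
            ((a[0] + 0.5) * cell_size, (a[1] + 0.5) * cell_size),
            ((b[0] + 0.5) * cell_size, (b[1] + 0.5) * cell_size),
        )
        # Any point is within (sqrt(2)/2)*cell_size of its cell center, so
        # center_gap + sqrt(2)*cell_size bounds every cross-cell distance.
        if center_gap + math.sqrt(2) * cell_size <= best:
            continue  # no pair spanning these two cells can be the farthest
        for p in cells[a]:
            for q in cells[b]:
                best = max(best, math.dist(p, q))
    # Pairs inside a single cell can still matter when there are few cells.
    for key in keys:
        for p, q in combinations(cells[key], 2):
            best = max(best, math.dist(p, q))
    return best
```

In practice one would seed `best` with a cheap lower bound (e.g. the distance between the two most distant cell centers) so that pruning kicks in from the start.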
The following may help to put the average case linear-time algorithms (for the diameter of a finite set) in a clearer light, as well as contrast multi-dimensional and plane geometry problems.
In 1983 Megiddo gave a deterministic linear-time algorithm for the smallest enclosing circle (or sphere in higher dimensions).
In general position the enclosing circle will have two or three points on its boundary, so once the bounding circle is known the two farthest apart can be found "on average" in constant time. In higher dimensions the number of points in general position needed on the boundary of the sphere increases (D+1 points for dimension D), and indeed the cost of computing the distance between a single pair of points rises linearly with dimension.
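Under that general-position assumption the final step really is trivial; a sketch, where `boundary` is assumed to be the two or three boundary points reported by a smallest-enclosing-circle routine such as Megiddo's (the function name is mine):

```python
import math
from itertools import combinations

def farthest_pair_among(boundary):
    # At most 3 points in the planar case, so this is O(1) work.
    return max(combinations(boundary, 2), key=lambda pq: math.dist(*pq))

p, q = farthest_pair_among([(0.0, 1.0), (1.0, 0.0), (-1.0, 0.0)])
```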
The subset of points lying on the bounding circle or sphere is also found in linear time. In the theoretical worst case all points would lie on the bounding circle or sphere, but that is at least a stricter condition than merely having all points on the convex hull. If the points on the sphere are independently perturbed, say along radial lines, then general position is assured with probability 1, and an approximate diameter can be found from just D+1 points on the revised enclosing sphere. This randomized approximation has quadratic dependence on dimension but only linear complexity in the number of points.
If the points lying on a bounding circle are "sorted" (cyclically, of course), finding the pair farthest apart can be done in linear time, relying upon the "unimodality" of the circle (meaning that distances from a fixed point rise monotonically until the antipode and then fall) to amortize the cost of computing distances. Unfortunately that sorting would introduce a step with O(n log n) time complexity, and this proves to be worst-case optimal for exact deterministic methods in the planar case.
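The unimodal trick can be sketched with a second pointer that only ever advances, so the total work across all outer iterations is O(n). This assumes `pts` already lie on a common circle and are sorted by angle (the function name is mine):

```python
import math

def farthest_pair_on_circle(pts):
    """Farthest pair among points on a common circle, sorted cyclically by angle.
    The opposite pointer j never moves backwards, amortizing to O(n) total."""
    n = len(pts)
    j = 1
    best = (0.0, pts[0], pts[1])
    for i in range(n):
        # Advance j while the distance from pts[i] keeps growing; by
        # unimodality it peaks near the antipode of pts[i] and then falls.
        while math.dist(pts[i], pts[(j + 1) % n]) > math.dist(pts[i], pts[j % n]):
            j += 1
        d = math.dist(pts[i], pts[j % n])
        if d > best[0]:
            best = (d, pts[i], pts[j % n])
    return best[1], best[2]
```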
In 2001 Ramos succeeded in giving an O(n log n) deterministic algorithm for three-dimensional sets, but the technique is so involved that an implementation may be impractical, or slower than brute-force all-pairs search except on very large datasets.
For higher dimensions many authors have considered randomized or approximate algorithms. See Piotr Indyk's thesis (2000) for approximate methods with only polynomial dependence on dimension for various proximity problems.