Rectangles Covering

2楼-- · 2019-01-22 13:22

Okay, now it seems I can't even sleep at night because I think about this problem... but it also seems I finally got an O(n log n) solution, in average case (but with reduced chances of having a pathological input compared to @lVlad).

We all know the Divide and Conquer technic. To ensure O(n log n) using it, we usually focus on 2 points:

the divide and merge should be O(n)
there exist C > 1 such that at each step the size of the subproblems is reduced by a factor C (constant throughout the problem)

With these constraints in mind I have elaborated the following algorithm, which is reminiscent of qsort, and thus suffer the same pitfalls (namely, fractal inputs).

Algorithm

Clipping: we only consider the portion of a red that intersect with blue, they are inserted in a HashSet to remove duplicates --> O(n)
Pivot Selection: more on this later --> O(n)
Partition: once we have a pivot, we subdivise the space in 3^d zones, one of which being the pivot, with d being the dimension (d = 1 for segments, 2 for rectangles, 3 for cubes etc...)
Repartition: we put the red in the partitions, applying the Clipping technic, note that a given red might end up giving several chunks in different partitions
Recursion: we apply recursively (from step 2) on each partition, beginning by the least populated one and stopping as soon as one is not covered

The Pivot Choice is the corner stone of the algorithm, if the partition is ill-tailored we cannot achieve the required complexity. Also, it must be accomplished in O(n). I have 2 proposals so far:

Maximum Area: use the rectangle with the greatest area, because it means that the partitions will have the smallest area afterward, however it suffers from being easy to trump
Median of 3: based on qsort, pick up 3 elements, selection the median (the one with the center closer to the barycenter of the 3 centers)

I propose to mix them up thusly:

Pick up the 3 elements with the greatest area, select the median, use it at pivot
If after the repartition it turns out that one of the partition is populated with more than N/C (to be customized) elements, pick up 3 elements at random, select the median, and do the repartition (no check here)

Another aspect of implementation is the Tail of the recursion. Like qsort it's probable that the algorithm is inefficient for small n. Instead of going all the way to 1, I propose to use the introsort trick: if n is smaller than say 12, then use the following algorithm instead:

For each axis, project each red on the axis (only the edges) and sort them (ala introsort)
This defines a raster of n^d zones, check that each is covered

Dimension agnostic

The algorithm is formally defined to be applicable in any given dimension with the same asymptotic complexity, in average O(n log n). This gives us the opportunity to test in dimension 1 to identify the pathological inputs.

Pathological input

Like qsort on which it is modelled it is sensible to factorial inputs. By factorial inputs I mean:

1.......6...9.11.13

whenever you pick the average of your interval, you have all the elements on one side of it.

With such an input even choosing the median of 3 is unlikely to yield a very good cut.

EDIT:

I am going to show the partition idea with a little scheme, as @lVlad noted it was kind of fuzzy.

+----------------+----+---------+
|       1        | 2  |   3     |
+----------------+----+---------+
|       8        | P  |   4     |
+----------------+----+---------+
|       7        | 6  |   5     |
+----------------+----+---------+

Okay, so the rectangle we check for coverage is now splitted into 9 subrectangles. We ignore P, it's our pivot.

Now, we would really like that each subrectangle is covered by less red than N. The pivot is chosen as a barycenter of the centers, thus it means if our "random" choice held true that there are about as many reds centers in each direction of the pivot.

It's kind of fuzzy there because some special configurations might make it so that there is little gain at one step (all rectangles have the same center and we just picked the smaller one for example), but it will create chaos and thus the following step will be different.

I am happy if anyone can formalize that, I am an engineer, not a computer scientist, and my maths lag behind...

0人赞添加讨论(0) 举报

Luminary・发光体

3楼-- · 2019-01-22 13:22

Do you know the usual worst-case O(nlogn) algorithm for the area of the union of rectangles?

All you need to do here is to compute the two areas:

The area of your N rectangles
The area of your N rectangles and the model

If these areas are equal, the model is totally covered, otherwise it isn't.

0人赞添加讨论(0) 举报

Fickle 薄情

4楼-- · 2019-01-22 13:26

I've been thinking about it and I think I finally understood what @algorithmist meant by sweep line. However the very presence of sorting operations means that I have:

O(n log n) in average
O(n**2) in the worst case

Sweep Line

First of all, we need some sorting, because our sweep line will consist of walking an ordered set.

This ordered set will feature the top and bottom line of each of the reds, as long as they are between the top and bottom of blue. This divides our space into (at most) n*2 horizontal strips.

Now, we need to make sure that in each strip, the whole of blue is covered, and this operation cannot have more than O(log n) complexity, this could be done if we had (for each strip) a list of the covered intervals. This is the 'event' @algorithmist is speaking of

To handle this event, we'll use a binary tree described below which handles adding or removing a rectangle in exactly O(log n) operations and yields the rightmost interval covered by the tree, which we use to tell if the strip of blue is covered or not.

Binary Tree

First of all, I am going to index the red rectangles. We sort them using this function:

def __lt__(lhs, rhs):
  return (lhs.left <  rhs.left)
      or (lhs.left == rhs.left and lhs.right < rhs.right)

I am going then to create a dedicated binary tree.

It will have N leaves, each representing a red rectangle and indicating its presence or absence. They are ordered according to the index.
Each intermediary node will have for value the rightmost interval covered by its children

Handling the bug "code block cannot follow list":

class Node:
  def __init__(self):
    self.interval = []
    self.left  = None
    self.right = None

Now we have two possibilities, let's say the children cover [a,b] and [c,d]:

if c <= b, then the node hold [a,d]
else it holds [c,d]

Why does it works ? Let's take an example using 4 leaves:

        _ [1,9] _
       /         \
   [1,7]         [6,9] <-- Special node merge
   /   \         /   \
  /     \       /     \
[1,3] [2,7]   [3,5] [6,9]

The special node ignore [3,5] because it's not the rightmost interval. The reasoning is that the rectangles are ordered:

No rectangle on the right of [6,9] will cover the missing [5,6] interval since they begin after 6
The rectangles on the left of [3,5] begin before 3, so if they cover the missing [5,6] they'll cover [3,5] anyway

The root may not indicate the exact set of intervals covered: only the rightmost interval covered. However, it's perfectly sufficient for us to tell if blue is completely covered or not!

There are 2 operations available on this tree:

Marking a rectangle as present
Marking a rectangle as absent

Each is similar:

if the rectangle was already in this state, do nothing
else, toggle its state, then update its parent interval (recursively, up to the root)

The recursive bit takes O(log n). It's a classic property of the balanced binary trees. And once it's done we have the rightmost interval covered by the root which is sufficient to tell whether or not the blue segment is entirely covered or not.

Complexity

The complexity of the algorithm is simple:

We have O(n) events
Each event is handled in O(log n)

Which yields O(n log n) for the core part.

However, we should not forget that we also have 2 sort operations:

one to classify the events (for the sweep line)
the other to classify the rectangles (for the binary tree)

Each shall take O(n log n) in average, but may degenerate into O(n**2) in the worst case, depending on the sorting algorithm used.

0人赞添加讨论(0) 举报

霸刀☆藐视天下

5楼-- · 2019-01-22 13:26

Here's a way to do this without using rasterization, that is, using only pure numbers.

Note: This is not O(n log n), more like O(n^2). It is, however, a solution. Whether it is an answer, probably not if O(n log n) is a requirement.

Create a list of all the unique X-coordinates of the left and right edges (in the same list)
When you build this list, for each coordinate, also associate a list of rectangles with it, and denote in this list whether the X-coordinate the list is associated with is the left or the right edge of the rectangle
Sort the list of coordinates so that it is ascending, going from left to right
Create a new list of rectangles, initially empty
Run through the list of coordinates, and process the associated list of rectangles for it
All rectangles in the associated list that are denoted as having the coordinate as the left edge should be added to the list you created in 4.
All rectangles with the coordinate as the right edge should be removed.
The order of adding and removing should actually be done in the opposite order, first you want to remove the right-edge-rectangles, and then you add all the left-edge-rectangles
Now, before you remove the rectangles from the list you created in 4, you want to process them. You process them by treating them as "sub-rectangles". Their "new" left edge coordinate is the X-coordinate you processed the previous iteration of the loop (in 5), and the new right edge is the current X-coordinate being processed
The output of the algorithm is a collection of coordinate-pairs (the two X-coordinates left and right), each pair having an associated list of rectangles (the vertical strip)

The output should thus be:

X=1..2, Rects=A
X=2..3, Rects=A, B
X=3..4, Rects=A, B, C
X=4..5, Rects=A, C
X=5..6, Rects=C

Let me illustrate the process so far

+-------------------+
|A                  |
|        +----------+-----+
|        |C         |     |
|  +-----+----+     |     |
|  |B    |    |     |     |
|  |     +----+-----+-----+
|  |          |     |
+--+----------+-----+
   |          |
   +----------+

^  ^     ^    ^     ^     ^
1  2     3    4     5     6  <-- X-coordinates

Associated rectangles:

left: A, right: (none)
left: B, right: (none)
left: C, right: (none)
left: (none), right: B
left: (none), right: A
left: (none), right: C

You now create an empty list, L=[], and start processing the coordinates and associated rectangles:

X=1

List is empty, nothing to process Nothing to remove Add A: L=[ A ]

X=2

List contains rectangles, process list as rectangles that have a left edge of X=1, and a right edge of X=2 (the two coordinates we've processed so far), and use their original top and bottom coordinates. Nothing to remove. Add B: L=[ A, B ]

X=3

List contains rectangles, process list (both A and B) the same way, treat them as temporarily having left and right coordinates as X=2 and X=3, and use their original top and bottom coordinates. Nothing to remove Add C: L=[ A, B, C ]

X=4

Process the three rectangles the same way as above, temporary left and right coordinates are X=3 and X=4 Remove B: L=[A, C ] Nothing to add

X=5 and X=6

Process these in the exact same manner.

This means you will end up with "strips" of rectangles, like this (I've pulled them a bit apart to clearer illustrate that they are strips, but they are located side-by-side continously like in the original diagram):

+--+ +-----+ +----+ ------+
|A | |     | |    | |     |
|  | |     | +----+ +-----+ +-----+
|  | |     | |C   | |     | |     |
|  | +-----| +----+ |     | |     |
|  | |B    | |    | |     | |     |
|  | |     | +----+ +-----| +-----+
|  | |     | |    | |     |
+--+ +-----+ +----+ +-----+
     |     | |    |
     +-----+ +----+
^  ^ ^     ^ ^    ^ ^     ^ ^     ^
1  2 2     3 3    4 4     5 5     6

Ok, so now you have your output, a collection of coordinate-pairs, each pair having an associated list of rectangles.

Now we do a trick. We process the vertical strip in the exact same manner, only this time we use the Y coordinates as the delimiters. Let's handle the strip between 3 and 4 above. Remember that the strip has a left and right coordinates of 3 and 4.

Strip looks like this:

^     +----+ <-- 1
A     |    |
|   ^ +----+ <-- 2
|   C |    |
| ^ | +----+ <-- 3
| B | |    |
| | V +----+ <-- 4
| |   |    |
V |   +----+ <-- 5
  |   |    |
  V   +----+ <-- 6

List of coordinates with rectangles:

Top=A, Bottom=(none)
Top=C, Bottom=(none)
Top=B, Bottom=(none)
Top=(none), Bottom=C
Top=(none), Bottom=A
Top=(none), Bottom=B

New empty list L=[]

Process the coordinates:

Y=1

Nothing to process (L=[]) Add A to list, L=[ A ]

Y=2

Process A with temporarily having a top and bottom coordinates of Y=1 and 2 (and remember that it also has a temporary left and right coordinates of X=3 and 4 Add C, L=[ A, C ]

Y=3

Process A and C, both now having temporary coordinates of (3, 2)-(4, 3) Add B, L=[ A, B, C ]

Y=4

Process A, B and C, all having coordinates of (3, 3)-(4, 4) Remove C, L=[ A, B ]

Y=5

Process A and B, coordinates (3, 4)-(4, 5) Remove A, L=[ B ]

Y=6

Process B, coordinates (3, 5)-(4, 6)

Final output:

pairs of Y-coordinates, with rectangles associated with them (that also have temporarily got new X-coordinates):

(3, 1)-(4, 2) - A
(3, 2)-(4, 3) - A, C
(3, 3)-(4, 4) - A, B, C
(3, 4)-(4, 5) - A, B
(3, 5)-(4, 6) - B

Now, suppose you want to ask the question: Is B fully covered by all any combination of the other rectangles.

The answer can be worked out as follows:

Loop through all the rectangles in the final list above
If B is NOT part of the associated rectangle, ignore this rectangle
If there is any other of the original rectangles associated with the coordinates, then this part of B is covered by that/those rectangle(s)
If there is no other of the original rectangles associated with the coordinates, then this part of B is not covered.

In the above example, we see that the 3rd and 4rd rectangle in the final list contains B, but also contains other rectangles, hence those parts of B is covered, but the final rectangle in the list also contains B, but no other rectangle, hence this part is not covered.

According to the original diagram, the final result would include rectangles as follows (the letters inside each denote which original rectangle is associated with this new rectangle):

+--+-----+----+-----+
|A |A    |A   |A    |
|  |     +----+-----+-----+
|  |     |AC  |AC   |C    |
|  +-----+----+     |     |
|  |AB   |ABC |     |     |
|  |     +----+-----+-----+
|  |     |AB  |A    |
+--+-----+----+-----+
   |B    |B   |
   +-----+----+

If we now take a new look at the original diagram, I have shaded out the parts that the above algorithm would find contains B, but no other rectangle:

+-------------------+
|A                  |
|        +----------+-----+
|        |C         |     |
|  +-----+----+     |     |
|  |B    |    |     |     |
|  |     +----+-----+-----+
|  |          |     |
+--+-----+----+-----+
   |#####|####|
   +-----+----+

The vertical bar in the middle there is to illustrate that the part would be returned as two rectangles, split at that location, due to the way the vertical strips were worked out.

I seriously hope I made myself understood here. I have some code that can help you with the implementation of each run through the lists of coordinates, but it's 01:21 past midnight here and I'm going to bed, but leave a comment if you wish to see some actual code for this.

Or it would be a great exercise to implement it yourself :)

Here's the link to the class containing the method in question: RangeExtensions.cs.

The method is the Slice method, just search the page for it. The type in question, Range, is basically a range from one value to another, so there's a bit of data construction and maintenance in addition to the above algorithm.

0人赞添加讨论(0) 举报

Bombasti

6楼-- · 2019-01-22 13:32

Here's a generic algorithm

is there any rectangle covering the whole model?
if yes you are done
if no are there any rectangles partially covering the model?
if yes
is the union of all the intersections of rectangle and model equal to the model
if 2. or 3. is no then the answer is no, otherwise it is yes

Now the question is how to do the above efficiently. The above can be done in a single loop over all polygons, so I think you are looking at O(n) time.

If you need to create efficient algorithm that will test multiple models, or if you must optimize for fastest answer possible (at the expense of preparing the data) then you are looking for a structure that will allow quick answer to question if a rectangle intersects (or contains) a rectangle.

If you use, for example kd-trees, I believe that this can be answered in O(log n) time - but the important variable in this algorithm is also the number of found rectangles k. You will end up with something like O(k + log n) and you will also need to do the union part to test if the model is covered.

My guess is that the union could be computed in O(k log k), so if k is small then you are looking at O(log n) and if k is comparable to n then it should be O(k log k).

Rectangles Covering

X=1

X=2

X=3

X=4

X=5 and X=6

Y=1

Y=2

Y=3

Y=4

Y=5

Y=6

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间