How to find range overlap in python?

2019-01-14 21:18发布

问题:

What is the best way in Python to determine what values in two ranges overlap?

For example:

x = range(1,10)
y = range(8,20)

(The answer I am looking for would be the integers 8 and 9.)

Given a range, x, what is the best way to iterate through another range, y and output all values that are shared by both ranges? Thanks in advance for the help.

EDIT:

As a follow-up, I realized that I also need to know if x does or does not overlap y. I am looking for a way to iterate through a list of ranges and and do a number of additional things with range that overlap. Is there a simple True/False statement to accomplish this?

回答1:

Try with set intersection:

>>> x = range(1,10)
>>> y = range(8,20)
>>> xs = set(x)
>>> xs.intersection(y)
set([8, 9])

Note that intersection accepts any iterable as an argument (y is not required to be converted to a set for the operation). There is an operator equivalent to the intersection method: & but, in this case, it requires both arguments to be sets.



回答2:

If the step is always +1 (which is the default for range) the following should be more efficient than converting each list to a set or iterating over either list:

range(max(x[0], y[0]), min(x[-1], y[-1])+1)


回答3:

You can use sets for that, but be aware that set(list) removes all duplicate entries from the list:

>>> x = range(1,10)
>>> y = range(8,20)
>>> list(set(x) & set(y))
[8, 9]


回答4:

One option is to just use list comprehension like:

x = range(1,10) 
y = range(8,20) 

z = [i for i in x if i in y]
print z


回答5:

For "if x does or does not overlap y" :

for a,b,c,d in ((1,10,10,14),
                (1,10,9,14),
                (1,10,4,14),
                (1,10,4,10),
                (1,10,4,9),
                (1,10,4,7),
                (1,10,1,7),
                (1,10,-3,7),
                (1,10,-3,2),
                (1,10,-3,1),
                (1,10,-11,-5)):
    x = range(a,b)
    y = range(c,d)
    print 'x==',x
    print 'y==',y
    b = not ((x[-1]<y[0]) or (y[-1]<x[0]))
    print '    x %s y' % ("does not overlap","   OVERLAPS  ")[b]
    print

result

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [10, 11, 12, 13]
    x does not overlap y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [9, 10, 11, 12, 13]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8, 9]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [1, 2, 3, 4, 5, 6]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0, 1, 2, 3, 4, 5, 6]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0, 1]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0]
    x does not overlap y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-11, -10, -9, -8, -7, -6]
    x does not overlap y

Edit 1

Speeds comparison:

from time import clock

x = range(-12,15)
y = range(-5,3)
te = clock()
for i in xrange(100000):
    w = set(x).intersection(y)
print '                     set(x).intersection(y)',clock()-te


te = clock()
for i in xrange(100000):
    w = range(max(x[0], y[0]), min(x[-1], y[-1])+1)
print 'range(max(x[0], y[0]), min(x[-1], y[-1])+1)',clock()-te

result

                     set(x).intersection(y) 0.951059981087
range(max(x[0], y[0]), min(x[-1], y[-1])+1) 0.377761978129

The ratio of these execution's times is 2.5



回答6:

If you want to find the overlap of ranges with arbitrary steps you can use my package https://github.com/avnr/rangeplus which provides a Range() class compatible with Python range(), plus some goodies including intersections:

>>> from rangeplus import Range
>>> Range(1, 100, 3) & Range(2, 100, 4)
Range(10, 100, 12)
>>> Range(200, -200, -7) & range(5, 80, 2)  # can intersect with Python range() too
Range(67, 4, -14)

Range() can also be unbound (when stop is None the Range goes on to +/-infinity):

>>> Range(1, None, 3) & Range(3, None, 4)
Range(7, None, 12)
>>> Range(253, None, -3) & Range(208, 310, 5)
Range(253, 207, -15)

The intersection is computed, not iterated, which makes the efficiency of the implementation independent of the length of the Range().



回答7:

If you looking for the overlap between two real-valued bounded intervals, then this is quite nice:

def overlap(start1, end1, start2, end2):
    """how much does the range (start1, end1) overlap with (start2, end2)"""
    return max(max((end2-start1), 0) - max((end2-end1), 0) - max((start2-start1), 0), 0)

I couldn't find this online anywhere so I came up with this and I'm posting here.



回答8:

Assuming you are working exclusively with ranges, with a step of 1, you can do it quickly with math.

def range_intersect(range_x,range_y):
    if len(range_x) == 0 or len(range_y) == 0:
        return []
    # find the endpoints
    x = (range_x[0], range_x[-1]) # from the first element to the last, inclusive
    y = (range_y[0], range_y[-1])
    # ensure min is before max
    # this can be excluded if the ranges must always be increasing
    x = tuple(sorted(x))
    y = tuple(sorted(y))
    # the range of the intersection is guaranteed to be from the maximum of the min values to the minimum of the max values, inclusive
    z = (max(x[0],y[0]),min(x[1],y[1]))
    if z[0] < z[1]:
        return range(z[0], z[1] + 1) # to make this an inclusive range
    else:
        return [] # no intersection

On a pair of ranges each with over 10^7 elements, this took under a second, independent of how many elements overlapped. I tried with 10^8 or so elements, but my computer froze for a while. I doubt you'd be working with lists that long.



回答9:

This is the answer for the simple range with step=1 case (99% of the time), which can be 2500x faster as shown in the benchmark when comparing long ranges using sets (when you are just interested in knowing if there is overlap):

x = range(1,10) 
y = range(8,20)

def range_overlapping(x, y):
    if x.start == x.stop or y.start == y.stop:
        return False
    return ((x.start < y.stop  and x.stop > y.start) or
            (x.stop  > y.start and y.stop > x.start))

>>> range_overlapping(x, y)
True

To find the overlapping values:

def overlap(x, y):
    if not range_overlapping(x, y):
        return set()
    return set(range(max(x.start, y.start), min(x.stop, y.stop)+1))

Visual help:

|  |           |    |
  |  |       |    |

Benchmark:

x = range(1,10)
y = range(8,20)

In [151]: %timeit set(x).intersection(y)
2.74 µs ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [152]: %timeit range_overlapping(x, y)
1.4 µs ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Conclusion: even for small ranges, it is twice as fast.

x = range(1,10000)
y = range(50000, 500000)

In [155]: %timeit set(x).intersection(y)
43.1 ms ± 158 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [156]: %timeit range_overlapping(x, y)
1.75 µs ± 88.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Conclusion: you want to use the range_overlapping function in this case as it is 2500x faster (my personal record in speedup)



标签: python range