Testing if a list contains another list with Pytho

2019-01-04 02:53发布

How can I test if a list contains another list (ie. it's a contiguous subsequence). Say there was a function called contains:

contains([1,2], [-1, 0, 1, 2]) # Returns [2, 3] (contains returns [start, end])
contains([1,3], [-1, 0, 1, 2]) # Returns False
contains([1, 2], [[1, 2], 3]) # Returns False
contains([[1, 2]], [[1, 2], 3]) # Returns [0, 0]

Edit:

contains([2, 1], [-1, 0, 1, 2]) # Returns False
contains([-1, 1, 2], [-1, 0, 1, 2]) # Returns False
contains([0, 1, 2], [-1, 0, 1, 2]) # Returns [1, 3]

13条回答
Luminary・发光体
2楼-- · 2019-01-04 03:26

If we refine the problem talking about testing if a list contains another list with as a sequence, the answer could be the next one-liner:

def contains(subseq, inseq):
    return any(inseq[pos:pos + len(subseq)] == subseq for pos in range(0, len(inseq) - len(subseq) + 1))

Here unit tests I used to tune up this one-liner:

https://gist.github.com/anonymous/6910a85b4978daee137f

查看更多
甜甜的少女心
3楼-- · 2019-01-04 03:26

Here is my answer. This function will help you to find out whether B is a sub-list of A. Time complexity is O(n).

`def does_A_contain_B(A, B): #remember now A is the larger list
    b_size = len(B)
    for a_index in range(0, len(A)):
        if A[a_index : a_index+b_size]==B:
            return True
    else:
        return False`
查看更多
SAY GOODBYE
4楼-- · 2019-01-04 03:27

Here is my version:

def contains(small, big):
    for i in xrange(len(big)-len(small)+1):
        for j in xrange(len(small)):
            if big[i+j] != small[j]:
                break
        else:
            return i, i+len(small)
    return False

It returns a tuple of (start, end+1) since I think that is more pythonic, as Andrew Jaffe points out in his comment. It does not slice any sublists so should be reasonably efficient.

One point of interest for newbies is that it uses the else clause on the for statement - this is not something I use very often but can be invaluable in situations like this.

This is identical to finding substrings in a string, so for large lists it may be more efficient to implement something like the Boyer-Moore algorithm.

查看更多
放我归山
5楼-- · 2019-01-04 03:29

Smallest code:

def contains(a,b):
    str(a)[1:-1].find(str(b)[1:-1])>=0
查看更多
干净又极端
6楼-- · 2019-01-04 03:31

I tried to make this as efficient as possible.

It uses a generator; those unfamiliar with these beasts are advised to check out their documentation and that of yield expressions.

Basically it creates a generator of values from the subsequence that can be reset by sending it a true value. If the generator is reset, it starts yielding again from the beginning of sub.

Then it just compares successive values of sequence with the generator yields, resetting the generator if they don't match.

When the generator runs out of values, i.e. reaches the end of sub without being reset, that means that we've found our match.

Since it works for any sequence, you can even use it on strings, in which case it behaves similarly to str.find, except that it returns False instead of -1.

As a further note: I think that the second value of the returned tuple should, in keeping with Python standards, normally be one higher. i.e. "string"[0:2] == "st". But the spec says otherwise, so that's how this works.

It depends on if this is meant to be a general-purpose routine or if it's implementing some specific goal; in the latter case it might be better to implement a general-purpose routine and then wrap it in a function which twiddles the return value to suit the spec.

def reiterator(sub):
    """Yield elements of a sequence, resetting if sent ``True``."""
    it = iter(sub)
    while True:
        if (yield it.next()):
            it = iter(sub)

def find_in_sequence(sub, sequence):
    """Find a subsequence in a sequence.

    >>> find_in_sequence([2, 1], [-1, 0, 1, 2])
    False
    >>> find_in_sequence([-1, 1, 2], [-1, 0, 1, 2])
    False
    >>> find_in_sequence([0, 1, 2], [-1, 0, 1, 2])
    (1, 3)
    >>> find_in_sequence("subsequence",
    ...                  "This sequence contains a subsequence.")
    (25, 35)
    >>> find_in_sequence("subsequence", "This one doesn't.")
    False

    """
    start = None
    sub_items = reiterator(sub)
    sub_item = sub_items.next()
    for index, item in enumerate(sequence):
        if item == sub_item:
            if start is None: start = index
        else:
            start = None
        try:
            sub_item = sub_items.send(start is None)
        except StopIteration:
            # If the subsequence is depleted, we win!
            return (start, index)
    return False
查看更多
迷人小祖宗
7楼-- · 2019-01-04 03:33

After OP's edit:

def contains(small, big):
    for i in xrange(1 + len(big) - len(small)):
        if small == big[i:i+len(small)]:
            return i, i + len(small) - 1
    return False
查看更多
登录 后发表回答