Python, compute list difference

2019-01-02 17:22发布

In Python, what is the best way to compute the difference between two lists?

example

A = [1,2,3,4]
B = [2,5]

A - B = [1,3,4]
B - A = [5]

标签: python list
12条回答
永恒的永恒
2楼-- · 2019-01-02 18:03

Python 2.7.3 (default, Feb 27 2014, 19:58:35) - IPython 1.1.0 - timeit: (github gist)

def diff(a, b):
  b = set(b)
  return [aa for aa in a if aa not in b]

def set_diff(a, b):
  return list(set(a) - set(b))

diff_lamb_hension = lambda l1,l2: [x for x in l1 if x not in l2]

diff_lamb_filter = lambda l1,l2: filter(lambda x: x not in l2, l1)

from difflib import SequenceMatcher
def squeezer(a, b):
  squeeze = SequenceMatcher(None, a, b)
  return reduce(lambda p,q: p+q, map(
    lambda t: squeeze.a[t[1]:t[2]],
      filter(lambda x:x[0]!='equal',
        squeeze.get_opcodes())))

Results:

# Small
a = range(10)
b = range(10/2)

timeit[diff(a, b)]
100000 loops, best of 3: 1.97 µs per loop

timeit[set_diff(a, b)]
100000 loops, best of 3: 2.71 µs per loop

timeit[diff_lamb_hension(a, b)]
100000 loops, best of 3: 2.1 µs per loop

timeit[diff_lamb_filter(a, b)]
100000 loops, best of 3: 3.58 µs per loop

timeit[squeezer(a, b)]
10000 loops, best of 3: 36 µs per loop

# Medium
a = range(10**4)
b = range(10**4/2)

timeit[diff(a, b)]
1000 loops, best of 3: 1.17 ms per loop

timeit[set_diff(a, b)]
1000 loops, best of 3: 1.27 ms per loop

timeit[diff_lamb_hension(a, b)]
1 loops, best of 3: 736 ms per loop

timeit[diff_lamb_filter(a, b)]
1 loops, best of 3: 732 ms per loop

timeit[squeezer(a, b)]
100 loops, best of 3: 12.8 ms per loop

# Big
a = xrange(10**7)
b = xrange(10**7/2)

timeit[diff(a, b)]
1 loops, best of 3: 1.74 s per loop

timeit[set_diff(a, b)]
1 loops, best of 3: 2.57 s per loop

timeit[diff_lamb_filter(a, b)]
# too long to wait for

timeit[diff_lamb_filter(a, b)]
# too long to wait for

timeit[diff_lamb_filter(a, b)]
# TypeError: sequence index must be integer, not 'slice'

@roman-bodnarchuk list comprehensions function def diff(a, b) seems to be faster.

查看更多
柔情千种
3楼-- · 2019-01-02 18:05

In case of a list of dictionaries, the full list comprehension solution works while the set solution raises

TypeError: unhashable type: 'dict'

Test Case

def diff(a, b):
    return [aa for aa in a if aa not in b]

d1 = {"a":1, "b":1}
d2 = {"a":2, "b":2}
d3 = {"a":3, "b":3}

>>> diff([d1, d2, d3], [d2, d3])
[{'a': 1, 'b': 1}]
>>> diff([d1, d2, d3], [d1])
[{'a': 2, 'b': 2}, {'a': 3, 'b': 3}]
查看更多
冷夜・残月
4楼-- · 2019-01-02 18:07
A = [1,2,3,4]
B = [2,5]

#A - B
x = list(set(A) - set(B))
#B - A 
y = list(set(B) - set(A))

print x
print y 
查看更多
与君花间醉酒
5楼-- · 2019-01-02 18:07

most simple way,

use set().difference(set())

list_a = [1,2,3]
list_b = [2,3]
print set(list_a).difference(set(list_b))

answer is set([1])

查看更多
无与为乐者.
6楼-- · 2019-01-02 18:08

You can do a

list(set(A)-set(B))

and

list(set(B)-set(A))
查看更多
墨雨无痕
7楼-- · 2019-01-02 18:13

One liner:

diff = lambda l1,l2: [x for x in l1 if x not in l2]
diff(A,B)
diff(B,A)

Or:

diff = lambda l1,l2: filter(lambda x: x not in l2, l1)
diff(A,B)
diff(B,A)
查看更多
登录 后发表回答