I have a question similar to this one: Determine if 2 lists have the same elements, regardless of order?
What is the best/quickest way to determine whether an unsorted list list1 is contained in a 'list of lists' myListOfLists, regardless of the order of the elements in list1? My attempt is wrapped up in the function doSomething(...), which I call many times:
def doSomething(myListOfLists, otherInputs):
    list1 = []
    ...  # do something here with `otherInputs`
    ...  # which gives `list1` some values
    # now only append `list1` to `myListOfLists` if it doesn't already exist
    # and if it does exist, remove it
    removeFromList = False
    for myList in myListOfLists:
        if sorted(list1) == sorted(myList):
            removeFromList = True
            break
    if removeFromList:
        # remove the stored match; `list1` itself may be in a different order
        myListOfLists.remove(myList)
    else:
        myListOfLists.append(list1)
    return myListOfLists
The problem with this is that I need to run the function doSomething(...) approximately 1.0e5 times. As myListOfLists gets bigger with every call of doSomething(...), this becomes massively time consuming.
EDIT:
Some clarification of the task. Let me give an example of the desired output here:
a = []
doSomething(a, [1,2,3])
>> a = [[1,2,3]]
Because [1,2,3] is not in a, it is appended to a.
doSomething(a, [3,4,5])
>> a = [[1,2,3], [3,4,5]]
Because [3,4,5] is not in a, it is appended to a.
doSomething(a, [1,2,3])
>> a = [[3,4,5]]
Because [1,2,3] is in a, it is removed from a.
EDIT:
All lists have the same length.
You can use sets here. This version checks whether every item of otherInputs appears somewhere across the sublists:
def doSomething(myListOfLists, otherInputs):
    s = set(otherInputs)  # create a set from otherInputs
    for item in myListOfLists:
        # remove the items common to `s` and the current sublist from `s`
        s -= s.intersection(item)
        # if `s` is empty, all items have been found; return True
        if not s:
            return True
    return not bool(s)
...
>>> doSomething([[1, 2, 7],[6, 5, 4], [10, 9, 10]], [7, 6, 8])
False
>>> doSomething([[1, 2, 7],[6, 5, 4], [10, 8, 10]], [7, 6, 8])
True
Update 1: Check whether any sublist contains exactly the same items as otherInputs:
def doSomething(myListOfLists, otherInputs):
    s = set(otherInputs)
    return any(set(item) == s for item in myListOfLists)
...
>>> doSomething([[6, 8, 7],[6, 5, 4], [10, 8, 10]], [7, 6, 8])
True
>>> doSomething([[1, 2, 7],[6, 5, 4], [10, 8, 10]], [7, 6, 8])
False
Update 2: Check whether otherInputs is a subset of any of the sublists:
def doSomething(myListOfLists, otherInputs):
    s = set(otherInputs)
    return any(s.issubset(item) for item in myListOfLists)
...
>>> doSomething([[6, 8, 7],[6, 5, 4], [10, 8, 10]], [7, 6, 8])
True
>>> doSomething([[6, 8, 7, 10],[6, 5, 4], [10, 8, 10]], [7, 6, 8])
True
>>> doSomething([[1, 2, 7],[6, 5, 4], [10, 8, 10]], [7, 6, 8])
False
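Use sets as dictionary keys (a frozenset, since a plain set is unhashable and cannot be a dict key):
def doSomething(myDictOfLists, otherInputs):
    list1 = []
    ...  # do something here with `otherInputs`
    ...  # which gives `list1` some values
    # now only add `list1` to `myDictOfLists` if it doesn't already exist
    # and if it does exist, remove it
    list1Set = frozenset(list1)  # hashable and order-insensitive; ignores duplicates
    if list1Set not in myDictOfLists:
        myDictOfLists[list1Set] = list1
    else:
        del myDictOfLists[list1Set]
    return myDictOfLists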
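A set-based key ignores repeated elements, so [1, 1, 2] and [1, 2, 2] would be treated as the same list. If duplicates matter, one option is to key on the sorted tuple instead, which keeps the average cost of each lookup constant. The following is a minimal sketch rather than a definitive implementation; the names toggle and seen are hypothetical, and it assumes the elements are sortable and hashable:

```python
def toggle(seen, list1):
    """Add list1 to `seen` if no reordering of it is stored; otherwise remove it."""
    # Canonical key: order-insensitive, hashable, and it keeps duplicates.
    key = tuple(sorted(list1))
    if key in seen:
        seen.remove(key)
    else:
        seen.add(key)
    return seen


seen = set()
toggle(seen, [1, 2, 3])   # not present yet, so it is added
toggle(seen, [3, 4, 5])   # not present yet, so it is added
toggle(seen, [3, 2, 1])   # a reordering of [1, 2, 3], so that entry is removed
print(seen)  # {(3, 4, 5)}
```

Sorting each incoming list costs O(k log k) for a list of length k, but the membership test no longer depends on how many lists are stored.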
If you sort each list before appending it to myListOfLists, you can use this:
if sorted(list1) in myListOfLists:
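As a rough sketch of how that check might fit the append-or-remove behaviour from the question (toggleSorted is a hypothetical name, and the sketch assumes every stored list was sorted before it was appended):

```python
def toggleSorted(myListOfLists, list1):
    """Append the sorted list1 if it is absent; otherwise remove the stored copy."""
    key = sorted(list1)  # canonical form; sorted() returns a new list
    if key in myListOfLists:
        myListOfLists.remove(key)
    else:
        myListOfLists.append(key)
    return myListOfLists


a = []
toggleSorted(a, [3, 1, 2])
toggleSorted(a, [3, 4, 5])
toggleSorted(a, [2, 3, 1])  # a reordering of [1, 2, 3], so that entry is removed
print(a)  # [[3, 4, 5]]
```

This sorts only the incoming list on each call, but the `in` test is still a linear scan over myListOfLists.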
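This algorithm appears to be slightly faster:
l1 = [3, 4, 1, 2, 3]
l2 = [4, 2, 3, 3, 1]
same = True
for i in l1:
    if i not in l2:
        same = False
        break
For 1000000 loops, this takes 1.25399184227 sec on my computer, whilst
same = sorted(l1) == sorted(l2)
takes 1.9238319397 sec. Note that the loop only checks that every element of l1 occurs in l2, so it can report a match for lists that differ in how often an element is repeated.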
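To see how the comparisons behave on lists with repeated elements, here is a small sketch. collections.Counter is offered as one further option and does not come from the answers above; the variable names are illustrative only:

```python
from collections import Counter

l1 = [1, 1, 2]
l2 = [1, 2, 2]

# Equivalent to the membership loop: is every element of l1 found in l2?
loop_same = all(i in l2 for i in l1)

# Order-insensitive comparisons that also respect how often each element occurs.
sorted_same = sorted(l1) == sorted(l2)
counter_same = Counter(l1) == Counter(l2)

print(loop_same, sorted_same, counter_same)  # True False False
```

Counter requires hashable elements but not sortable ones, whereas the sorted comparison requires the reverse.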