I have two lists:
big_list = [2, 1, 2, 3, 1, 2, 4]
sub_list = [1, 2]
I want to remove all sub_list occurrences in big_list.
result should be [2, 3, 4]
For strings you could use this:
'2123124'.replace('12', '')
But AFAIK this does not work for lists.
This is not a duplicate of Removing a sublist from a list since I want to remove all sub-lists from the big-list. In the other question the result should be [5,6,7,1,2,3,4]
.
Update: For simplicity I took integers in this example. But list items could be arbitrary objects.
Update2:
if big_list = [1, 2, 1, 2, 1]
and sub_list = [1, 2, 1]
,
I want the result to be [2, 1]
(like '12121'.replace('121', '')
)
Update3:
I don't like copy+pasting source code from StackOverflow into my code. That's why I created second question at software-recommendations: https://softwarerecs.stackexchange.com/questions/51273/library-to-remove-every-occurrence-of-sub-list-from-list-python
Update4: if you know a library to make this one method call, please write it as answer, since this is my preferred solution.
The test should pass this test:
def test_remove_sub_list(self):
self.assertEqual([1, 2, 3], remove_sub_list([1, 2, 3], []))
self.assertEqual([1, 2, 3], remove_sub_list([1, 2, 3], [4]))
self.assertEqual([1, 3], remove_sub_list([1, 2, 3], [2]))
self.assertEqual([1, 2], remove_sub_list([1, 1, 2, 2], [1, 2]))
self.assertEquals([2, 1], remove_sub_list([1, 2, 1, 2, 1], [1, 2, 1]))
self.assertEqual([], remove_sub_list([1, 2, 1, 2, 1, 2], [1, 2]))
Try
del
andslicing
. The worst time complexity isO(N^2)
.result:
How about this:
Performance:
Update
My first solution had a bug. Was able to fix it (updated my code above) but the method looks way more complicated now. In terms of performance it still does better than the solution from Mad Physicist on my local machine.
Kinda different approach in Python 2.x!
Note that you need to delete them in reverse order so that you don't throw off the subsequent indexes.
In Python3, signature of locate() will differ.
Update: The
more_itertools
library has releasedmore_itertool.replace
, a tool that solves this particular problem (see Option 3).First, here are some other options that work on generic iterables (lists, strings, iterators, etc.):
Code
Option 1 - without libraries:
Option 2 - with
more_itertools
Demo
Option 3 - with
more_itertools.replace
:Demo
Details
In all of these examples, we are scanning the main sequence with smaller window slices. We yield whatever is not found in the slice and skip whatever is in the slice.
Option 1 - without libraries
Iterate an enumerated sequence and evaluate slices of size
n
(the length of the sub-sequence). If the upcoming slice equals the sub-sequence, resetskip
and yield the item. Otherwise, iterate past it.skip
tracks how many times to advance the loop, e.g.sublist
is of sizen=2
, so it skips twice per match.Note, you can convert this option to work with sequences alone by removing the first two tuple assignments and replacing the
iterable
parameter withseq
, e.g.def remove(seq, subsequence):
.Option 2 - with
more_itertools
Indices are located for every matching sub-sequence in an iterable. While iterating an enumerated iterator, if an index is found in
indices
, the remaining sub-sequence is skipped by consuming the nextn-1
elements from the iterator. Otherwise, an item is yielded.Install this library via
> pip install more_itertools
.Option 3 - with
more_itertools.replace
:This tool replaces a sub-sequence of items defined in a predicate with substitute values. To remove items, we substitute an empty container, e.g.
substitutes=[]
. The length of replaced items is specified by thewindow_size
parameter (this value is equal to the length of the sub-sequence).What you are trying to achieve can be done by converting it into list of strings and after replacing again convert it to integer type.
In a single line you can do it like this
Input
Output
Input
Ouput
A recursive approach:
This outputs: