I have a list of long strings and I'd like to get the indexes of the list elements that contain any of the substrings in another list. Checking whether list items contain a single string is easy to do with a list comprehension, as in this question:
my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
thing_to_find = "abc"
matching = [i for i, x in enumerate(my_list) if thing_to_find in x]
However, I'd like to check not only if "abc" is in x, but if any strings in another list are in the list, like so:
my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
things_to_find = ['abc', 'def']
This obviously doesn't work (but it would be really cool if it did):
matching = [i for i, x in enumerate(my_list) if things_to_find in x]
I can find the list indexes if I run commands individually, but it's tedious and horrible:
print([i for i, x in enumerate(my_list) if 'abc' in x])
# [0, 3]
print([i for i, x in enumerate(my_list) if 'def' in x])
# [1]
What's the best way to find the indexes of all instances where elements from one list are found in another list?
You are looking for the any() function here: it returns True as soon as one of the strings in things_to_find is a substring of x.
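A minimal sketch of the any() approach, reusing the my_list and things_to_find values from the question (the loop variable name thing is just illustrative):

```python
my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
things_to_find = ['abc', 'def']

# Keep index i when at least one search string is a substring of x.
# any() stops at the first search string that matches.
matching = [i for i, x in enumerate(my_list)
            if any(thing in x for thing in things_to_find)]

print(matching)  # [0, 1, 3]
```

Each index appears at most once, even when an element contains several of the search strings.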
You are close:
which gives [0, 1, 3]. You need to iterate through the things_to_find list as well, and see if the keyword is in x.
Might be a little slow, but why not try:
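One way to spell out "iterate through things_to_find as well" is a plain nested loop. This is only a sketch of the idea, not necessarily what the answers above originally showed, and the helper name find_matching_indexes is made up for illustration:

```python
def find_matching_indexes(strings, substrings):
    """Return the indexes of strings that contain at least one of substrings."""
    indexes = []
    for i, x in enumerate(strings):
        for keyword in substrings:
            if keyword in x:
                indexes.append(i)
                break  # record each index once, even if several keywords match
    return indexes


my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
things_to_find = ['abc', 'def']

print(find_matching_indexes(my_list, things_to_find))  # [0, 1, 3]
```

Without the break, an element that contains more than one keyword would have its index added once per matching keyword.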
Build a regex, then test each list element against that:
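A rough sketch of what building and testing such a regex could look like with the standard re module. Joining the strings with "|" and escaping them with re.escape() is an assumption about how the pattern is meant to be built:

```python
import re

my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
things_to_find = ['abc', 'def']

# re.escape() makes each search string match literally, even if it
# contains regex metacharacters such as "." or "+".
pattern = re.compile('|'.join(re.escape(thing) for thing in things_to_find))

# search() scans the whole string, so this behaves like a substring test.
hits = [bool(pattern.search(x)) for x in my_list]

print(hits)  # [True, True, False, True]
```

One caveat: if things_to_find is empty, the joined pattern is the empty string, which matches every element, so that case probably needs its own check.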
Don't rebuild the regex every time you do the search - only rebuild when things_to_find changes.
I suspect you don't want the indices, but the elements:
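Getting the elements rather than the indices might look roughly like this, assuming a pattern compiled once from the escaped search strings:

```python
import re

my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
things_to_find = ['abc', 'def']

# Compile once; reuse the compiled pattern for every search.
pattern = re.compile('|'.join(map(re.escape, things_to_find)))

# Keep the elements themselves when the pattern is found anywhere in them.
matching_elements = [x for x in my_list if pattern.search(x)]

print(matching_elements)  # ['abc-123', 'def-456', 'abc-456']
```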
Or, if you really do want the indices:
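For the indices, the same idea works with enumerate(). In this sketch the helper name indexes_matching is hypothetical; it takes an already compiled pattern so the regex is not rebuilt on every call:

```python
import re


def indexes_matching(pattern, strings):
    """Return the indexes of strings in which the compiled pattern is found."""
    return [i for i, x in enumerate(strings) if pattern.search(x)]


my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
things_to_find = ['abc', 'def']

# Build the pattern once; rebuild it only when things_to_find changes.
pattern = re.compile('|'.join(map(re.escape, things_to_find)))

print(indexes_matching(pattern, my_list))  # [0, 1, 3]
```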
This will likely perform better than an any(in) solution (which is quadratic, since every list element is tested against every search string) for large things_to_find lists, but will be overkill for short lists. You'll also see more of a gain where the things in things_to_find are similar; and less of a gain if you can sort things_to_find such that more likely matches occur first, and if matches are likely.
Maybe something like?:
Output: