Matching two string lists that partially match int

Posted 2019-07-25 14:00

Question:

I am trying to match a list of strings (50 strings) against a second list (5 strings) whose items are parts of some of the strings in the first list. I will post the complete code below for context, but here is a short example:

List1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78']

List2 = ['abc', 'ijk']

I want to return a list of the strings from List1 that have matches in List2. I tried set.intersection, but it seems you can't do partial matches with it (or at least I can't with my limited abilities). I also tried any(), but I could not get it to work with my lists. My book says I should use a nested loop, but I don't know which function to use or how to apply it to lists.

Here is the complete code as reference:

#!/usr/bin/env python3.4
# -*- coding: utf-8 -*-

import random

def generateSequences(n):
    # Build n random DNA sequences, each 50 bases long
    L = []
    dna = ["A", "G", "C", "T"]
    for _ in range(int(n)):
        random_sequence = ''.join(random.choice(dna) for _ in range(50))
        L.append(random_sequence)

    return L

def generatePrefixes (p, L):

    S = [x[:20] for x in L]
    D = []
    for i in range(p):
        randomPrefix = random.choice(S)
        D.append(randomPrefix)

    return S, D

if __name__ == "__main__":
    L = generateSequences(15)
    print (L)
    S, D = generatePrefixes(5, L)
    print (S)
    print (D)

Edit: Since this was flagged as a possible duplicate, I want to point out that this post is about Python while the other one is about R. I don't know R or whether there are any similarities, but at first glance it doesn't look like it to me. Sorry for the inconvenience.

Answer 1:

Using a nested for loop:

def intersect(List1, List2):
    # empty list for values that match
    ret = []
    for i in List2:
        for j in List1:
            if i in j:
                ret.append(j)
    return ret

List1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78']
List2 = ['abc', 'ijk']
print(intersect(List1, List2))


Answer 2:

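This may not be the most efficient way, but it works:

List1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78']
List2 = ['abc', 'ijk']

matches = []

for seq_1 in List1:
    for seq_2 in List2:
        # match if either string is contained in the other
        if seq_1 in seq_2 or seq_2 in seq_1:
            matches.append(seq_1)
            break  # one match is enough; avoids appending seq_1 twice

print(matches)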


Answer 3:

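You can just compare the strings directly. I collect the items of list1 that contain an item of list2 and then remove any duplicates from the result list. This basically does what you want:

list1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78']
list2 = ['abc', 'ijk']

f = []
for i in list1:
    for j in list2:
        if j in i:
            f.append(i)
result = list(set(f))  # set() drops duplicates
print(result)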
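
One thing to keep in mind: list(set(f)) does not preserve the order of list1. If the order matters, something along these lines might work instead. This is only a sketch, assuming Python 3.7+ (where dicts keep insertion order); the function name partial_matches is made up for illustration:

```python
def partial_matches(list1, list2):
    """Items of list1 that contain any item of list2, in list1 order, no duplicates."""
    found = []
    for i in list1:
        for j in list2:
            if j in i:
                found.append(i)
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    return list(dict.fromkeys(found))


list1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78', 'abcd12']
list2 = ['abc', 'bcd', 'ijk']
print(partial_matches(list1, list2))  # ['abcd12', 'ijkl56']
```

Here 'abcd12' is matched by both 'abc' and 'bcd' and also appears twice in list1, but it ends up in the result only once.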


Answer 4:

Try

[l1 for l1 in List1 if any([l2 in l1 for l2 in List2])]
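
any() also accepts a generator expression, so the inner square brackets can be dropped and the check stops at the first hit. And since the short strings in the question are prefixes (they are built with x[:20]), str.startswith with a tuple may be an option too; note that it only matches at the start of the string, unlike `in`. A rough sketch, where both helper names are hypothetical:

```python
def match_substrings(strings, parts):
    # Keep each string that contains at least one of the parts anywhere.
    return [s for s in strings if any(p in s for p in parts)]


def match_prefixes(strings, prefixes):
    # str.startswith accepts a tuple and checks every prefix in it.
    return [s for s in strings if s.startswith(tuple(prefixes))]


List1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78']
List2 = ['abc', 'ijk']
print(match_substrings(List1, List2))  # ['abcd12', 'ijkl56']
print(match_prefixes(List1, List2))    # ['abcd12', 'ijkl56']
```

Both versions keep the order of List1 and add each string at most once per occurrence in List1, even if it matches several items of List2.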