Using list comprehension to match values of dictio

Posted 2019-06-14 14:05

Question:

I have two lists of dictionaries which I need to compare:

search_list =[{'Chr':'chr1','St':'2345','End':'2456'},
              {'Chr':'chr1','St':'3457','End':'4567'}]

and

database = [{'Chr':'chr1','St':'2348','End':'2348'},
            {'Chr':'chr1','St':'190','End':'190'}]

Therefore each dictionary within the list looks like this:

{'Chr':'chr1','St':'2345','End':'2456'}

Where Chr = Chromosome, St = start coordinate, and End = end coordinate.

I need to identify the entries in the database which match on the Chr value and whose St value lies between the St and End values of a search_list entry (these need to be compared as integers, rather than in their current string form). From what I can tell, a list comprehension is the best way to accomplish this. Looking at this question has been my conceptual starting point, but I'm a bit confused about how to proceed.

My pseudocode is:

matched = [var for var in search_list where search_list['Chr'] == database['Chr'] AND search_list['St'] >= database['St'] <= search_list['End']]

But obviously that's very rough! Any suggestions would be most welcome.

Update: I've tried KobiK's suggestion, which looks like it should work, but I am now having trouble converting the values attached to the 'St' and 'End' keys from strings to integers in order to do the numeric comparison.

Here's what I've got:

matchedVars ={[record for record in database for lookup in search_list if 
record['Chr'] == lookup['Chr'] if int(lookup['St']) <= int(record['St']) 
<= int(lookup['End'])]}

But it throws this error:

File "snp_freq_V1-4.py", line 86, in

matchedVars ={[record for record in database for lookup in search_list if record['Chr'] == lookup['Chr'] if int(lookup['St']) <= int(record['St']) <= int(lookup['End'])]}

TypeError: string indices must be integers

Is there a better/more appropriate way to convert these values from strings to ints?

Answer 1:

You can use a list comprehension. This code iterates over the elements of the database list and checks each one against the criteria in search_list:

search_list =[{'Chr':'chr1','St':'2345','End':'2456'},{'Chr':'chr2','St':'3457','End':'4567'}]
database = [{'Chr':'chr1','St':'2348','End':'2348'},{'Chr':'chr2','St':'190','End':'190'}]

# Compare the coordinates as integers; comparing the raw strings would be lexicographic.
print([record for record in database for lookup in search_list
       if record['Chr'] == lookup['Chr']
       and int(lookup['St']) <= int(record['St']) <= int(lookup['End'])])

Output:

[{'Chr': 'chr1', 'St': '2348', 'End': '2348'}]

As the output shows, only {'Chr': 'chr1', 'St': '2348', 'End': '2348'} meets the criteria in search_list.
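
Regarding the TypeError in the question's update: the int() calls are unlikely to be the cause. That message is what Python raises when a str is indexed with a string key, so one plausible explanation (an assumption, since the code that builds the real lists is not shown) is that the elements being iterated over are strings rather than dictionaries, for example because database is a single dict instead of a list of dicts. Separately, the outer braces in {[...]} try to build a set containing a list, which fails because lists are unhashable. A minimal sketch reproducing both, with made-up data:

```python
search_list = [{'Chr': 'chr1', 'St': '2345', 'End': '2456'}]

# Hypothetical mistake: database is a single dict, not a list of dicts.
# Iterating over a dict yields its keys, which are strings.
database = {'Chr': 'chr1', 'St': '2348', 'End': '2348'}

try:
    [record for record in database for lookup in search_list
     if record['Chr'] == lookup['Chr']]
    first_error = None
except TypeError as exc:
    # Message starts with "string indices must be integers"
    first_error = str(exc)

# Wrapping a list in braces builds a set containing that list, which also fails.
try:
    {[1, 2, 3]}
    second_error = None
except TypeError as exc:
    # Message mentions an unhashable list
    second_error = str(exc)

print(first_error)
print(second_error)
```

If the first case applies, printing type(database[0]) (or the first item yielded by iterating over database) before the comprehension should show str instead of dict.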



Answer 2:

This is rough code, but how about this?

matched = []
for d in search_list:
  for db in database:
    if d["Chr"] == db["Chr"]:
      # Convert to int so the range check is numeric, not lexicographic.
      if int(d["St"]) <= int(db["St"]) <= int(d["End"]):
        matched.append(db)
print(matched)

Hope this helps.
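
On the follow-up about converting the coordinates: one option is to convert 'St' and 'End' to integers once, before matching, instead of calling int() inside every comparison. The following is a minimal sketch under that assumption; to_int_coords and find_matches are hypothetical helper names, not part of either answer above.

```python
def to_int_coords(records):
    """Return copies of the records with 'St' and 'End' converted to int."""
    return [{**r, 'St': int(r['St']), 'End': int(r['End'])} for r in records]


def find_matches(search_list, database):
    """Return database records whose 'St' lies inside any search interval
    on the same 'Chr'."""
    return [record for record in database
            if any(record['Chr'] == lookup['Chr']
                   and lookup['St'] <= record['St'] <= lookup['End']
                   for lookup in search_list)]


# Data taken from the question, converted up front.
search_list = to_int_coords([{'Chr': 'chr1', 'St': '2345', 'End': '2456'},
                             {'Chr': 'chr1', 'St': '3457', 'End': '4567'}])
database = to_int_coords([{'Chr': 'chr1', 'St': '2348', 'End': '2348'},
                          {'Chr': 'chr1', 'St': '190', 'End': '190'}])

matched = find_matches(search_list, database)
print(matched)
```

Because this version uses any(), a database record is returned at most once even if it falls inside several overlapping search intervals, whereas the nested loops above would append it once per matching interval. Which behaviour is wanted depends on the use case.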