I have two lists of dictionaries which I need to compare:
search_list = [{'Chr':'chr1','St':'2345','End':'2456'},
{'Chr':'chr1','St':'3457','End':'4567'}]
and
database = [{'Chr':'chr1','St':'2348','End':'2348'},
{'Chr':'chr1','St':'190','End':'190'}]
Therefore each dictionary within the list looks like this:
{'Chr':'chr1','St':'2345','End':'2456'}
Where Chr = Chromosome, St = start coordinate, and End = end coordinate.
I need to identify the entries in the database which match on the Chr value and whose St value falls between the St and End values of an entry in search_list (the coordinates need to be compared as integers, rather than in their current string form). From what I can tell, a list comprehension is the best way to accomplish this. Looking at this question has been my conceptual starting point, but I'm a bit confused about how to proceed.
My pseudocode is:
matched = [var for var in search_list where search_list['Chr'] == database['Chr'] AND search_list['St'] >= database['St'] <= search_list['End']]
But obviously that's very rough! Any suggestions would be most welcome.
Update: I've tried KobiK's suggestion, which looks like it should work. But now I'm having trouble converting the values attached to the 'St' and 'End' keys from strings to integers in order to do the numeric comparison.
Here's what I've got:
matchedVars ={[record for record in database for lookup in search_list if
record['Chr'] == lookup['Chr'] if int(lookup['St']) <= int(record['St'])
<= int(lookup['End'])]}
But it throws this error:
File "snp_freq_V1-4.py", line 86, in
matchedVars ={[record for record in database for lookup in search_list if record['Chr'] == lookup['Chr'] if int(lookup['St']) <= int(record['St']) <= int(lookup['End'])]}
TypeError: string indices must be integers
Is there a better/more appropriate way to transform keys from strings to ints?
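On the error itself: `TypeError: string indices must be integers` is raised when `record` or `lookup` is a string rather than a dictionary. That happens, for example, if `database` is a single dict instead of a list of dicts, because iterating over a dict yields its keys. Separately, the outer `{...}` around the comprehension tries to build a set containing a list, which fails with `TypeError: unhashable type: 'list'`. A minimal sketch of converting the coordinates once, up front, assuming both inputs really are lists of dicts; `with_int_coords` is a hypothetical helper name, not something from the original script:

```python
def with_int_coords(records):
    """Return copies of the records with 'St' and 'End' converted to int."""
    converted = []
    for rec in records:
        if not isinstance(rec, dict):
            # A str here is what leads to "string indices must be integers".
            raise TypeError("expected a dict, got %r" % (rec,))
        new_rec = dict(rec)  # copy so the original record is left untouched
        new_rec['St'] = int(rec['St'])
        new_rec['End'] = int(rec['End'])
        converted.append(new_rec)
    return converted


# Sample data from the question.
search_list = [{'Chr': 'chr1', 'St': '2345', 'End': '2456'},
               {'Chr': 'chr1', 'St': '3457', 'End': '4567'}]
database = [{'Chr': 'chr1', 'St': '2348', 'End': '2348'},
            {'Chr': 'chr1', 'St': '190', 'End': '190'}]

search_ints = with_int_coords(search_list)
database_ints = with_int_coords(database)
print(database_ints[0])
```

With the values converted once, later comparisons can use `record['St']` directly without repeating `int()` in every condition.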
This is rough code, but how about this?
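As a rough sketch of one way to do this with plain nested loops (the loop structure is an assumption about the approach, and the data is the sample from the question):

```python
# Sample data from the question.
search_list = [{'Chr': 'chr1', 'St': '2345', 'End': '2456'},
               {'Chr': 'chr1', 'St': '3457', 'End': '4567'}]
database = [{'Chr': 'chr1', 'St': '2348', 'End': '2348'},
            {'Chr': 'chr1', 'St': '190', 'End': '190'}]

matched = []
for record in database:
    for lookup in search_list:
        same_chr = record['Chr'] == lookup['Chr']
        # Compare as integers; the stored values are strings.
        in_range = int(lookup['St']) <= int(record['St']) <= int(lookup['End'])
        if same_chr and in_range:
            matched.append(record)
            break  # stop after the first matching range to avoid duplicates

print(matched)
```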
Hope it helps.
You can use a list comprehension. This code iterates over the elements in database and checks each one against the criteria in search_list.
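A minimal sketch of such a comprehension, assuming the sample lists from the question and converting the coordinates with `int()` inline:

```python
# Sample data from the question.
search_list = [{'Chr': 'chr1', 'St': '2345', 'End': '2456'},
               {'Chr': 'chr1', 'St': '3457', 'End': '4567'}]
database = [{'Chr': 'chr1', 'St': '2348', 'End': '2348'},
            {'Chr': 'chr1', 'St': '190', 'End': '190'}]

# Keep each database record whose chromosome matches a search entry and
# whose start coordinate lies within that entry's [St, End] range.
matched = [
    record
    for record in database
    for lookup in search_list
    if record['Chr'] == lookup['Chr']
    and int(lookup['St']) <= int(record['St']) <= int(lookup['End'])
]

print(matched)
```

Note that a record is listed once per search range it falls into, so overlapping ranges in `search_list` can produce repeated entries.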
Output:
[{'Chr': 'chr1', 'End': '2348', 'St': '2348'}]
As you can see in the output, only
{'Chr': 'chr1', 'End': '2348', 'St': '2348'}
matches the criteria in search_list.