Given the following list:
DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']
I want to filter strings longer than 3 characters. I achieve this with the following code:
With for loop:
long_dna = []
for element in DNA_list:
    length = len(element)
    if length > 3:
        long_dna.append(element)
print(long_dna)
But I want my code to be more general, so that I can later filter strings by any length threshold. So I use a function and a for loop:
def get_long(dna_seq, threshold):
    return len(dna_seq) > threshold

long_dna_loop2 = []
for element in DNA_list:
    if get_long(element, 3) is True:
        long_dna_loop2.append(element)
print(long_dna_loop2)
I want to achieve the same generality using filter(), but I cannot. If I use the above function get_long(), I simply cannot pass arguments to it when I use it with filter(). Is it just not possible, or is there a way around it?
My code with filter() for the specific case:
def is_long(dna):
    return len(dna) > 3
long_dna_filter = filter(is_long, DNA_list)
What you are trying to do is known as partial function application: you have a function with multiple arguments (in this case, 2) and want to derive from it a function with one or more arguments fixed, which you can then pass to filter.

Some languages (especially functional ones) have this functionality "built in". In Python, you can use lambdas to do this (as others have shown), or you can use the functools library, in particular functools.partial.
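As a minimal sketch of the two options just mentioned, reusing get_long and DNA_list from the question: the names longer_than_3, long_dna_partial and long_dna_lambda are made up for illustration, and the list() wrapper is only there so the result is a list on Python 3 as well, where filter returns an iterator.

```python
from functools import partial

DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']

def get_long(dna_seq, threshold):
    return len(dna_seq) > threshold

# Fix the threshold by keyword; filter() then supplies dna_seq positionally.
longer_than_3 = partial(get_long, threshold=3)  # hypothetical name
long_dna_partial = list(filter(longer_than_3, DNA_list))

# The lambda version does the same thing, here with a threshold of 4.
long_dna_lambda = list(filter(lambda dna: get_long(dna, 4), DNA_list))

print(long_dna_partial)  # ['ATAT', 'GTGTACGT', 'AAAAGGTT']
print(long_dna_lambda)   # ['GTGTACGT', 'AAAAGGTT']
```

Note that every string in DNA_list is longer than 3 characters, so a threshold of 3 keeps all of them; the threshold of 4 is included only to show the filter actually dropping an element.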
You can make is_long return a function that accepts dna, and then pass the returned function to filter.

Note: Don't use the is operator to compare booleans or numbers. Instead, rely on the truthiness of the data as much as possible. So, in your case, the second version could have tested the result of get_long(element, 3) directly, without is True.

Quoting the programming recommendations in PEP-8: "Don't compare boolean values to True or False using ==."
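The original code for this answer is not shown here, so the following is only a sketch of what the closure approach and the truthiness-based loop might look like. The threshold parameter on is_long and the inner name check are assumptions made for illustration; get_long and DNA_list come from the question.

```python
DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']

def is_long(threshold):
    # Return a one-argument function that remembers threshold (a closure).
    def check(dna):  # hypothetical inner name
        return len(dna) > threshold
    return check

# is_long(3) is called once and yields the predicate that filter() needs.
# list() keeps the result a list on Python 3, where filter returns an iterator.
long_dna_filter = list(filter(is_long(3), DNA_list))

def get_long(dna_seq, threshold):
    return len(dna_seq) > threshold

# The question's second version, relying on truthiness instead of "is True".
long_dna_loop2 = []
for element in DNA_list:
    if get_long(element, 3):
        long_dna_loop2.append(element)

print(long_dna_filter)
print(long_dna_loop2)
```

Both lists contain the same elements, since the closure and the plain function apply the same length test.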