Given the following list:
DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']
I want to keep only the strings longer than 3 characters. I achieve this with the following code:
With a for loop:
long_dna = []
for element in DNA_list:
    if len(element) > 3:
        long_dna.append(element)
print(long_dna)
But I want my code to be more general, so that I can later filter by any length. For that I use a function and a for loop:
def get_long(dna_seq, threshold):
    return len(dna_seq) > threshold

long_dna_loop2 = []
for element in DNA_list:
    if get_long(element, 3):
        long_dna_loop2.append(element)
print(long_dna_loop2)
I want to achieve the same generality using filter(), but I cannot achieve this. If I use the above function get_long(), I simply cannot pass arguments to it when I use it with filter(). Is it just not possible, or is there a way around it?
My code with filter() for the specific case:
def is_long(dna):
    return len(dna) > 3

long_dna_filter = filter(is_long, DNA_list)
Use lambda to provide the threshold: wrap get_long() in a lambda that supplies the threshold argument, and pass that lambda to filter().

Here are a couple more ways using lambda. The first one uses a default keyword argument to hold the desired length. The second simply embeds the desired length in the lambda body.
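A minimal sketch of the lambda approaches described above. The parameter names seq and n and the result variable names are assumptions made for illustration, and filter() is wrapped in list() so the results print the same way under Python 3.

```python
DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']

def get_long(dna_seq, threshold):
    return len(dna_seq) > threshold

# Wrapping get_long(): the lambda supplies the threshold argument.
long_dna_wrapped = list(filter(lambda seq: get_long(seq, 3), DNA_list))

# First example: a default keyword argument (n, an assumed name) holds the length.
long_dna_default = list(filter(lambda seq, n=5: len(seq) > n, DNA_list))

# Second example: the length is embedded directly in the lambda body.
long_dna_literal = list(filter(lambda seq: len(seq) > 5, DNA_list))

print(long_dna_wrapped)   # ['ATAT', 'GTGTACGT', 'AAAAGGTT']
print(long_dna_default)   # ['GTGTACGT', 'AAAAGGTT']
print(long_dna_literal)   # ['GTGTACGT', 'AAAAGGTT']
```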
In the first example you could also do:
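One possible reading of this, sketched under the assumption that the first example is a lambda with a default keyword argument: the default can be taken from a variable rather than a literal. The names min_length, n and is_long are hypothetical.

```python
DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']

min_length = 5  # hypothetical variable holding the desired length

# The default value is evaluated once, when the lambda is created,
# so later changes to min_length do not affect this predicate.
is_long = lambda seq, n=min_length: len(seq) > n
min_length = 100  # has no effect on is_long

long_dna = list(filter(is_long, DNA_list))
print(long_dna)  # ['GTGTACGT', 'AAAAGGTT']
```

Binding the value as a default freezes it at definition time, which may or may not be what you want if the variable changes later.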
Similarly, in the second example you could replace the literal 5 with a local (or global) variable.

Do you need to use filter()? Why not use a more Pythonic list comprehension?
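A minimal sketch of the list comprehension, with threshold as an illustrative variable name. The second half shows the same filter written as a generator expression, which yields items lazily instead of building a list.

```python
DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']
threshold = 3  # any length can be used here

# List comprehension: builds the whole result list at once.
long_dna = [dna for dna in DNA_list if len(dna) > threshold]
print(long_dna)  # ['ATAT', 'GTGTACGT', 'AAAAGGTT']

# Generator expression: same logic in parentheses, items are produced on demand.
long_dna_gen = (dna for dna in DNA_list if len(dna) > threshold)
for dna in long_dna_gen:
    print(dna)
```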
This method has the advantage that it's trivial to convert it to a generator, which can reduce memory use and improve execution time depending on your application. For example, if you have a lot of DNA sequences and you want to iterate over them, realising them as a list will consume a lot of memory in one go. The equivalent generator simply requires replacing the square brackets [] with round brackets ().

In Python 2 this performance improvement is not an option with filter(), because it returns a list. In Python 3, filter() returns a filter object more akin to a generator.

I used a different solution, using an inner function and nonlocal scope. I have modified the original code for clarity, as my own code is different.
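A sketch of what an inner-function solution could look like. This is an assumption about the general shape rather than the code this answer used, and filter_long and is_long are hypothetical names.

```python
DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']

def filter_long(sequences, threshold):
    """Return the sequences longer than threshold (hypothetical helper)."""
    def is_long(seq):
        # threshold comes from the enclosing function's scope. Reading it
        # needs no declaration; `nonlocal threshold` would only be required
        # if is_long assigned to it.
        return len(seq) > threshold
    return list(filter(is_long, sequences))

print(filter_long(DNA_list, 3))  # ['ATAT', 'GTGTACGT', 'AAAAGGTT']
print(filter_long(DNA_list, 5))  # ['GTGTACGT', 'AAAAGGTT']
```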
Hope this helps. :)
You could always create a callable that returns a callable suitable for comparisons done by filter.

You can have a more general case.
Since a function is an object in Python, you can create another function that returns the function you want.
Here g is the function you really want, and f is the function that creates it.
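A minimal sketch of this idea, keeping the names f and g from the text. The bodies are an assumption based on the get_long() function in the question.

```python
DNA_list = ['ATAT', 'GTGTACGT', 'AAAAGGTT']

def f(threshold):
    """Create and return a one-argument predicate for the given threshold."""
    def g(dna_seq):
        # g remembers the threshold that was passed to f.
        return len(dna_seq) > threshold
    return g

long_dna = list(filter(f(3), DNA_list))
longer_dna = list(filter(f(5), DNA_list))
print(long_dna)    # ['ATAT', 'GTGTACGT', 'AAAAGGTT']
print(longer_dna)  # ['GTGTACGT', 'AAAAGGTT']
```

Each call to f produces an independent predicate, so several thresholds can be in use at the same time.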