Why doesn't Python's filter(predicate, set) return a set?

Posted 2019-07-09 01:57

Why was Python's filter designed such that if you run filter(my_predicate, some_set), you get back a list object rather than a set object?
Are there practical cases where you would not want the result to be a set...?

2 Answers
走好不送
Answered 2019-07-09 02:08

You can use a set comprehension instead:

{my_predicate(x) for x in some_set}      # mapping
{x for x in some_set if my_predicate(x)} # filtering

For example:

In [1]: s = set([1,2,3])

In [2]: {x%2 for x in s}
Out[2]: {0, 1}

Many of the "functional" functions in Python 2 standardized on list as the output type. This was simply an API choice made long ago. In itertools, many of the same "functional" functions instead return a generator, from which you can populate whatever data structure you'd like. And in Python 3, the built-ins standardized on returning iterators.
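A quick Python 3 sketch of that last point: filter() hands back a lazy iterator, and the caller decides which container to build from it.

```python
# Python 3: filter() returns a lazy filter object, not a list,
# so the caller picks the output container.
s = {1, 2, 3, 4, 5}
result = filter(lambda x: x > 3, s)
print(type(result).__name__)  # filter
print(set(result))            # {4, 5}
```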

But do also note that "filtering" in Python is not like it is in some other languages, like, say Haskell. It's not considered to be a transformation within the context of the data structure, and you don't choose to "endow" your data structures with "filterability" by making them an instance of Functor (or whatever other similar ideas exist in other languages).

As a result, it's a common use case in Python to say something like: "Here's a set, but I just want back all of the values less than 5. I don't care about their 'set-ness' after that point because I'm just going to do some other work on them, so just give me a ____." No need to get all crazy about preserving the context within which the values originally lived.

In a dynamic typing culture this is very reasonable. But in a static typing culture where preserving the type during transformations might matter, this would be a bit frustrating. It's really just sort of a heuristic from Python's particular perspective.

If it were really needed only in the narrow context of sets or tuples, then I might just write a helper function:

def type_preserving_filter(predicate, data):
    # Rebuild the filtered items into the same container type as the input.
    return type(data)(filter(predicate, data))

For example:

>>> type_preserving_filter(lambda x: x > 3, set([1,2,3,4,5,6,7,7]))
{4, 5, 6, 7}
>>> type_preserving_filter(lambda x: x > 3, list([1,2,3,4,5,6,7,7]))
[4, 5, 6, 7, 7]
>>> type_preserving_filter(lambda x: x > 3, tuple([1,2,3,4,5,6,7,7]))
(4, 5, 6, 7, 7)

which works in both Python 2.7 and Python 3.4. In Python 2 this feels a bit wasteful, since filter() builds an intermediate list; in Python 3 the constructor consumes the iterator directly, which is better.
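One caveat worth noting (a sketch, restating the helper above so it runs standalone): the trick assumes type(data) can be rebuilt from a plain iterable of items. That holds for list, tuple, set, and frozenset, but not for every container.

```python
def type_preserving_filter(predicate, data):
    # Rebuild the filtered items into the same container type.
    return type(data)(filter(predicate, data))

# Works whenever the constructor accepts an iterable of items:
print(type_preserving_filter(lambda x: x > 3, frozenset({1, 5, 7})))  # frozenset({5, 7})

# But iterating a dict yields only keys, and dict() can't rebuild from keys alone:
try:
    type_preserving_filter(lambda k: k > 1, {1: "a", 2: "b"})
except TypeError as e:
    print("dict fails:", e)
```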

仙女界的扛把子
Answered 2019-07-09 02:30

This is not limited to filter(). But the API changed in Python 3, where filter() now returns an iterator instead of a list. Quoting the Python documentation:

Views And Iterators Instead Of Lists

Some well-known APIs no longer return lists:

...

  • map() and filter() return iterators. If you really need a list, a quick fix is e.g. list(map(...)), but a better fix is often to use a list comprehension (especially when the original code uses lambda), or rewriting the code so it doesn’t need a list at all. Particularly tricky is map() invoked for the side effects of the function; the correct transformation is to use a regular for loop (since creating a list would just be wasteful).

This article, written by the creator of Python, goes into the reasons for dropping filter() in Python 3 (which did not happen, as you can see above, although the reasoning is still relevant):

The fate of reduce() in Python 3000

...

I think dropping filter() and map() is pretty uncontroversial; filter(P, S) is almost always written clearer as [x for x in S if P(x)], and this has the huge advantage that the most common usages involve predicates that are comparisons, e.g. x==42, and defining a lambda for that just requires much more effort for the reader (plus the lambda is slower than the list comprehension). Even more so for map(F, S) which becomes [F(x) for x in S]. Of course, in many cases you'd be able to use generator expressions instead.
