List comprehensions splitting loop variable

Posted 2020-03-08 09:00

Question:

I am trying to find out if there is a way to split the value of each iteration of a list comprehension only once but use it twice in the output.

As a concrete example of the problem I am trying to solve, I have the string:

a = "1;2;4\n3;4;5"

And I would like to perform this:

>>> [(x.split(";")[1],x.split(";")[2]) for x in a.split("\n") if x.split(";")[1] != 5]
[('2', '4'), ('4', '5')]

I would like to do this without running split three times, using something like this (which is obviously invalid syntax, but hopefully enough to get the message across):

[(x[1],x[2]) for x.split(";") in a.split("\n") if x[1] != 5]

In this question I am not looking for fancy ways to get the 2nd and 3rd columns of the string; that is just a way of providing a concrete example. For this example I could of course use:

[x.split(";")[1:3] for x in a.split("\n")]

The possible solutions I have thought of:

  1. Not use a list comprehension
  2. Leave it as is
  3. Use the csv.DictReader, name my columns and something like StringIO to give it the input.
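
For comparison, options 1 and 3 might look roughly like the sketch below. The column names "first", "second" and "third" are invented for illustration, and the filter compares against the string "5" because split returns strings.

```python
import csv
from io import StringIO

a = "1;2;4\n3;4;5"

# Option 1: a plain loop that splits each line exactly once.
loop_result = []
for line in a.split("\n"):
    parts = line.split(";")
    if parts[1] != "5":  # split() yields strings, so compare with "5"
        loop_result.append((parts[1], parts[2]))

# Option 3: csv.DictReader reading from an in-memory file.
# The field names below are made up for this sketch.
reader = csv.DictReader(
    StringIO(a),
    fieldnames=["first", "second", "third"],
    delimiter=";",
)
csv_result = [(row["second"], row["third"]) for row in reader if row["second"] != "5"]

print(loop_result)
print(csv_result)
```

Both versions split each line once, at the cost of being longer than a single comprehension.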

This is mostly a pattern that would be nice to have available rather than a specific case, so it's hard to answer the "why do you want to do this" or "what is this for" kind of questions.

Update: After reading the solution below I ran some speed tests. In my very basic tests, the solution provided was 35% faster than the naive solution above.
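
A speed test of this kind could be set up along the following lines with timeit. This is only a sketch: the function names are made up, the filter compares against the string "5", and the measured ratio will depend on the input size and the interpreter, so it is not expected to reproduce the 35% figure exactly.

```python
import timeit

a = "1;2;4\n3;4;5"

def naive():
    # Calls split(";") three times per line.
    return [(x.split(";")[1], x.split(";")[2]) for x in a.split("\n") if x.split(";")[1] != "5"]

def split_once():
    # Splits each line once via an inner generator expression.
    return [(p[1], p[2]) for p in (x.split(";") for x in a.split("\n")) if p[1] != "5"]

# Both variants must produce the same list before timing them is meaningful.
assert naive() == split_once()

naive_time = timeit.timeit(naive, number=100_000)
once_time = timeit.timeit(split_once, number=100_000)
print(f"naive: {naive_time:.3f}s, split once: {once_time:.3f}s")
```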

Answer 1:

You could use a list comprehension wrapped around a generator expression:

[(x[1],x[2]) for x in (x.split(";") for x in a.split("\n")) if x[1] != 5]
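
To check that this really splits each line only once, the split can be wrapped in a small recording helper. The helper split_fields and the list calls are hypothetical names used only for this sketch, and the comparison uses the string "5" since split returns strings.

```python
a = "1;2;4\n3;4;5"

calls = []  # records every line passed to the helper

def split_fields(line):
    # Hypothetical helper: behaves like line.split(";") but logs each call.
    calls.append(line)
    return line.split(";")

# The inner generator yields one already-split list per line;
# the outer comprehension filters and indexes that list.
result = [(x[1], x[2]) for x in (split_fields(line) for line in a.split("\n")) if x[1] != "5"]

print(result)      # [('2', '4'), ('4', '5')]
print(len(calls))  # 2, one split per input line
```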


Answer 2:

Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), it's possible to use a local variable within a list comprehension in order to avoid calling the same expression twice.

In our case, we can bind the result of line.split(';') to a variable parts inside the filter condition, where we check that parts[1] is not equal to 5, and then re-use parts to produce the mapped value:

text = '1;2;4\n3;4;5'
[(parts[1], parts[2]) for line in text.split('\n') if (parts := line.split(';'))[1] != 5]
# [('2', '4'), ('4', '5')]
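
This works because the if clause of a comprehension is evaluated before the output expression, so parts is already bound when the tuple is built. The sketch below (Python 3.8+, comparing against the string "5" since split returns strings) also shows that, per PEP 572, the name bound by := stays visible in the enclosing scope after the comprehension finishes.

```python
text = "1;2;4\n3;4;5"

# The walrus in the condition runs first for every line,
# so the output expression can safely index into parts.
result = [
    (parts[1], parts[2])
    for line in text.split("\n")
    if (parts := line.split(";"))[1] != "5"
]

print(result)  # [('2', '4'), ('4', '5')]
# parts still holds the fields of the last line processed.
print(parts)   # ['3', '4', '5']
```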