A periodic computer-generated message (simplified):
Hello user123,

- (604)7080900
- 152
- minutes

Regards
Using Python, how can I extract "(604)7080900", "152", "minutes" (i.e. any text following a leading "- " pattern) between the two empty lines (the empty lines being the \n\n after "Hello user123" and the \n\n before "Regards")? Even better if the resulting strings are stored in an array. Thanks!
edit: the number of lines between the two blank lines is not fixed.
2nd edit:
e.g.
hello

- x1
- x2
- x3

- x4

- x6
morning
- x7

world
x1, x2 and x3 are good, as the whole run of lines is surrounded by two empty lines; x4 is also good for the same reason. x6 is not good because no blank line follows it, and x7 is not good because there is no blank line in front of it. x2 is good (unlike x6 and x7) because the line before it is a good line and the line after it is also good.
These conditions might not have been clear when I posted the question:
a continuous run of good lines between 2 empty lines
a good line must have a leading "- "
a good line must follow an empty line or follow another good line
a good line must be followed by an empty line or by another good line
thanks
The simplest approach is to go over these lines (assuming you have a list of lines, or a file, or split the string into a list of lines) until you see a line that's just '\n', then check that each following line starts with '- ' (using the startswith string method), slice that prefix off and store the result, until you find another empty line. For example:

Edited: Since you elaborated on what you want to do, here's an updated version of the loops. It no longer loops twice, but instead collects data until it encounters a 'bad' line, and either saves or discards the collected lines when it encounters a block separator. It doesn't need an explicit iterator, because it doesn't restart iteration, so you can just pass it a list (or any iterable) of lines:
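One way to sketch that single-pass loop is shown below. The function name `extract_good_lines` and the choice to return one flat list are assumptions made for illustration. The sketch also assumes that an empty line is exactly empty (no spaces) and that a block still open when the input ends is discarded.

```python
def extract_good_lines(lines):
    """Return the text after '- ' for every line of a fully 'good' block.

    A block is a run of non-empty lines with an empty line directly before
    and directly after it. It counts only if every line starts with '- '.
    """
    results = []      # accepted items, in input order
    block = []        # items collected for the block currently being read
    in_block = False  # True once an empty line has opened a candidate block

    for raw in lines:
        line = raw.rstrip("\r\n")  # works with or without trailing newlines
        if line == "":
            # Block separator: save what was collected, then start afresh.
            if in_block:
                results.extend(block)
            block = []
            in_block = True
        elif in_block and line.startswith("- "):
            # A good line in a block that is still good: keep it, minus '- '.
            block.append(line[2:])
        else:
            # A 'bad' line spoils the whole block it belongs to.
            block = []
            in_block = False

    # Anything left in `block` was never closed by an empty line: dropped.
    return results
```

Because the function iterates only once and never rewinds, it should accept an open file object as readily as a list of strings.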
And here it is in action:
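A short run on the example from the question's second edit, assuming the blank lines fall where the question describes them. The helper `extract_good_lines` is a hypothetical name, defined inline so the snippet runs on its own.

```python
def extract_good_lines(lines):
    """Collect the text after '- ' from blocks bounded by empty lines."""
    results, block, in_block = [], [], False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Separator: save the finished block and open a new one.
            if in_block:
                results.extend(block)
            block, in_block = [], True
        elif in_block and line.startswith("- "):
            block.append(line[2:])
        else:
            # Bad line: discard the current block.
            block, in_block = [], False
    return results


# Sample input; the blank-line positions are assumed from the question text.
sample = """hello

- x1
- x2
- x3

- x4

- x6
morning
- x7

world
"""

result = extract_good_lines(sample.splitlines())
print(result)  # ['x1', 'x2', 'x3', 'x4']
```

x6 and x7 are dropped because the line "morning" touches both of them, so neither sits in a block closed off by empty lines on both sides.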
do this:
and have this: