Python - Split by comma skipping the content insid

2020-07-29 02:30发布

I need to split a string by commas, but I have a problem with this case:

TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD

I would like to split and get:

var[0] = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))"
var[1] = "SECOND"
var[2] = "THIRD"

Thank you

4条回答
Animai°情兽
2楼-- · 2020-07-29 02:32

Here's a very simple parser approach that works for your example:

def top_level_split(s):
    """
    Split `s` by top-level commas only. Commas within parentheses are ignored.
    """

    # Parse the string tracking whether the current character is within
    # parentheses.
    balance = 0
    parts = []
    part = ''

    for c in s:
        part += c
        if c == '(':
            balance += 1
        elif c == ')':
            balance -= 1
        elif c == ',' and balance == 0:
            parts.append(part[:-1].strip())
            part = ''

    # Capture last part
    if len(part):
        parts.append(part.strip())

    return parts

my_list = top_level_split("TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD")
print(my_list)
查看更多
ゆ 、 Hurt°
3楼-- · 2020-07-29 02:43

You can use this negative lookahead based regex:

,(?!(?:[^(]*\([^)]*\))*[^()]*\))

This regex is finding a comma with an assertion that makes sure comma is not in parentheses. This is done using a negative lookahead that first consumes all matching ( and ) and then a ). This assumes parentheses are balanced and unescaped.

RegEx Demo

Code:

>>> s = 'TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD'
print re.split(r',(?!(?:[^(]*\([^)]*\))*[^()]*\))', s)

['TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))', ' SECOND ', ' THIRD']

Or:

>>> s = 'TEXT EXAMPLE (THIS, IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD'
>>> print re.split(r',(?!(?:[^(]*\([^)]*\))*[^()]*\))', s)
['TEXT EXAMPLE (THIS, IS (A EXAMPLE, BUT NOT WORKS, FOR ME))', ' SECOND ', ' THIRD']
查看更多
淡お忘
4楼-- · 2020-07-29 02:45

Thanks to jonrsharpe :

text = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD"
array = re.split(r',(?!.*\))', text)
for item in array:
    # Print and remove the first space
    print item.strip(" ")

Result:

TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))
SECOND
THIRD
查看更多
做自己的国王
5楼-- · 2020-07-29 02:58

You can just use rsplit:

l1 = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD".rsplit(",", 2)

for line in l1:
   print line

TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))
SECOND
THIRD
查看更多
登录 后发表回答