Splitting digits into groups of threes, from right

Posted 2019-03-03 08:10

Question:

I have a string '1234567890' that I want to split into groups of three digits, working from right to left, so that the leftmost group has one to three digits (depending on how many digits are left over).

Essentially, it's the same procedure as adding commas to a long number, except that I also want to extract the last three digits.

I tried using look-arounds but couldn't figure out a way to get the last three digits.

import re

string = '1234567890'
pattern = re.compile(r'\d{1,3}(?=(?:\d{3})+$)')
re.findall(pattern, string)

['1', '234', '567']

Expected output is (I don't need commas):

 ['1', '234', '567', '890']

Answer 1:

Adding commas from right to left after each complete group of three digits can be done with a single regex replacement: replace each run of three digits with those three digits followed by a comma. In the code snippet below, I reverse the digit string, insert the commas, reverse again to arrive at the output we want, and then split on the commas.

import re

string = '1234567890'
# Reverse, insert a comma after each group of three that has more digits after it, reverse back
string = re.sub(r'(?=\d{4})(\d{3})', r'\1,', string[::-1])[::-1]
print(string.split(','))
string = '123456789'
string = re.sub(r'(?=\d{4})(\d{3})', r'\1,', string[::-1])[::-1]
print(string.split(','))

Output:

['1', '234', '567', '890']
['123', '456', '789']

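One part of the regex used for replacement might warrant further explanation. I added a positive lookahead (?=\d{4}) to the start of the pattern, so a group of three digits gets a comma only when at least one more digit follows it. This ensures that we don't add a comma after the final group of three digits in the reversed string, which would otherwise happen whenever the number of digits is a multiple of three.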
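
To see what the lookahead buys, here is a small sketch comparing the pattern with and without it. The helper names split_with_guard and split_without_guard are made up for this illustration and are not part of the answer's code.

```python
import re

def split_with_guard(digits):
    # Hypothetical helper: the answer's pattern, including the (?=\d{4}) lookahead.
    return re.sub(r'(?=\d{4})(\d{3})', r'\1,', digits[::-1])[::-1].split(',')

def split_without_guard(digits):
    # Same pattern minus the lookahead, for comparison only.
    return re.sub(r'(\d{3})', r'\1,', digits[::-1])[::-1].split(',')

print(split_with_guard('123456789'))     # ['123', '456', '789']
print(split_without_guard('123456789'))  # ['', '123', '456', '789']
```

Without the lookahead, the last group in the reversed string also gets a comma, which turns into a leading comma after reversing back and therefore an empty first element after the split.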


Answer 2:

It is actually easier to operate on a reversed string, matching each group of 3 digits that still has more digits to go (using the positive lookahead (?=\d)):

import re

for s in ('123','1234','123456789','1234567890'):
    print(re.sub(r'(\d\d\d)(?=\d)',r'\1,',s[::-1])[::-1])

Or a negative lookahead version:

for s in ('123','1234','123456789','1234567890'):
    print(re.sub(r'(\d\d\d)(?!$)',r'\1,',s[::-1])[::-1])

Either prints:

123
1,234
123,456,789
1,234,567,890

Applying a reversed regex on a reversed string is called a sexeger in Perl ;-)

You can also do a lookahead version that does not require reversing the string:

for s in ('123','1234','123456789','1234567890'):
    print(re.sub(r'(\d)(?=(\d{3})+$)',r'\1,',s))
# same output

To get a list of groups rather than a comma-separated string, substitute a delimiter that cannot occur in the input and then .split on that:

>>> for s in ('123','1234','123456789','1234567890'):
...     re.sub(r'(\d)(?=(\d{3})+$)',r'\1\t',s).split('\t')
... 
['123']
['1', '234']
['123', '456', '789']
['1', '234', '567', '890']
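
As a possible alternative to substitute-then-split, the findall pattern from the question appears to need only one change: replacing the + quantifier in the lookahead with *, so that zero trailing groups are allowed and the final three digits can match too. This is a sketch of that idea, not something the answer itself proposes; the name GROUPS is chosen here for illustration.

```python
import re

# The question's pattern with '+' changed to '*': a group of 1-3 digits
# followed by zero or more complete groups of three up to the end.
GROUPS = re.compile(r'\d{1,3}(?=(?:\d{3})*$)')

for s in ('123', '1234', '123456789', '1234567890'):
    print(GROUPS.findall(s))
```

Because the inner group is non-capturing, findall returns the whole matches, and the loop prints the same four lists as above.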

Or, skip the regex and just do it with plain slicing:

for s in ('123','1234','123456789','1234567890'):
    s=s[::-1]
    n=3
    print([s[i:i+n][::-1] for i in range(0,len(s),n)][::-1])
# same output
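
The slicing approach can also be written without reversing anything, by working out how long the leftmost group has to be. A minimal sketch, assuming the input is a plain string; the function name group_from_right is made up for this example.

```python
def group_from_right(s, n=3):
    # Length of the leftmost (possibly shorter) group; 0 means every group is full.
    head = len(s) % n
    groups = [s[:head]] if head else []
    # The remaining characters split evenly into chunks of n.
    groups += [s[i:i + n] for i in range(head, len(s), n)]
    return groups

for s in ('123', '1234', '123456789', '1234567890'):
    print(group_from_right(s))
```

This trades the two reversals for a single modulo, and it works for any group size n, not just 3.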