I have a list of product codes in a text file; each line holds a product code that looks like:
abcd2343 abw34324 abc3243-23A
So it is letters followed by numbers and other characters.
I want to split on the first occurrence of a number.
Try this code; it will work fine.
Output:
['MARIA APARECIDA', '99223-2000 / 98450-8026']
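The code for this answer is not shown here. As a rough sketch only, something like the following would give output of that shape. The input string and the helper name split_at_first_digit are assumptions made for illustration, not the answerer's original code.

```python
import re

def split_at_first_digit(s):
    """Split s into [text before the first digit, text from the first digit on]."""
    m = re.search(r"\d", s)          # locate the first digit, if any
    if m is None:
        return [s]                   # no digit: nothing to split on
    # rstrip() drops the whitespace that separated the two halves
    return [s[:m.start()].rstrip(), s[m.start():]]

# Assumed input; the original question's data is not shown.
print(split_at_first_digit("MARIA APARECIDA 99223-2000 / 98450-8026"))
# ['MARIA APARECIDA', '99223-2000 / 98450-8026']
```

The same helper applied to abc3243-23A returns ['abc', '3243-23A'].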
Or, if you want to split on the first occurrence of a digit:
\d+ matches 1-or-more digits.
\d*\D+ matches 0-or-more digits followed by 1-or-more non-digits.
\d+|\D+ matches 1-or-more digits or 1-or-more non-digits.
Consult the docs for more about Python's regex syntax.
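To see what each of those three patterns picks out, here is a small sketch using re.findall on one of the codes from the question. The variable names are only illustrative.

```python
import re

code = "abc3243-23A"

# 1-or-more digits
digits = re.findall(r"\d+", code)
# 0-or-more digits followed by 1-or-more non-digits
digits_then_rest = re.findall(r"\d*\D+", code)
# runs of digits or runs of non-digits
runs = re.findall(r"\d+|\D+", code)

print(digits)            # ['3243', '23']
print(digits_then_rest)  # ['abc', '3243-', '23A']
print(runs)              # ['abc', '3243', '-', '23', 'A']
```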
re.split(pat, s) will split the string s using pat as the delimiter. If pat begins and ends with parentheses (so as to be a "capturing group"), then re.split will return the substrings matched by pat as well: re.split(r'(\d+)', s) keeps the digit runs in the result, whereas re.split(r'\d+', s) drops them.
In contrast, re.findall(pat, s) returns only the parts of s that match pat.
Thus, if s ends with a digit, you could avoid ending with an empty string by using re.findall(r'\d+|\D+', s) instead of re.split(r'(\d+)', s).
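A quick comparison of the three calls on a code that ends with a digit (the sample value is taken from the question; the variable names are just for illustration):

```python
import re

code = "abcd2343"  # ends with a digit

without_group = re.split(r"\d+", code)    # delimiter is discarded
with_group = re.split(r"(\d+)", code)     # capturing group: delimiter is kept
found = re.findall(r"\d+|\D+", code)      # only the matching runs, no empty string

print(without_group)  # ['abcd', '']
print(with_group)     # ['abcd', '2343', '']
print(found)          # ['abcd', '2343']
```

The trailing '' from re.split appears because the string ends right after the delimiter match.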
This covers your corner case of abc3243-23A and will output abc for the letters group and 3243-23A for the_rest.
Since you said they are all on individual lines, you'll obviously need to put a line at a time in input.
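The pattern this answer refers to is not shown above. A minimal sketch that behaves as described, assuming named groups called letters and the_rest, might look like this. The regex itself and the helper split_code are guesses, and the parameter is named line rather than input to avoid shadowing the built-in.

```python
import re

# Assumed pattern: a run of letters, then everything else on the line.
CODE_RE = re.compile(r"(?P<letters>[A-Za-z]+)(?P<the_rest>.*)")

def split_code(line):
    """Return (letters, the_rest) for one product code, or None if it does not match."""
    m = CODE_RE.match(line.strip())
    if m is None:
        return None
    return m.group("letters"), m.group("the_rest")

# One code per line, fed in one at a time.
lines = ["abcd2343", "abw34324", "abc3243-23A"]
results = [split_code(line) for line in lines]
print(results)
# [('abcd', '2343'), ('abw', '34324'), ('abc', '3243-23A')]
```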
To partition on the first digit, split each code into a list named parts; the two halves are then always parts[0] and parts[1].
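The snippet that produced parts is not shown here. One way to get such a list, assuming re.split with a capturing group that starts at the first digit and runs to the end of the string (the pattern and the function name are assumptions):

```python
import re

def partition_on_first_digit(code):
    # The group (\d.*) matches from the first digit to the end of the string,
    # and because it is a capturing group re.split keeps it in the result.
    # A trailing '' follows it, so only the first two items are of interest.
    return re.split(r"(\d.*)", code)

parts = partition_on_first_digit("abc3243-23A")
print(parts)                # ['abc', '3243-23A', '']
print(parts[0], parts[1])   # abc 3243-23A
```

If a code contains no digit at all, the list has a single item and parts[1] would raise IndexError, so check the length first when that can happen.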
Of course, you can apply this to multiple codes by splitting the input string s and processing each piece.
If each code is on its own line, use s.splitlines() instead of s.split().
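A sketch of both variants, again assuming the (\d.*) split pattern, which is not confirmed by the answer text. The sample strings are built from the codes in the question.

```python
import re

def first_two_parts(code):
    # Keep only the letters part and the part starting at the first digit.
    return re.split(r"(\d.*)", code)[:2]

# Codes separated by spaces on one line
s = "abcd2343 abw34324 abc3243-23A"
results = [first_two_parts(code) for code in s.split()]
print(results)
# [['abcd', '2343'], ['abw', '34324'], ['abc', '3243-23A']]

# One code per line, as in a text file read into a single string
text = "abcd2343\nabw34324\nabc3243-23A\n"
results_by_line = [first_two_parts(code) for code in text.splitlines()]
print(results_by_line == results)  # True
```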