I found some answers online, but I have no experience with regular expressions, which I believe is what is needed here.
I have a string that needs to be split on either ';' or ', '. That is, the delimiter has to be either a semicolon or a comma followed by a space. Individual commas without a trailing space should be left untouched.
Example string:
"b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], mesitylene [000108-67-8]; polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]"
should be split into a list containing the following:
('b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3]', 'mesitylene [000108-67-8]', 'polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]')
Do a str.replace('; ', ', ') and then a str.split(', ').
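A minimal sketch of the replace-then-split approach, applied to the string from the question. It assumes every semicolon delimiter is followed by a space, as in the example; the variable names are illustrative only.

```python
text = ("b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], "
        "mesitylene [000108-67-8]; "
        "polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]")

# Turn every '; ' into ', ' so only one delimiter is left, then split on it.
# Commas that are not followed by a space (as in '1,2-dihydro') are untouched.
parts = text.replace('; ', ', ').split(', ')

for part in parts:
    print(part)
```

This works without regular expressions, but only because one delimiter can be rewritten into the other without creating accidental matches.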
Luckily, Python has this built-in :)
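The built-in meant here is presumably re.split from the standard re module. A minimal sketch, assuming the two delimiters from the question ('; ' and ', ') joined with the regex alternation operator '|':

```python
import re

text = ("b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], "
        "mesitylene [000108-67-8]; "
        "polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]")

# '; |, ' reads as: a semicolon followed by a space, OR a comma followed by a space.
parts = re.split('; |, ', text)

for part in parts:
    print(part)
```

Neither ';' nor ',' has a special meaning in a regular expression, so this pattern needs no escaping.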
Update:
Following your comment:
This is what the regex looks like:
Here's a safe way for any iterable of delimiters, using regular expressions:
re.escape lets you build the pattern automatically, with each delimiter properly escaped.
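A sketch of how such a pattern can be built. The delimiter tuple and sample string are made up for illustration; '.' and '(c)' are included because they contain characters that are special in regular expressions.

```python
import re

# Hypothetical delimiters; '.', '(' and ')' would be misread if left unescaped.
delimiters = ("; ", ", ", ".", "(c)")

# Escape each delimiter, then join them with '|' (alternation).
pattern = "|".join(map(re.escape, delimiters))

parts = re.split(pattern, "one; two, three.four(c)five")
print(parts)
```

Without re.escape, the '.' alone would match any character and the split would return only empty strings.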
Here's this solution as a function for your copy-pasting pleasure:
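One possible shape for such a function, as a sketch. The name split_on_any and the longest-first ordering are assumptions added here, not necessarily what the original answer used; the ordering makes sure a longer delimiter wins when a shorter one is its prefix.

```python
import re

def split_on_any(delimiters, string, maxsplit=0):
    """Split `string` on any of the literal `delimiters` (any iterable of str)."""
    # Longest first, so that e.g. 'ab' is tried before 'a'.
    ordered = sorted(delimiters, key=len, reverse=True)
    if not ordered:
        raise ValueError("at least one delimiter is required")
    pattern = "|".join(map(re.escape, ordered))
    return re.split(pattern, string, maxsplit=maxsplit)

print(split_on_any((";", ", "), "a;b, c,d"))
```

As with re.split itself, maxsplit=0 means no limit on the number of splits.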
If you're going to split often using the same delimiters, compile your regular expression beforehand as described and use RegexObject.split (the split method of the compiled pattern object).
In response to Jonathan's answer above, this only seems to work for certain delimiters. For example:
By putting the delimiters in square brackets it seems to work more effectively.
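A sketch of the square-bracket (character class) form, using a compiled pattern. The timestamp string is a hypothetical input chosen for illustration.

```python
import re

stamp = "1999-05-03 10:37:00"  # hypothetical input

# [- :] matches any ONE of: hyphen, space, colon.
# A '-' placed first inside the brackets is taken literally, not as a range.
splitter = re.compile(r"[- :]")

fields = splitter.split(stamp)
print(fields)
```

A character class only works for single-character delimiters. For the question above, something like '[;, ]' would also split on bare commas and bare spaces, so a two-character delimiter such as ', ' still needs the alternation form.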