I am trying to learn regex in order to answer a question on Stack Overflow in Portuguese.
Input (an array or a string in a cell, so .MultiLine = False)?
1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With number 0n mid. 4. Number 9 incorrect. 11.12 More than one digit. 12.7 Ending (no word).
Output
1 One without dot.
2. Some Random String.
3.1 With SubItens.
3.2 With number 0n mid.
4. Number 9 incorrect.
11.12 More than one digit.
12.7 Ending (no word).
My idea was to use Regex with Split, but I wasn't able to implement the following example in Excel.
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim input As String = "plum-pear"
        Dim pattern As String = "(-)"
        Dim substrings() As String = Regex.Split(input, pattern) ' Split on hyphens.
        For Each match As String In substrings
            Console.WriteLine("'{0}'", match)
        Next
    End Sub
End Module
' The method writes the following to the console:
'    'plum'
'    '-'
'    'pear'
After reading this and this, I used the RegExr website with the expression /([0-9]{1,2})([.]{0,1})([0-9]{0,2})/igm on the input.
And the following is obtained:
Is there a better way to do this? Is the regex correct, or is there a better way to write it? The examples I found on Google didn't enlighten me on how to use RegEx with Split correctly.
Maybe I am confused about the logic of the Split function: I wanted to get the split indexes, with the regex serving as the separator string.
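To make the "split index" idea concrete, here is a minimal sketch of how the positions matched by the expression above could be listed in VBA. It assumes late binding to VBScript.RegExp; the name ListIndexMatches is made up for illustration.

```vba
' Hypothetical helper: prints where each match of the pattern starts in the input.
Sub ListIndexMatches()
    Dim re As Object
    Dim matches As Object
    Dim m As Object
    Dim inputText As String

    inputText = "1 One without dot. 2. Some Random String. 3.1 With SubItens. " & _
                "3.2 With number 0n mid. 4. Number 9 incorrect. " & _
                "11.12 More than one digit. 12.7 Ending (no word)."

    ' Late binding, so no library reference has to be added to the project.
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' find every match, not just the first one
    re.MultiLine = False    ' the input is a single line
    re.Pattern = "([0-9]{1,2})([.]{0,1})([0-9]{0,2})"

    Set matches = re.Execute(inputText)
    For Each m In matches
        ' FirstIndex is the zero-based position of the match within inputText.
        Debug.Print m.FirstIndex, "'" & m.Value & "'"
    Next m
End Sub
```

Because the pattern only looks for digits and an optional dot, it also matches digits inside the item text, such as the 9 in "Number 9 incorrect" and the 0 in "0n", so the list of indexes would still need filtering.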
If VBA's split supports look-behind regex then this one may work, assuming there's no digit except in the indexes:
Use
\d+(?:\.\d+)*[\s\S]*?\w+\.
See the regex demo.
Details
\d+ - 1 or more digits
(?:\.\d+)* - zero or more sequences of:
    \. - dot
    \d+ - 1 or more digits
[\s\S]*? - any 0+ chars, as few as possible, up to the first...
\w+\. - 1+ word chars followed with a dot (.).
Here is a sample VBA code:
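The original sample is not reproduced here, so the following is only a sketch of how the pattern described above might be applied in VBA. It collects whole matches instead of calling Split, and assumes late binding to VBScript.RegExp. The names SplitNumberedItems and DemoSplitNumberedItems are hypothetical.

```vba
' Hypothetical helper (not the answer's original code):
' returns each numbered item found in inputText as one array element.
Function SplitNumberedItems(ByVal inputText As String) As Variant
    Dim re As Object
    Dim matches As Object
    Dim result() As String
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' return all matches
    re.MultiLine = False    ' single-line input
    re.Pattern = "\d+(?:\.\d+)*[\s\S]*?\w+\."

    Set matches = re.Execute(inputText)
    If matches.Count = 0 Then
        SplitNumberedItems = Array()   ' empty array when nothing matches
        Exit Function
    End If

    ReDim result(0 To matches.Count - 1)
    For i = 0 To matches.Count - 1
        result(i) = matches(i).Value
    Next i
    SplitNumberedItems = result
End Function

Sub DemoSplitNumberedItems()
    Dim item As Variant
    For Each item In SplitNumberedItems("1 One without dot. 2. Some Random String. 3.1 With SubItens.")
        Debug.Print item
    Next item
    ' Prints:
    ' 1 One without dot.
    ' 2. Some Random String.
    ' 3.1 With SubItens.
End Sub
```

An item that does not end in word characters followed by a dot, such as "12.7 Ending (no word)." in the question's input, may not be captured in full by this pattern, so it is worth checking the last element of the result.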
NOTE
You may require the matches to stop only at a word + . that is followed with 0+ whitespaces and a number (or the end of the string), using \d+(?:\.\d+)*[\s\S]*?[a-zA-Z]+\.(?=\s*(?:\d+|$)).
The (?=\s*(?:\d+|$)) positive lookahead requires the presence of 0+ whitespaces (\s*) followed with 1+ digits (\d+) or end of string ($) immediately to the right of the current location.
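As an illustration of what the lookahead changes, here is a small sketch run against a made-up sample in which one item contains an abbreviation ending in a dot. The sample string and the name DemoLookaheadBoundary are assumptions for demonstration only.

```vba
' Hypothetical demo: the lookahead keeps "fig." from ending the first item,
' because "fig." is followed by a letter rather than a number or the end of the string.
Sub DemoLookaheadBoundary()
    Const SAMPLE As String = "1. See fig. A here. 2. Next item."
    Dim re As Object
    Dim matches As Object
    Dim m As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.MultiLine = False   ' so $ only matches at the very end of the string
    re.Pattern = "\d+(?:\.\d+)*[\s\S]*?[a-zA-Z]+\.(?=\s*(?:\d+|$))"

    Set matches = re.Execute(SAMPLE)
    For Each m In matches
        Debug.Print m.Value
    Next m
    ' Prints:
    ' 1. See fig. A here.
    ' 2. Next item.
End Sub
```

With the simpler \w+\. ending, the first match on this sample would stop at "1. See fig." instead.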