Here is my problem:
- I have a regular expression that contains one, and only one, capture group,
- This regular expression cannot be changed,
- I have a string that will be matched against this regular expression,
- The regex will match the complete string; it is not a lookup. If the regex cannot be matched to the string, the function will fail before reaching this step.
=> I want to get the position of the captured substring in the string, and its length.
Example:
If my regex is
^.*?\/F?L?(\d+)$
my string is
"( 413) 250/FL250"
I want to get 14 (the position) and 3 (the length).
In those conditions, search would return 1.
This is a simple example, but the regex can be extremely complex. The principle is always the same: there is one and only one capture group, and I need to find the position of the captured string within the main one.
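To make the expected result concrete, here is a minimal sketch in Python (not Ant, which is the actual environment here). It relies on the match object exposing the offsets of a group; `capture_position` is a hypothetical helper name, and the 1-based position is an assumption taken from the expected value 14 in the example above.

```python
import re

def capture_position(pattern, text):
    """Return (1-based start, length) of capture group 1, or None if no match."""
    m = re.search(pattern, text)
    if m is None:
        return None
    # m.start(1) is the 0-based offset of group 1; add 1 for a 1-based position.
    return m.start(1) + 1, m.end(1) - m.start(1)

print(capture_position(r"^.*?\/F?L?(\d+)$", "( 413) 250/FL250"))  # (14, 3)
```

Any regex engine that reports group offsets (Java's `Matcher.start(int)` and `Matcher.end(int)`, for instance) gives the same information without rewriting the expression.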
Thanks a lot for your help, I'm stuck.
EDIT:
So I made something with Ant (our base work environment is Ant) which consists of getting the leftContext of the capture group, then determining its size. To get the leftContext, I simply move the parentheses of the capture group to the left part. Ex: \d(\s) becomes (\d)\s.
I have a question about it:
<macrodef name="Get_CaptureGroup_Position" >
  <attribute name="text" />
  <attribute name="mask" />
  <attribute name="start" />
  <attribute name="end" />
  <sequential>
    <!-- Reset the working variables -->
    <var name="_GMLCS_modified_regex" unset="true"/>
    <var name="_GMLCS_leftContext" unset="true"/>
    <var name="_GMLCS_leftContext_len" unset="true"/>
    <var name="_GMLCS_CapturedGroup" unset="true"/>
    <var name="_GMLCS_CapturedGroup_len" unset="true"/>
    <!-- Intended to drop the ")" of the capture group, then turn its "(" into ")" -->
    <propertyregex property="_GMLCS_modified_regex" override="yes" input="@{mask}" regexp="(.*[^\\])\)([^?].*)" replace="\1\2" />
    <propertyregex property="_GMLCS_modified_regex" override="yes" input="${_GMLCS_modified_regex}" regexp="(.*[^\\])\(([^?].*)" replace="\1)\2" />
    <!-- Prepend "(" so that the capture group now wraps the left context -->
    <var name="_GMLCS_modified_regex" value="(${_GMLCS_modified_regex}" />
    <!-- Extract the left context and the originally captured text -->
    <propertyregex property="_GMLCS_leftContext" override="yes" input="@{text}" regexp="${_GMLCS_modified_regex}" select="\1" />
    <propertyregex property="_GMLCS_CapturedGroup" override="yes" input="@{text}" regexp="@{mask}" select="\1" />
    <!-- Measure both strings -->
    <getAttributeLength text="${_GMLCS_leftContext}" property="_GMLCS_leftContext_len" />
    <getAttributeLength text="${_GMLCS_CapturedGroup}" property="_GMLCS_CapturedGroup_len" />
    <!-- start = left context length + 1; end = start + captured length -->
    <math result="_GMLCS_leftContext_len" operation="+" operand1="${_GMLCS_leftContext_len}" operand2="1" />
    <math result="_GMLCS_CapturedGroup_len" operation="+" operand1="${_GMLCS_leftContext_len}" operand2="${_GMLCS_CapturedGroup_len}" />
    <!-- Return the results through the properties named by the caller -->
    <var name="@{start}" value="${_GMLCS_leftContext_len}" />
    <var name="@{end}" value="${_GMLCS_CapturedGroup_len}" />
    <!-- Clean up the working variables -->
    <var name="_GMLCS_modified_regex" unset="true"/>
    <var name="_GMLCS_leftContext" unset="true"/>
    <var name="_GMLCS_leftContext_len" unset="true"/>
    <var name="_GMLCS_CapturedGroup" unset="true"/>
    <var name="_GMLCS_CapturedGroup_len" unset="true"/>
  </sequential>
</macrodef>
My question is this: when I pass this regex:
(?:A|.*)/F?L?(\d+)\s*\d*(?:A|.*)
I get:
First propertyregex:
(?:A|.*)/F?L?(\d+\s*\d*(?:A|.*) = CORRECT
Second propertyregex:
(?:A|.*)/F?L?)\d+\s*\d*(?:A|.*) = CORRECT
Var:
((?:A|.*)/F?L?)\d+\s*\d*(?:A|.*) = CORRECT
Start and End: 7 and 10 = CORRECT.
This is actually correct, but I believe it should not be. My question is: why were the ")" at the end of the (?:...) blocks not removed?
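The two rewriting steps can be replayed outside Ant to see which parenthesis each pattern touches. The sketch below uses Python's `re` module as a stand-in for propertyregex; that the two engines behave identically for these patterns is an assumption, and `move_group_left` is a hypothetical helper name.

```python
import re

def move_group_left(mask):
    """Replay the three rewriting steps of the macro on a regex string."""
    # Step 1: drop one ")" that is not preceded by "\" and is followed by a char other than "?".
    step1 = re.sub(r"(.*[^\\])\)([^?].*)", r"\1\2", mask, count=1)
    # Step 2: turn one "(" that is not followed by "?" into ")".
    step2 = re.sub(r"(.*[^\\])\(([^?].*)", r"\1)\2", step1, count=1)
    # Step 3: open the new group at the very start of the expression.
    step3 = "(" + step2
    return [step1, step2, step3]

for step in move_group_left(r"(?:A|.*)/F?L?(\d+)\s*\d*(?:A|.*)"):
    print(step)
```

In this replay each pattern matches the whole mask exactly once, because the leading `.*` and the trailing `.*` together cover the full string. The greedy `.*` settles on the rightmost parenthesis that still lets the rest of the pattern match. The final ")" has no character after it, so `[^?]` cannot match there, and the ")" of the first (?:...) block is never reached because the single match has already been consumed further to the right.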