Within VBA, I want to search an excel formula (String) to find cell references.
Specifically I want to find the position(s) in the string where there is either a relative cell reference (ANY relative cell reference, not a specific one), or a mixed cell reference.
- I don't need to find absolute cell references, though I can just check and ignore those.
- I don't mind if the reference is on a different sheet, I'm happy to find the bit after the ! symbol.
- I will need to make sure that it doesn't find things that look like cell references in strings or sheet names or data names.
How can I do this? (This is the main question)
My thoughts:
I can see how I would find mixed cell references - I'm pretty sure the $ symbol can only ever come up in either mixed cell references, absolute references, inside sheet names, or inside strings (can anyone confirm that?), so with the right regular expression or algorithm you could find all those cases and then just check if it is an absolute cell reference and ignore that.
But how to find any and all relative cell references? The only idea I have is similar to the above. Beyond cell references, are there ever any times in Excel Formulas (outside of sheet names or inside strings) where there is a number following a letter? Any formula names or anything? The only other one I can think of is in a defined data name, but I'm not sure if you can use those in a formula, especially a Conditional Formatting formula). Can anyone think of any other times?
Anyone have any ideas?
I wasn't sure what your use case is, but you could try something along the lines in this function:
This function can accept two different data-types:
This allows you to use this as a worksheet function to test the textual value of a formula, or you can use this to test an input string directly.
Here's the return checking a cell:
And here's what the formulas look like for the above:
This can also be used within VBA:
Breaking down the Regular Expression: Regex101
(?:^|[,(=\s])
Requires one of the following to occur before your match^
Start of string; or,
a comma (useful in formulas)!
an exclamation (forSheet!
references)(
opening parenthesis (useful in formulas)=
literal equal sign (useful in formulas)\s
a whitespace character(...)
capturing group that will return your value (one of three options)\$?[A-Z]{1,3}\$?\d+(?::\$?[A-Z]{1,3}\$?\d+)?
not entire row / column\$
for an absolute reference (column), followed by?
making it optional[A-Z]
character class for any letter,+
one or more times\$
for an absolute reference (row), followed by?
making it optional\d
any digit,+
one or more times(?:...)
non capturing group to match a range of ranges (such asA1:B1
)?
, making the entire non-capturing group optional\$?[a-z]{1,3}:\$?[a-z]{1,3}
Entire Columns, such asA:C
\$?\d+:\$?\d+
Entire Rows, such as1:3
(?:$|[\s,)])
Requires one of the following to occur after your match$
end of string; or\s
a whitespace character,
a comma)
closing parenthesisCredits:
Suggestions by tripleee:
- Use Character class
[xyz]
instead ofOR
statements(?:x|y|z)
- Better indention of breakdown
- Grammar usage
Suggestions by Makyen:
- Support for entire rows
1:4
and columnsA:C
- Limit FPs by checking Excel's max column limitation
[a-z]{1,3}
(instead of[a-z]+
)