I have several arrays, such as:
string[] sArTrigFunctions = {"sin", "cos", "tan", "sinh", "cosh", "tanh", "cot", "sec", "csc", "arcsin", "arccos", "arctan", "coth", "sech", "csch"};
string[] sArGreek = { "alpha", "beta", "chi", "delta", "Delta", "epsi", "varepsilon", "eta", "gamma", "Gamma", "iota", "kappa", "lambda", "Lambda", "lamda", "Lamda", "mu", "nu", "omega", "Omega", "phi", "varphi", "Phi", "pi", "Pi", "psi", "Psi", "rho", "sigma", "Sigma", "tau", "theta", "vartheta", "Theta", "upsilon", "xi", "Xi", "zeta" };
string[] sArBinOp = {"lt","gt","eq","neq",.....}; etc.
The elements of these arrays appear in a text file, where they are mixed with each other or with other content of the file, for example: sintheta, altc. I want to escape these array elements in the file with \ so that sintheta becomes \sin\theta and altc becomes a\ltc.
A simple string.Replace(...) does not work. For example, if I run the following foreach loop on the sArTrigFunctions array and then on the sArGreek array, it will change sintheta in the file to \sinth\eta. If I reorder the sArGreek elements by descending length so that theta comes before eta, the code will first change sintheta to \sin\theta and then to \sin\th\eta. Likewise, running the following code on the sArBinOp array will change sindelta to sinde\lta, or, if we first run it on sArTrigFunctions and sArGreek and then on sArBinOp, sindelta gets changed to \sin\de\lta:
foreach (string s in sArGreek)
{
    strfileContent = strfileContent.Replace(s, "\\" + s);
}
Question: How can we programmatically make sure that, during the replace process, an array element that lies inside another element of any array is not escaped with \? For example, don't escape eta in sintheta, but do so in sineta. Likewise, don't escape lt in sindelta, but do so in altc.
Note: The array elements in the file are not necessarily separated by a space, i.e. sintheta is not written as sin theta. Otherwise we could use the C# Regex word boundary to achieve this, with code like the following, for example:
foreach (string s in sArGreek)
{
    strfileContent = Regex.Replace(strfileContent, "\\b" + s + "\\b", "\\" + s + " ");
}
You can do this with a regular expression replace.
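First you need to construct your Regex from the input arrays. The expression consists of all the terms in a single string, separated by "|" (regex OR) and sorted by descending term length, for example varepsilon|vartheta|...|sinh|...|sin|...|lt. The ordering is important: at each position the .NET regex engine tries the alternatives from left to right and takes the first one that matches, so putting longer terms first captures them when possible and falls back to shorter terms when needed.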
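To illustrate why the order of the alternatives matters, here is a minimal sketch. The class and method names (AlternationOrderDemo, FirstMatch) are made up for this example. It only shows which term the engine picks when two terms start at the same position, such as sin and sinh.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationOrderDemo
{
    // Hypothetical helper: returns the text of the first match of pattern in input.
    public static string FirstMatch(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // Shorter term listed first: the engine stops at "sin" and never tries "sinh".
        Console.WriteLine(FirstMatch("sin|sinh", "sinhx"));  // sin

        // Longer term listed first: "sinh" is captured as a whole.
        Console.WriteLine(FirstMatch("sinh|sin", "sinhx"));  // sinh

        // The shorter term still works as a fallback when the longer one does not fit.
        Console.WriteLine(FirstMatch("sinh|sin", "sinx"));   // sin
    }
}
```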
To build the pattern, a small LINQ query comes in handy: concatenate all the arrays into a single enumerable, sort it by descending length, and join the result into a single string. That string is used to create a Regex object. Then it's a simple replace with the replacement string "\\$&", which means: replace the entire match (a single term at a time) with itself prefixed with a backslash.
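Put together, the approach might look like the following sketch. The class and method names (TermEscaper, BuildRegex, Escape) are invented for illustration, and sArBinOp contains only the four operators visible in the question. The Distinct() and Regex.Escape calls are extra precautions that the description above does not mention; they make no difference for plain alphabetic terms like these.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TermEscaper
{
    static readonly string[] sArTrigFunctions = { "sin", "cos", "tan", "sinh", "cosh", "tanh", "cot", "sec", "csc", "arcsin", "arccos", "arctan", "coth", "sech", "csch" };
    static readonly string[] sArGreek = { "alpha", "beta", "chi", "delta", "Delta", "epsi", "varepsilon", "eta", "gamma", "Gamma", "iota", "kappa", "lambda", "Lambda", "lamda", "Lamda", "mu", "nu", "omega", "Omega", "phi", "varphi", "Phi", "pi", "Pi", "psi", "Psi", "rho", "sigma", "Sigma", "tau", "theta", "vartheta", "Theta", "upsilon", "xi", "Xi", "zeta" };
    // Assumption: only the operators shown in the question; the real list is longer.
    static readonly string[] sArBinOp = { "lt", "gt", "eq", "neq" };

    static readonly Regex TermRegex = BuildRegex(sArTrigFunctions, sArGreek, sArBinOp);

    // Joins all terms into one alternation, longest terms first.
    public static Regex BuildRegex(params string[][] termArrays)
    {
        string pattern = string.Join("|",
            termArrays.SelectMany(a => a)
                      .Distinct()                        // precaution against duplicate terms
                      .OrderByDescending(s => s.Length)  // longer terms win over shorter ones
                      .Select(s => Regex.Escape(s)));    // precaution against regex metacharacters
        return new Regex(pattern);
    }

    // Prefixes every matched term with a backslash in a single pass.
    public static string Escape(string text)
    {
        // In a verbatim string, \$& is a literal backslash followed by the whole match.
        return TermRegex.Replace(text, @"\$&");
    }

    public static void Main()
    {
        Console.WriteLine(Escape("sintheta"));  // \sin\theta
        Console.WriteLine(Escape("altc"));      // a\ltc
        Console.WriteLine(Escape("sindelta"));  // \sin\delta
        Console.WriteLine(Escape("sineta"));    // \sin\eta
    }
}
```

Because the whole file is processed in one pass, text that has already been matched as part of theta or delta is never rescanned, so the eta and lt inside them are left alone.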