I need to replace some parts of a string with each other using C#.
I could find only one similar question about how to achieve this, but it was for PHP.
My situation involves a Dictionary<string, string> which holds the pairs to replace, like:
- dog, cat
- cat, mouse
- mouse, raptor
And I have a string with the value of:
"My dog ate a cat which once ate a mouse got eaten by a raptor"
I need a function to get this:
"My cat ate a mouse which once ate a raptor got eaten by a raptor"
If I enumerate the dictionary and call string.Replace for each pair in order, I get this:
"My raptor ate a raptor which once ate a raptor got eaten by a raptor"
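To make the problem concrete, the naive version can be sketched roughly like this. NaiveReplace and Apply are made-up names for illustration, and the pairs are assumed to be applied in the order listed above:

```csharp
using System;
using System.Collections.Generic;

public static class NaiveReplace
{
    // Applies each pair with string.Replace, one after another.
    // Text produced by an earlier pair is visible to every later pair,
    // which is exactly what causes the chain reaction.
    public static string Apply(string source, IEnumerable<KeyValuePair<string, string>> pairs)
    {
        var result = source;
        foreach (var pair in pairs)
        {
            result = result.Replace(pair.Key, pair.Value);
        }
        return result;
    }
}
```

After the dog/cat pair runs, the new "cat" is picked up by the cat/mouse pair, and that "mouse" is then picked up by the mouse/raptor pair, so everything ends up as "raptor".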
It seems odd that this hasn't been asked before (is it common knowledge?), but I couldn't find anything. I'm sorry if it has been and I missed it.
So what you need is for the matching process to take place only once. For once, I think the right answer is actually 'use regex'! Here's some code:
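using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Keys deliberately contain regex metacharacters to show that escaping works.
var replacements = new Dictionary<string, string>
{
    { "do|g", "cat" },
    { "ca^t", "mouse" },
    { "mo$$use", "raptor" }
};
var source = "My do|g ate a ca^t which once ate a mo$$use";
// Build one alternation out of all the escaped keys.
var regexPattern =
    "(" +
    string.Join("|", replacements.Keys.Select(Regex.Escape)) +
    ")";
var regex = new Regex(regexPattern);
// Each match is looked up in the dictionary; replaced text is never rescanned.
var result = regex.Replace(source, match => replacements[match.Value]);
// Now result == "My cat ate a mouse which once ate a raptor"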
For plain keys such as dog, cat and mouse, the pattern we build here looks like (dog|cat|mouse); with the keys above, the special characters are escaped as well. Each piece in this alternation construct ( | ) is passed through Regex.Escape, so that regex-meaningful characters in the keys (such as |, ^, etc.) don't cause problems. When a match is found, the matching text is replaced by the corresponding value in the dictionary. The string is only scanned once, so there's no repeated matching, which is the problem with an iterated string.Replace.
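As a rough sketch, the same idea can be wrapped in a reusable method and applied to the plain dog/cat/mouse pairs from the question. SinglePassReplace and Apply are hypothetical names, and sorting the keys by length is an extra precaution that is not part of the code above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SinglePassReplace
{
    // Hypothetical helper: replaces every key found in source with its value,
    // scanning the input only once so replaced text is never matched again.
    public static string Apply(string source, IDictionary<string, string> replacements)
    {
        // An empty alternation would match the empty string everywhere.
        if (replacements.Count == 0)
        {
            return source;
        }

        // Longer keys first: .NET tries alternatives left to right, so this
        // keeps a short key from shadowing a longer key that starts with it.
        var pattern = string.Join(
            "|",
            replacements.Keys
                .OrderByDescending(key => key.Length)
                .Select(Regex.Escape));

        return Regex.Replace(source, pattern, match => replacements[match.Value]);
    }
}
```

With the pairs dog/cat, cat/mouse and mouse/raptor, calling SinglePassReplace.Apply on the sentence from the question should give "My cat ate a mouse which once ate a raptor got eaten by a raptor". The lookup assumes case-sensitive matching; with RegexOptions.IgnoreCase the dictionary would need a case-insensitive comparer too.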
What you want is to have all the string replacements performed simultaneously, so that you don't get several replacements over the same token, right?
I can think of two approaches, neither particularly elegant:
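- Iterate over your dictionary in a specific order, so you'll always replace mouse/raptor before cat/mouse. This only works when the pairs don't form a cycle (such as dog/cat together with cat/dog).
- Instead of replacing cat with 'mouse', replace it with '$mouse$' or a similar marker, so that the marker isn't replaced again by the mouse/raptor pair. Note that a plain string.Replace would still find 'mouse' inside '$mouse$', so in practice the marker must not contain any of the keys (for example, a numbered placeholder such as '$1$'). Then, once you've finished your switches, you can turn the markers into their final values (with '$word$' markers, by removing all '$' signs, assuming the original text contains none).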
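The marker idea from the second bullet could be sketched in two passes, roughly as follows. This is a variation rather than the exact '$mouse$' scheme: each key gets a one-character marker from the Unicode private use area, on the assumption that such characters never appear in the source text, the keys, or the values. MarkerReplace and Apply are made-up names:

```csharp
using System;
using System.Collections.Generic;

public static class MarkerReplace
{
    // Hypothetical two-pass helper.
    // Assumption: private use characters (U+E000 and up) do not occur in
    // the source, keys, or values, and there are at most a few thousand pairs.
    public static string Apply(string source, IEnumerable<KeyValuePair<string, string>> pairs)
    {
        const char firstMarker = '\uE000';
        var list = new List<KeyValuePair<string, string>>(pairs);
        var result = source;

        // Pass 1: swap every key for its own marker. Markers contain no key,
        // so later pairs cannot match text produced by earlier ones.
        for (var i = 0; i < list.Count; i++)
        {
            result = result.Replace(list[i].Key, ((char)(firstMarker + i)).ToString());
        }

        // Pass 2: swap every marker for the final value.
        for (var i = 0; i < list.Count; i++)
        {
            result = result.Replace(((char)(firstMarker + i)).ToString(), list[i].Value);
        }

        return result;
    }
}
```

Because keys are only ever matched against the original text and markers, the order of the pairs no longer matters for chains like dog/cat/mouse. Keys that overlap each other (say, "cat" and "cats") would still depend on the order in pass 1.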