Replace text while keeping case intact in C#

Posted 2019-04-06 19:17

Question:

I have a set of phrases that I need to use for replacements, for example:

abc => cde
ab df => de
...

I also have a text in which to make the changes. However, I have no way of knowing the case of that text beforehand. So, for example, if I have:

A bgt abc hyi. Abc Ab df h

I must replace and get:

A bgt cde hyi. Cde De h

Or as close to that as possible, i.e. keep the case.

EDIT: As I am seeing too much confusion about this, I will try to clarify a bit:

I am asking about a way to keep capitalization after replacing, and I don't think that came across well (I did not explain well what that entails), so I will give a more realistic example using real words.

Think of it like a glossary, replacing expressions with their synonyms, so to speak. So if I map:

didn't achieve success => failed miserably

and then I get as input the sentence:

As he didn't achieve success, he was fired

I would get

As he failed miserably, he was fired

but if "didn't" were capitalized, so would "failed" be; if "achieve" or "success" were capitalized, so would "miserably" be; and if any word had more than one letter capitalized, so would its counterpart.

My main possibilities (the ones I really want to take into consideration) are:

  • only first letter of first word capitalized
  • only first letter of every word capitalized
  • all letters capitalized

If I can handle those three, that would already be acceptable, I guess, since they are the easier ones. Of course, a more in-depth solution would be better if available.

Any ideas?

Answer 1:

Not sure how well this will work, but this is what I came up with:

        string input = "A bgt abc hyi. Abc Ab df h";
        Dictionary<string, string> map = new Dictionary<string, string>();
        map.Add("abc", "cde");
        map.Add("ab df", "de");

        string temp = input;
        foreach (var entry in map)
        {
            string key = entry.Key;
            string value = entry.Value;
            // Escape the key so that any regex metacharacters in it are matched literally.
            temp = Regex.Replace(temp, Regex.Escape(key), match =>
            {
                bool isUpper = char.IsUpper(match.Value[0]);

                char[] result = value.ToCharArray();
                result[0] = isUpper
                    ? char.ToUpper(result[0])
                    : char.ToLower(result[0]);
                return new string(result);
            }, RegexOptions.IgnoreCase);
        }
        label1.Text = temp; // output is A bgt cde hyi. Cde De h

EDIT: After reading the modified question, here's my modified code (it turns out to use steps similar to @Sephallia's code, and similar variable names, lol).

The code is now a bit more complicated, but I think it's OK.

        string input = 
        @"As he didn't achieve success, he was fired.
        As he DIDN'T ACHIEVE SUCCESS, he was fired.
        As he Didn't Achieve Success, he was fired.
        As he Didn't achieve success, he was fired.";
        Dictionary<string, string> map = new Dictionary<string, string>();
        map.Add("didn't achieve success", "failed miserably");


        string temp = input;
        foreach (var entry in map)
        {
            string key = entry.Key;
            string value = entry.Value;
            // Escape the key so that any regex metacharacters in it are matched literally.
            temp = Regex.Replace(temp, Regex.Escape(key), match =>
            {
                bool isFirstUpper, isEachUpper, isAllUpper;

                string sentence = match.Value;
                char[] sentenceArray = sentence.ToCharArray();

                string[] words = sentence.Split(' ');

                isFirstUpper = char.IsUpper(sentenceArray[0]);

                isEachUpper = words.All(w => char.IsUpper(w[0]) || !char.IsLetter(w[0]));

                isAllUpper = sentenceArray.All(c => char.IsUpper(c) || !char.IsLetter(c));

                if (isAllUpper)
                    return value.ToUpper();

                if (isEachUpper)
                {
                    // capitalize first of each word... use regex again :P
                    string capitalized = Regex.Replace(value, @"\b\w", charMatch => charMatch.Value.ToUpper());
                    return capitalized;
                }


                char[] result = value.ToCharArray();
                result[0] = isFirstUpper
                    ? char.ToUpper(result[0])
                    : char.ToLower(result[0]);
                return new string(result);
            }, RegexOptions.IgnoreCase);
        }
        textBox1.Text = temp; 
        /* output is :
        As he failed miserably, he was fired.
        As he FAILED MISERABLY, he was fired.
        As he Failed Miserably, he was fired.
        As he Failed miserably, he was fired.
        */
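One caveat with looping over the map: each pass runs over text that earlier passes have already changed, so a replacement can itself be replaced again (for example abc => cde followed by cde => xyz). Below is a minimal sketch that avoids this by combining all keys into a single pattern, assuming the three casing rules from the question. The names `CasePreservingReplacer`, `ReplaceAll`, and `ApplyCase` are hypothetical and not part of the code above, and unlike the code above it leaves the replacement as written when the match starts in lower case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CasePreservingReplacer
{
    // Hypothetical helper: replaces every key of the map in one pass,
    // so text produced by one replacement is never matched again.
    public static string ReplaceAll(string input, IDictionary<string, string> map)
    {
        if (map.Count == 0) return input;

        // Longest keys first, so a longer phrase wins over a shorter key it starts with.
        List<string> keys = map.Keys.OrderByDescending(k => k.Length).ToList();

        // One capture group per key: group i + 1 corresponds to keys[i].
        string pattern = string.Join("|", keys.Select(k => "(" + Regex.Escape(k) + ")"));

        return Regex.Replace(input, pattern, match =>
        {
            for (int i = 0; i < keys.Count; i++)
            {
                if (match.Groups[i + 1].Success)
                    return ApplyCase(match.Value, map[keys[i]]);
            }
            return match.Value;
        }, RegexOptions.IgnoreCase);
    }

    // Copies one of the three casing styles from the matched text to the replacement.
    private static string ApplyCase(string matched, string replacement)
    {
        if (matched.Length == 0 || replacement.Length == 0) return replacement;

        // All letters capitalized.
        if (matched.All(c => char.IsUpper(c) || !char.IsLetter(c)))
            return replacement.ToUpper();

        // First letter of every word capitalized.
        if (matched.Split(' ').All(w => w.Length == 0 || char.IsUpper(w[0]) || !char.IsLetter(w[0])))
        {
            return string.Join(" ", replacement.Split(' ')
                .Select(w => w.Length == 0 ? w : char.ToUpper(w[0]) + w.Substring(1)));
        }

        // Only the first letter capitalized; otherwise keep the replacement as written.
        return char.IsUpper(matched[0])
            ? char.ToUpper(replacement[0]) + replacement.Substring(1)
            : replacement;
    }
}
```

Calling `CasePreservingReplacer.ReplaceAll(input, map)` with the first example's input and map should give the same "A bgt cde hyi. Cde De h" result as the loop above.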


Answer 2:

You could use String.IndexOf with StringComparison.CurrentCultureIgnoreCase specified to find a match. At that point, a character by character replacement would work to do the swap. The capitalization could be handled by checking with Char.IsUpper for the source character, and then using Char.ToUpper or Char.ToLower on the destination as appropriate.



Answer 3:

You could loop through the string as an array of characters and use Char.IsUpper(char) to check each one.

  1. Instantiate a blank string
  2. Set up a loop to loop through the characters
  3. Check if you need to change the character to a different one
    1. Yes: Check whether the character is upper or lower case and, depending on the result, put the appropriately cased letter in the new string.
    2. No: Just throw that character into the new string
  4. Set the original string to the new string.

Might not be the most efficient or spectacular way of doing things, but it is simple, and it works.
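The steps above could be sketched roughly as follows, assuming the conversion is a simple per-character lookup. `CharCaseMapper`, `MapChars`, and `charMap` are hypothetical names, and the map is assumed to hold lower-case source characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CharCaseMapper
{
    // Hypothetical helper: charMap holds lower-case source -> replacement pairs.
    public static string MapChars(string text, IDictionary<char, char> charMap)
    {
        var result = new StringBuilder(text.Length);    // step 1: start with an empty result

        foreach (char c in text)                        // step 2: walk the characters
        {
            char replacement;
            if (charMap.TryGetValue(char.ToLower(c), out replacement))   // step 3
            {
                // Yes: copy the case of the original character onto the replacement.
                result.Append(char.IsUpper(c)
                    ? char.ToUpper(replacement)
                    : char.ToLower(replacement));
            }
            else
            {
                // No: keep the character as it is.
                result.Append(c);
            }
        }

        return result.ToString();                       // step 4: the caller assigns this back
    }
}
```

For example, with a map of a -> x and b -> y, `MapChars("Abc aBC", map)` gives "Xyc xYC".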

On a side note: I am not sure how you are converting the characters. If you are, say, shifting them along the alphabet by a constant amount, for example 3 (so a -> d and E -> H), then you could get the ASCII value of the character, add 3 (if you want to convert it), and get the character back from the resulting value. You would have to check that you wrap around from the end of the alphabet (or from the beginning, if you're shifting left).
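A minimal sketch of such a shift with wrap-around, assuming only the ASCII letters A-Z and a-z are shifted and everything else is left alone. `CaesarShift`, `Shift`, and `ShiftFrom` are hypothetical names.

```csharp
using System;

public static class CaesarShift
{
    // Shifts ASCII letters by `shift` positions, keeping their case and
    // wrapping around at either end of the alphabet.
    public static string Shift(string text, int shift)
    {
        char[] chars = text.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            char c = chars[i];
            if (c >= 'a' && c <= 'z')
                chars[i] = ShiftFrom('a', c, shift);
            else if (c >= 'A' && c <= 'Z')
                chars[i] = ShiftFrom('A', c, shift);
            // Anything else (digits, spaces, punctuation) is left unchanged.
        }
        return new string(chars);
    }

    private static char ShiftFrom(char baseChar, char c, int shift)
    {
        // The double modulo keeps the offset in 0..25 even for negative shifts.
        int offset = ((c - baseChar + shift) % 26 + 26) % 26;
        return (char)(baseChar + offset);
    }
}
```

Here `Shift("a E xyz", 3)` gives "d H abc", and a negative shift such as `Shift("abc", -3)` wraps the other way to "xyz".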

Edit #1: (Going to keep the above there)

Really big block of code... Sorry! This was the best way I could see to do what you were asking. Hopefully someone might come up with a more elegant way. Please do comment or anything if you require clarification!

    // (to be clear) This is Elias' (original) code modified.
    static void Main(string[] args)
    {
        string input = "As he DIDN'T ACHIEVE Success, he was fired";
        Dictionary<string, string> map = new Dictionary<string, string>();
        map.Add("didn't achieve success", "failed miserably");

        string temp = input;
        foreach (var entry in map)
        {
            string key = entry.Key;
            string value = entry.Value;
            temp = Regex.Replace(temp, key, match =>
            {
                string[] matchSplit = match.Value.Split(' ');
                string[] valueSplit = value.Split(' ');

                // Set the number of words to the lower one.
                // If they're the same, it doesn't matter.
                int numWords = Math.Min(matchSplit.Length, valueSplit.Length);

                // only first letter of first word capitalized
                // only first letter of every word capitalized
                // all letters capitalized
                char[] result = value.ToCharArray();
                for (int i = 0; i < numWords; i++)
                {
                    if (char.IsUpper(matchSplit[i][0]))
                    {
                        bool allIsUpper = true;
                        int c = 1;
                        while (allIsUpper && c < matchSplit[i].Length)
                        {
                            if (!char.IsUpper(matchSplit[i][c]) && char.IsLetter(matchSplit[i][c]))
                            {
                                allIsUpper = false;
                            }
                            c++;
                        }
                        // If every letter of the current word is upper case, allIsUpper is still true here.
                        int arrayPosition = ArrayPosition(i, valueSplit);
                        if (allIsUpper)
                        {
                            for (int j = 0; j < valueSplit[i].Length; j++)
                            {
                                result[j + arrayPosition] = char.ToUpper(result[j + arrayPosition]);
                            }
                        }
                        else
                        {
                            // The first letter.
                            result[arrayPosition] = char.ToUpper(result[arrayPosition]);
                        }
                    }
                }

                return new string(result);
            }, RegexOptions.IgnoreCase);
        }
        Console.WriteLine(temp); 
    }

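    // Returns the index in the replacement string at which word i starts
    // (the lengths of the preceding words plus one space after each).
    public static int ArrayPosition(int i, string[] valueSplit)
    {
        if (i > 0)
        {
            return valueSplit[i-1].Length + 1 + ArrayPosition(i - 1, valueSplit);
        }

        return 0;
    }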


Answer 4:

Replace one character at a time and use:

if (char.IsUpper(currentChar))
{
   // replace with upper case variant
}


Answer 5:

This is pretty much what Reed was saying. The only trick is that I'm not sure what you should do when the Find and Replace strings are different lengths. So I'm choosing the min length and using that...

static string ReplaceCaseInsensitive(string Text, string Find, string Replace)
{
    char[] NewText = Text.ToCharArray();
    int ReplaceLength = Math.Min(Find.Length, Replace.Length);

    int LastIndex = -1;
    while (true)
    {
        LastIndex = Text.IndexOf(Find, LastIndex + 1, StringComparison.CurrentCultureIgnoreCase);

        if (LastIndex == -1)
        {
            break;
        }
        else
        {
            for (int i = 0; i < ReplaceLength; i++)
            {
                if (char.IsUpper(Text[i + LastIndex])) 
                    NewText[i + LastIndex] = char.ToUpper(Replace[i]);
                else
                    NewText[i + LastIndex] = char.ToLower(Replace[i]);
            }
        }
    }

    return new string(NewText);
}
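When the two strings differ in length, overwriting only the shorter length leaves part of the old text behind or cuts the replacement short (with "ab df" => "de", the input "Ab df" becomes "De df"). One possible way around that, sketched under some assumptions: build the output in a StringBuilder, write the whole replacement, and reuse the case of the last matched character for any extra replacement characters. `LengthAwareReplacer` is a hypothetical name, and the sketch swaps in StringComparison.OrdinalIgnoreCase so that a match always has exactly the length of `find`.

```csharp
using System;
using System.Text;

public static class LengthAwareReplacer
{
    public static string Replace(string text, string find, string replace)
    {
        if (string.IsNullOrEmpty(find)) return text;

        var result = new StringBuilder();
        int position = 0;

        while (true)
        {
            int index = text.IndexOf(find, position, StringComparison.OrdinalIgnoreCase);
            if (index == -1) break;

            // Copy the untouched text between the previous match and this one.
            result.Append(text, position, index - position);

            for (int i = 0; i < replace.Length; i++)
            {
                // Past the end of the match, reuse the case of its last character.
                char source = text[index + Math.Min(i, find.Length - 1)];
                if (char.IsUpper(source))
                    result.Append(char.ToUpper(replace[i]));
                else if (char.IsLower(source))
                    result.Append(char.ToLower(replace[i]));
                else
                    result.Append(replace[i]);   // not a cased letter: keep as written
            }

            position = index + find.Length;
        }

        // Copy whatever follows the last match.
        result.Append(text, position, text.Length - position);
        return result.ToString();
    }
}
```

With this version, `Replace("A bgt abc hyi. Abc Ab df h", "ab df", "de")` gives "A bgt abc hyi. Abc De h", and `Replace("x ABC y", "abc", "wxyz")` gives "x WXYZ y".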