What is an efficient way to replace many characters in a string?

Posted 2020-02-23 05:45

String handling in Java is something I'm trying to learn to do well. Currently I want to take in a string and replace certain characters wherever they occur in it.

Here is my current inefficient (and kinda silly IMO) function. It was written to just work.

public String convertWord(String word)
{
    return word.toLowerCase().replace('á', 'a')
                             .replace('é', 'e')
                             .replace('í', 'i')
                             .replace('ú', 'u')
                             .replace('ý', 'y')
                             .replace('ð', 'd')
                             .replace('ó', 'o')
                             .replace('ö', 'o')
                             .replaceAll("[-]", "")
                             .replaceAll("[.]", "")
                             .replaceAll("[/]", "")
                             .replaceAll("[æ]", "ae")
                             .replaceAll("[þ]", "th");
}

I ran 1.000.000 runs of it and it took 8182ms. So how should I proceed in changing this function to make it more efficient?

Solution found:

Converting the function to this

public String convertWord(String word)
{
    StringBuilder sb = new StringBuilder();

    char[] charArr = word.toLowerCase().toCharArray();

    for(int i = 0; i < charArr.length; i++)
    {
        // Single character case
        if(charArr[i] == 'á')
        {
            sb.append('a');
        }
        // Char to two characters
        else if(charArr[i] == 'þ')
        {
            sb.append("th");
        }
        // Remove
        else if(charArr[i] == '-')
        {
        }
        // Base case
        else
        {
            // Append the lowercased char; word.charAt(i) would bring back the original case
            sb.append(charArr[i]);
        }
    }

    return sb.toString();
}

Running this function 1.000.000 times takes 518ms. So I think that is efficient enough. Thanks for the help guys :)
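For reference, the loop above only shows one branch of each kind. A fuller version that covers every mapping from the original replace chain could look roughly like the sketch below. The class name `WordConverter` is made up for illustration, and the accented letters are written as Unicode escapes only so the snippet does not depend on the source file's encoding. I have not benchmarked this variant, so treat it as a sketch rather than a measured result.

```java
// Hypothetical helper class; the mappings are copied from the original replace chain.
final class WordConverter {

    static String convertWord(String word) {
        String lower = word.toLowerCase();
        StringBuilder sb = new StringBuilder(lower.length());

        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            switch (c) {
                // One character replaced by one character
                case '\u00e1': sb.append('a'); break;  // a with acute
                case '\u00e9': sb.append('e'); break;  // e with acute
                case '\u00ed': sb.append('i'); break;  // i with acute
                case '\u00fa': sb.append('u'); break;  // u with acute
                case '\u00fd': sb.append('y'); break;  // y with acute
                case '\u00f0': sb.append('d'); break;  // eth
                case '\u00f3':                         // o with acute
                case '\u00f6': sb.append('o'); break;  // o with diaeresis
                // One character replaced by two characters
                case '\u00e6': sb.append("ae"); break; // ae ligature
                case '\u00fe': sb.append("th"); break; // thorn
                // Characters that are removed
                case '-':
                case '.':
                case '/': break;
                // Everything else is kept as-is
                default: sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Calling `WordConverter.convertWord` on a word such as "Þór" should give "thor", since the thorn becomes "th" and the accented o becomes a plain o.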

8 answers
beautiful°
#2 · 2020-02-23 06:30

Use String.replaceAll. Here is a nice article on something similar to what you want: link
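If you want to stay with regular expressions, one option is to match all the special characters with a single pattern and look up each replacement in a map, so the string is scanned once instead of once per replaceAll call. This is only a sketch and not necessarily what the linked article does. The class name `RegexWordConverter` and the map are assumptions for illustration, and the accented letters are written as Unicode escapes so the snippet does not depend on file encoding.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; mappings taken from the question's replace chain.
final class RegexWordConverter {

    private static final Map<String, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put("\u00e1", "a");   // a with acute
        REPLACEMENTS.put("\u00e9", "e");   // e with acute
        REPLACEMENTS.put("\u00ed", "i");   // i with acute
        REPLACEMENTS.put("\u00fa", "u");   // u with acute
        REPLACEMENTS.put("\u00fd", "y");   // y with acute
        REPLACEMENTS.put("\u00f0", "d");   // eth
        REPLACEMENTS.put("\u00f3", "o");   // o with acute
        REPLACEMENTS.put("\u00f6", "o");   // o with diaeresis
        REPLACEMENTS.put("\u00e6", "ae");  // ae ligature
        REPLACEMENTS.put("\u00fe", "th");  // thorn
        REPLACEMENTS.put("-", "");         // removed
        REPLACEMENTS.put(".", "");         // removed
        REPLACEMENTS.put("/", "");         // removed
    }

    // One character class containing every key of the map above.
    private static final Pattern SPECIAL = Pattern.compile(
        "[\u00e1\u00e9\u00ed\u00fa\u00fd\u00f0\u00f3\u00f6\u00e6\u00fe\\-./]");

    static String convertWord(String word) {
        Matcher m = SPECIAL.matcher(word.toLowerCase());
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in a replacement from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(REPLACEMENTS.get(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The pattern is compiled once as a static field. replaceAll compiles its regex on every call, which is likely part of the cost in the original version.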

对你真心纯属浪费
#3 · 2020-02-23 06:35

The inefficiency I see is that every replace call rescans the whole string, including characters that have already been replaced, which is wasted work.

I would get the char array of the String, iterate over it once, and run each character through a chain of if-else checks like this:

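char[] array = word.toCharArray();
for (int i = 0; i < array.length; ++i) {
    char currentChar = array[i];
    // char is a primitive, so compare with == (it has no equals() method)
    if (currentChar == 'é')
        array[i] = 'e';
    else if (currentChar == 'ö')
        array[i] = 'o';
    // else if ... (remaining replacements)
}
// Build the result from the modified array
String result = new String(array);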
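One caveat: writing back into the same array only works for one-to-one replacements, because the array cannot grow (for 'þ' to "th") or shrink (for removed characters like '-'). A possible way around that, and around the long if-else chain, is a lookup table indexed by character plus a StringBuilder. The sketch below assumes every special character has a code point below 256, which holds for the ones in the question. The class name `TableWordConverter` is made up, and the accented letters are written as Unicode escapes so the snippet does not depend on file encoding.

```java
// Hypothetical helper class; mappings taken from the question's replace chain.
final class TableWordConverter {

    // TABLE[c] is the replacement for character c, or null if c is kept unchanged.
    private static final String[] TABLE = new String[256];
    static {
        TABLE['\u00e1'] = "a";   // a with acute
        TABLE['\u00e9'] = "e";   // e with acute
        TABLE['\u00ed'] = "i";   // i with acute
        TABLE['\u00fa'] = "u";   // u with acute
        TABLE['\u00fd'] = "y";   // y with acute
        TABLE['\u00f0'] = "d";   // eth
        TABLE['\u00f3'] = "o";   // o with acute
        TABLE['\u00f6'] = "o";   // o with diaeresis
        TABLE['\u00e6'] = "ae";  // ae ligature, one char becomes two
        TABLE['\u00fe'] = "th";  // thorn, one char becomes two
        TABLE['-'] = "";         // removed
        TABLE['.'] = "";         // removed
        TABLE['/'] = "";         // removed
    }

    static String convertWord(String word) {
        String lower = word.toLowerCase();
        StringBuilder sb = new StringBuilder(lower.length());
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            // Characters outside the table range are never replaced
            String mapped = c < TABLE.length ? TABLE[c] : null;
            if (mapped != null) {
                sb.append(mapped);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Adding a new replacement then means adding one table entry rather than another branch.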