Fastest way to iterate over all the chars in a Str

2020-01-23 04:22发布

In Java, what would the fastest way to iterate over all the chars in a String, this:

String str = "a really, really long string";
for (int i = 0, n = str.length(); i < n; i++) {
    char c = str.charAt(i);
}

Or this:

char[] chars = str.toCharArray();
for (int i = 0, n = chars.length; i < n; i++) {
    char c = chars[i];
}

EDIT :

What I'd like to know is if the cost of repeatedly calling the charAt method during a long iteration ends up being either less than or greater than the cost of performing a single call to toCharArray at the beginning and then directly accessing the array during the iteration.

It'd be great if someone could provide a robust benchmark for different string lengths, having in mind JIT warm-up time, JVM start-up time, etc. and not just the difference between two calls to System.currentTimeMillis().

8条回答
ら.Afraid
2楼-- · 2020-01-23 05:03

Looks like niether is faster or slower

    public static void main(String arguments[]) {


        //Build a long string
        StringBuilder sb = new StringBuilder();
        for(int j = 0; j < 10000; j++) {
            sb.append("a really, really long string");
        }
        String str = sb.toString();
        for (int testscount = 0; testscount < 10; testscount ++) {


            //Test 1
            long start = System.currentTimeMillis();
            for(int c = 0; c < 10000000; c++) {
                for (int i = 0, n = str.length(); i < n; i++) {
                    char chr = str.charAt(i);
                    doSomethingWithChar(chr);//To trick JIT optimistaion
                }
            }

            System.out.println("1: " + (System.currentTimeMillis() - start));

            //Test 2
            start = System.currentTimeMillis();
            char[] chars = str.toCharArray();
            for(int c = 0; c < 10000000; c++) {
                for (int i = 0, n = chars.length; i < n; i++) {
                    char chr = chars[i];
                    doSomethingWithChar(chr);//To trick JIT optimistaion
                }
            }
            System.out.println("2: " + (System.currentTimeMillis() - start));
            System.out.println();
        }


    }


    public static void doSomethingWithChar(char chr) {
        int newInt = chr << 2;
    }

For long strings I'll chose the first one. Why copy around long strings? Documentations says:

public char[] toCharArray() Converts this string to a new character array.

Returns: a newly allocated character array whose length is the length of this string and whose contents are initialized to contain the character sequence represented by this string.

//Edit 1

I've changed the test to trick JIT optimisation.

//Edit 2

Repeat test 10 times to let JVM warm up.

//Edit 3

Conclusions:

First of all str.toCharArray(); copies entire string in memory. It can be memory consuming for long strings. Method String.charAt( ) looks up char in char array inside String class checking index before. It looks like for short enough Strings first method (i.e. chatAt method) is a bit slower due to this index check. But if the String is long enough, copying whole char array gets slower, and the first method is faster. The longer the string is, the slower toCharArray performs. Try to change limit in for(int j = 0; j < 10000; j++) loop to see it. If we let JVM warm up code runs faster, but proportions are the same.

After all it's just micro-optimisation.

查看更多
戒情不戒烟
3楼-- · 2020-01-23 05:04

The second one causes a new char array to be created, and all chars from the String to be copied to this new char array, so I would guess that the first one is faster (and less memory-hungry).

查看更多
老娘就宠你
4楼-- · 2020-01-23 05:05

This is just micro-optimisation that you shouldn't worry about.

char[] chars = str.toCharArray();

returns you a copy of str character arrays (in JDK, it returns a copy of characters by calling System.arrayCopy).

Other than that, str.charAt() only checks if the index is indeed in bounds and returns a character within the array index.

The first one doesn't create additional memory in JVM.

查看更多
一夜七次
5楼-- · 2020-01-23 05:08

Despite @Saint Hill's answer if you consider the time complexity of str.toCharArray(),

the first one is faster even for very large strings. You can run the code below to see it for yourself.

        char [] ch = new char[1_000_000_00];
    String str = new String(ch); // to create a large string

    // ---> from here
    long currentTime = System.nanoTime();
    for (int i = 0, n = str.length(); i < n; i++) {
        char c = str.charAt(i);
    }
    // ---> to here
    System.out.println("str.charAt(i):"+(System.nanoTime()-currentTime)/1000000.0 +" (ms)");

    /**
     *   ch = str.toCharArray() itself takes lots of time   
     */
    // ---> from here
    currentTime = System.nanoTime();
    ch = str.toCharArray();
    for (int i = 0, n = str.length(); i < n; i++) {
        char c = ch[i];
    }
    // ---> to  here
    System.out.println("ch = str.toCharArray() + c = ch[i] :"+(System.nanoTime()-currentTime)/1000000.0 +" (ms)");

output:

str.charAt(i):5.492102 (ms)
ch = str.toCharArray() + c = ch[i] :79.400064 (ms)
查看更多
ゆ 、 Hurt°
6楼-- · 2020-01-23 05:11

Just for curiosity and to compare with Saint Hill's answer.

If you need to process heavy data you should not use JVM in client mode. Client mode is not made for optimizations.

Let's compare results of @Saint Hill benchmarks using a JVM in Client mode and Server mode.

Core2Quad Q6600 G0 @ 2.4GHz
JavaSE 1.7.0_40

See also: Real differences between "java -server" and "java -client"?


CLIENT MODE:

len =      2:    111k charAt(i),  105k cbuff[i],   62k new[i],   17k field access.   (chars/ms) 
len =      4:    285k charAt(i),  166k cbuff[i],  114k new[i],   43k field access.   (chars/ms) 
len =      6:    315k charAt(i),  230k cbuff[i],  162k new[i],   69k field access.   (chars/ms) 
len =      8:    333k charAt(i),  275k cbuff[i],  181k new[i],   85k field access.   (chars/ms) 
len =     12:    342k charAt(i),  342k cbuff[i],  222k new[i],  117k field access.   (chars/ms) 
len =     16:    363k charAt(i),  347k cbuff[i],  275k new[i],  152k field access.   (chars/ms) 
len =     20:    363k charAt(i),  392k cbuff[i],  289k new[i],  180k field access.   (chars/ms) 
len =     24:    375k charAt(i),  428k cbuff[i],  311k new[i],  205k field access.   (chars/ms) 
len =     28:    378k charAt(i),  474k cbuff[i],  341k new[i],  233k field access.   (chars/ms) 
len =     32:    376k charAt(i),  492k cbuff[i],  340k new[i],  251k field access.   (chars/ms) 
len =     64:    374k charAt(i),  551k cbuff[i],  374k new[i],  367k field access.   (chars/ms) 
len =    128:    385k charAt(i),  624k cbuff[i],  415k new[i],  509k field access.   (chars/ms) 
len =    256:    390k charAt(i),  675k cbuff[i],  436k new[i],  619k field access.   (chars/ms) 
len =    512:    394k charAt(i),  703k cbuff[i],  439k new[i],  695k field access.   (chars/ms) 
len =   1024:    395k charAt(i),  718k cbuff[i],  462k new[i],  742k field access.   (chars/ms) 
len =   2048:    396k charAt(i),  725k cbuff[i],  471k new[i],  767k field access.   (chars/ms) 
len =   4096:    396k charAt(i),  727k cbuff[i],  459k new[i],  780k field access.   (chars/ms) 
len =   8192:    397k charAt(i),  712k cbuff[i],  446k new[i],  772k field access.   (chars/ms) 

SERVER MODE:

len =      2:     86k charAt(i),   41k cbuff[i],   46k new[i],   80k field access.   (chars/ms) 
len =      4:    571k charAt(i),  250k cbuff[i],   97k new[i],  222k field access.   (chars/ms) 
len =      6:    666k charAt(i),  333k cbuff[i],  125k new[i],  315k field access.   (chars/ms) 
len =      8:    800k charAt(i),  400k cbuff[i],  181k new[i],  380k field access.   (chars/ms) 
len =     12:    800k charAt(i),  521k cbuff[i],  260k new[i],  545k field access.   (chars/ms) 
len =     16:    800k charAt(i),  592k cbuff[i],  296k new[i],  640k field access.   (chars/ms) 
len =     20:    800k charAt(i),  666k cbuff[i],  408k new[i],  800k field access.   (chars/ms) 
len =     24:    800k charAt(i),  705k cbuff[i],  452k new[i],  800k field access.   (chars/ms) 
len =     28:    777k charAt(i),  736k cbuff[i],  368k new[i],  933k field access.   (chars/ms) 
len =     32:    800k charAt(i),  780k cbuff[i],  571k new[i],  969k field access.   (chars/ms) 
len =     64:    800k charAt(i),  901k cbuff[i],  800k new[i],  1306k field access.   (chars/ms) 
len =    128:    1084k charAt(i),  888k cbuff[i],  633k new[i],  1620k field access.   (chars/ms) 
len =    256:    1122k charAt(i),  966k cbuff[i],  729k new[i],  1790k field access.   (chars/ms) 
len =    512:    1163k charAt(i),  1007k cbuff[i],  676k new[i],  1910k field access.   (chars/ms) 
len =   1024:    1179k charAt(i),  1027k cbuff[i],  698k new[i],  1954k field access.   (chars/ms) 
len =   2048:    1184k charAt(i),  1043k cbuff[i],  732k new[i],  2007k field access.   (chars/ms) 
len =   4096:    1188k charAt(i),  1049k cbuff[i],  742k new[i],  2031k field access.   (chars/ms) 
len =   8192:    1157k charAt(i),  1032k cbuff[i],  723k new[i],  2048k field access.   (chars/ms) 

CONCLUSION:

As you can see, server mode is much faster.

查看更多
Emotional °昔
7楼-- · 2020-01-23 05:15

String.toCharArray() creates new char array, means allocation of memory of string length, then copies original char array of string using System.arraycopy() and then returns this copy to caller. String.charAt() returns character at position i from original copy, that's why String.charAt() will be faster than String.toCharArray(). Although, String.toCharArray() returns copy and not char from original String array, where String.charAt() returns character from original char array. Code below returns value at the specified index of this string.

public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}

code below returns a newly allocated character array whose length is the length of this string

public char[] toCharArray() {
    // Cannot use Arrays.copyOf because of class initialization order issues
    char result[] = new char[value.length];
    System.arraycopy(value, 0, result, 0, value.length);
    return result;
}
查看更多
登录 后发表回答