(This post is about high-frequency-trading-style programming.)
I recently saw on a forum (I think they were discussing Java) that if you have to parse a lot of string data, it's better to use a byte array than a String with split(). The post said:
One performance trick that applies in any language (C++, Java, C#) is to avoid object creation. It's not the cost of allocation or GC, it's the cost of accessing large memory arrays that don't fit in the CPU cache.
Modern CPUs are much faster than their memory. They stall for many, many cycles on each cache miss. Most of the CPU's transistor budget is spent reducing this, with large caches and lots of tricks.
GPUs solve the problem differently: they keep lots of threads ready to execute to hide memory-access latency, have little or no cache, and spend the transistors on more cores.
So, for example, rather than using Strings and split() to parse a message, use byte arrays that can be updated in place. You really want to avoid random memory access over large data structures, at least in the inner loops.
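If I understand the advice, it seems to be contrasting something like the following rough sketch I put together. I'm assuming a simple comma-delimited message like "SYM,PRICE,QTY"; the message format, field layout, and names are just made up for illustration:

```java
import java.nio.charset.StandardCharsets;

public class ParseComparison {

    // String-based parsing: split() allocates a String[] plus one String
    // per field on every call, so each message produces short-lived garbage.
    static long parseQtyWithSplit(String message) {
        String[] fields = message.split(",");
        return Long.parseLong(fields[2]);
    }

    // Byte-array parsing: scan the buffer in place and accumulate the
    // numeric field directly, allocating nothing per message.
    static long parseQtyInPlace(byte[] message) {
        int commas = 0;
        long qty = 0;
        for (int i = 0; i < message.length; i++) {
            byte b = message[i];
            if (b == ',') {
                commas++;
            } else if (commas == 2) {      // third field = quantity
                qty = qty * 10 + (b - '0');
            }
        }
        return qty;
    }

    public static void main(String[] args) {
        String msg = "ACME,101.25,500";
        byte[] raw = msg.getBytes(StandardCharsets.US_ASCII);

        System.out.println(parseQtyWithSplit(msg));   // 500
        System.out.println(parseQtyInPlace(raw));     // 500
    }
}
```

(I'm guessing that in a real system the byte[] would be a reusable buffer filled directly from the network, which is presumably what "updated in place" refers to.)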
Is he just saying "don't use strings because they're objects and creating objects is costly"? Or is he saying something else?
Does using a byte array ensure the data stays in the cache for as long as possible? When you use a String, is it too large to be held in the CPU cache? Generally, are primitive data types the best choice for writing faster code?