I used reflection to look at the internal fields of System.String and I found three fields:
m_arrayLength
m_stringLength
m_firstChar
I don't understand how this works.
m_arrayLength is the length of some array. Where is this array? It's apparently not a member field of the string class.
m_stringLength makes sense. It's the length of the string.
m_firstChar is the first character in the string.
So my question is where are the rest of the characters for the string? Where are the contents of the string stored if not in the string class?
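For anyone who wants to reproduce the field listing above, here is a minimal reflection sketch. The class and method names (StringFields, Describe) are made up for illustration, and the fields it prints depend on the runtime version, so the names and count may not match the three listed in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

static class StringFields
{
    // Returns "FieldType FieldName" for every instance field on System.String,
    // including non-public ones. Hypothetical helper, not part of any library.
    public static List<string> Describe()
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        var lines = new List<string>();
        foreach (FieldInfo field in typeof(string).GetFields(flags))
        {
            lines.Add(field.FieldType.Name + " " + field.Name);
        }
        return lines;
    }

    static void Main()
    {
        foreach (string line in Describe())
        {
            Console.WriteLine(line);
        }
    }
}
```

Static fields such as String.Empty are deliberately left out by not passing BindingFlags.Static.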
The correct answer on the difference between string and System.String is here: string vs System.String
It says nothing about native implementations.
I'd be thinking immediately that m_firstChar is not the first character, but rather a pointer to the first character. That would make much more sense (although, since I'm not privy to the source, I can't be certain).
It makes little sense to store the first character of a string unless you want a blindingly fast s.Substring(0, 1) operation :-) There's a good chance the characters themselves (that the three fields allude to) will be allocated separately from the actual object.
Much of the implementation of System.String is in native code (C/C++) and not in managed code (C#). If you take a look at the decompiled code you'll see that most of the "interesting" or "core" methods are decorated with this attribute:
Only some of the helper/convenience APIs are implemented in C#.
So where are the characters for the string stored? It's top secret! Deep down inside the CLR's core native code implementation.
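One way to see which members are implemented by the runtime rather than in C# is to inspect their implementation flags through reflection. The sketch below looks for MethodImplAttributes.InternalCall; that this is the marker the answer above refers to is an assumption on my part, and the helper name (RuntimeImplementedMembers.Find) is hypothetical. The list it prints will differ between runtime versions.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

static class RuntimeImplementedMembers
{
    // Collects constructors and methods declared on 'type' whose
    // implementation flags include InternalCall.
    public static List<MethodBase> Find(Type type)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Static |
            BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.DeclaredOnly;

        var candidates = new List<MethodBase>();
        candidates.AddRange(type.GetConstructors(flags));
        candidates.AddRange(type.GetMethods(flags));

        var found = new List<MethodBase>();
        foreach (MethodBase member in candidates)
        {
            MethodImplAttributes impl = member.GetMethodImplementationFlags();
            if ((impl & MethodImplAttributes.InternalCall) != 0)
            {
                found.Add(member);
            }
        }
        return found;
    }

    static void Main()
    {
        foreach (MethodBase member in Find(typeof(string)))
        {
            Console.WriteLine(member);
        }
    }
}
```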
The first char provides access (via &m_firstChar) to the address in memory of the first character in the buffer. The length tells it how many characters are in the string, making .Length efficient (better than looking for a nul char). Note that strings can be oversized (especially if created with StringBuilder, and in a few other scenarios), so sometimes the actual buffer is longer than the string, and it is important to track this. StringBuilder, for example, actually mutates a string within its buffer, so it needs to know how much it can add before having to create a larger buffer (see AppendInPlace, for example).
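Both ideas can be observed from ordinary C# without touching the private fields. This is only a sketch: the names StringBufferSketch and CopyViaPointer are hypothetical, and it needs to be compiled with unsafe code enabled. The fixed statement yields a pointer to the string's first character, and the remaining characters are read by indexing from that pointer. The second half shows a StringBuilder whose capacity is larger than its current length; how that buffer is implemented internally varies by runtime version, so treat it as an illustration of the length-versus-capacity distinction, not of the internals described above.

```csharp
using System;
using System.Text;

static class StringBufferSketch
{
    // Reads the string back one char at a time through a pointer
    // to its first character.
    public static unsafe string CopyViaPointer(string s)
    {
        var copy = new StringBuilder(s.Length);
        fixed (char* first = s)
        {
            for (int i = 0; i < s.Length; i++)
            {
                // first[i] is the char stored i positions after the first one.
                copy.Append(first[i]);
            }
        }
        return copy.ToString();
    }

    static void Main()
    {
        Console.WriteLine(CopyViaPointer("hello"));

        // Reserve room for 64 chars but only use 5 of them.
        var sb = new StringBuilder(64);
        sb.Append("hello");
        Console.WriteLine("Length = " + sb.Length + ", Capacity = " + sb.Capacity);
    }
}
```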