Why are C#/.Net strings length-prefixed and null t

Posted 2020-06-08 13:52

Question:

After reading "What's the rationale for null terminated strings?" and some similar questions, I have found that in C#/.NET strings are, internally, both length-prefixed and null-terminated, like the BSTR data type.

What is the reason strings are both length-prefixed and null-terminated instead of, e.g., only length-prefixed?

Answer 1:

Length prefixed so that computing length is O(1).

Null terminated to make marshaling to unmanaged code blazing fast (unmanaged code likely expects null-terminated strings).
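
To make the two properties concrete, here is a minimal C++ sketch of a buffer that is both length-prefixed and null-terminated. The type PrefixedString and the helper scan_length are hypothetical names made up for this illustration; this is not the CLR's actual object layout, only a model of the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration type; NOT the real .NET string layout.
struct PrefixedString {
    std::size_t length;           // stored length, so reading it is O(1)
    std::vector<char16_t> chars;  // `length` code units plus one trailing 0

    explicit PrefixedString(const std::u16string& s)
        : length(s.size()), chars(s.begin(), s.end()) {
        chars.push_back(u'\0');   // terminator is not counted in `length`
    }

    // O(1): just returns the stored field.
    std::size_t size() const { return length; }

    // Pointer a C-style consumer can read directly, with no copy.
    const char16_t* c_str() const {
        assert(chars.size() == length + 1);  // invariant: terminator present
        return chars.data();
    }
};

// O(n): what a consumer of a terminator-only string has to do.
std::size_t scan_length(const char16_t* p) {
    std::size_t n = 0;
    while (p[n] != u'\0') ++n;
    return n;
}
```

With something like PrefixedString s(u"hello"), s.size() answers from the stored field, while scan_length(s.c_str()) has to walk the buffer until it hits the terminator.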



Answer 2:

Here is an excerpt from Jon Skeet's Blog Post about strings:

Although strings aren't null-terminated as far as the API is concerned, the character array is null-terminated, as this means it can be passed directly to unmanaged functions without any copying being involved, assuming the inter-op specifies that the string should be marshalled as Unicode.



Answer 3:

Most likely, to ensure easy interoperability with COM.



Answer 4:

While the length field makes it easy for the framework to determine the length of a string (and it lets a string contain characters with a zero value), there's an awful lot of stuff that the framework (or user programs) needs to deal with that expects null-terminated strings.

Like the Win32 API, for example.

So it's convenient to keep a null terminator at the end of the string data because it's likely going to need to be there quite often anyway.

Note that C++'s std::string class is implemented the same way (in MSVC anyway). For the same reason, I'm sure (c_str() is often used to pass a std::string to something that wants a C-style string).
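
The std::string comparison can be checked with a short sketch. Since C++11 the standard requires c_str() to return a null-terminated array, so this is not specific to MSVC. The helper names make_with_embedded_null and c_view_length are hypothetical, introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds "ab\0cd", a string with an embedded zero character.
std::string make_with_embedded_null() {
    std::string s("ab\0cd", 5);  // the explicit count keeps the '\0' in the data
    assert(s.size() == 5);       // size() comes from the stored length
    return s;
}

// Hypothetical helper: the length a C-style consumer of c_str() would see.
std::size_t c_view_length(const std::string& s) {
    return std::strlen(s.c_str());  // stops at the first '\0'
}
```

For the string built above, size() reports 5 because of the stored length, whereas a C-style consumer that only follows the terminator stops after "ab". That is the trade-off in this answer: the length field allows zero-valued characters, and the trailing terminator keeps the buffer usable by APIs that expect C-style strings.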



Answer 5:

Best guess is that finding the length is constant (O(1)) compared to traversing it, running in O(n).