How are strings passed in .NET?

Posted 2019-01-02 14:23

Question:

When I pass a string to a function, is a pointer to the string's contents passed, or is the entire string passed to the function on the stack like a struct would be?

Answer 1:

To answer your question, consider the following code:

void Main()
{
    string strMain = "main";
    DoSomething(strMain);
    Console.Write(strMain); // What gets printed?
}
void DoSomething(string strLocal)
{
    strLocal = "local";
}

There are three things you need to know in order to predict what will happen here, and to understand why it does.

  1. Strings are reference types in C#. But this is only part of the picture.
  2. They are also immutable, so any time you do something that looks like you're changing the string, you aren't. A completely new string gets created and the reference is pointed at it. The old one stays as it was and becomes eligible for garbage collection once nothing else refers to it.
  3. Even though strings are reference types, strMain isn't passed by reference. It's a reference type, but the reference is being passed by value. This is a tricky distinction, but it's a crucial one. Any time you pass a parameter without the ref keyword (or its relatives out and in), you've passed it by value.

But what does that mean?

Passing reference types by value: You're already doing it

There are two groups of data types in C#: reference types and value types. There are also two ways to pass parameters in C#: by reference and by value. These sound the same and are easily confused. They are NOT the same thing!

If you pass a parameter of ANY type, and you don't use the ref keyword, then you've passed it by value. If you've passed it by value, what you really passed was a copy. But if the parameter was a reference type, then the thing you copied was the reference, not whatever it was pointing at.
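To make the distinction concrete, here is a minimal sketch that contrasts the two ways of passing the same reference type. The class and method names (PassingDemo, ReassignByValue, ReassignByRef) are made up for illustration and do not come from the original question.

```csharp
using System;

static class PassingDemo
{
    // Default (by value): the method receives a copy of the reference.
    public static void ReassignByValue(string s)
    {
        s = "local"; // re-points only the local copy
    }

    // With ref: the method receives an alias for the caller's variable.
    public static void ReassignByRef(ref string s)
    {
        s = "local"; // re-points the caller's variable itself
    }

    public static void Main()
    {
        string a = "main";
        ReassignByValue(a);
        Console.WriteLine(a); // main

        string b = "main";
        ReassignByRef(ref b);
        Console.WriteLine(b); // local
    }
}
```

In both calls the thing being passed is a reference to a string. The only difference is whether the callee gets its own copy of that reference or direct access to the caller's variable.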

Here's the first line of our Main method:

string strMain = "main";

There are actually two things we've created on this line: a string with the value main stored off in memory somewhere, and a reference variable called strMain pointing to it.

DoSomething(strMain);

Now we pass that reference to DoSomething. We've passed it by value, so that means we made a copy. But it's a reference type, so that means we copied the reference, not the string itself. Now we have two references that each point to the same value in memory.

Inside the callee

Here's the top of the DoSomething method:

void DoSomething(string strLocal)

No ref keyword, as usual. So strLocal isn't strMain, but they both point to the same place. If we "change" strLocal, like this...

strLocal = "local";   

...we haven't changed the stored value, per se. We've re-pointed the reference. We took the reference called strLocal and aimed it at a brand new string. What happens to strMain when we do that? Nothing. It's still pointing at the old string!

string strMain = "main"; //Store a string, create a reference to it
DoSomething(strMain);    //Reference gets copied, copy gets re-pointed
Console.Write(strMain);  //The original string is still "main" 
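One way to check this walk-through is object.ReferenceEquals, which reports whether two references point at the same object. The sketch below is a variation on DoSomething, not the original method: it takes a second parameter (callersReference, a hypothetical name) so it can compare against the caller's reference, and it returns what it observed.

```csharp
using System;

static class ReferenceCopyDemo
{
    // Reports whether strLocal refers to the caller's object
    // before and after strLocal is re-pointed.
    public static (bool SameBefore, bool SameAfter) DoSomething(
        string strLocal, string callersReference)
    {
        bool sameBefore = object.ReferenceEquals(strLocal, callersReference);
        strLocal = "local"; // re-point the copy at a different string
        bool sameAfter = object.ReferenceEquals(strLocal, callersReference);
        return (sameBefore, sameAfter);
    }

    public static void Main()
    {
        string strMain = "main";
        var result = DoSomething(strMain, strMain);
        Console.WriteLine(result.SameBefore); // True
        Console.WriteLine(result.SameAfter);  // False
        Console.WriteLine(strMain);           // main
    }
}
```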

Immutability is important

Let's change the scenario for a second. Imagine we aren't working with strings, but some mutable reference type, like a class you've created.

class MutableThing
{
    public int ChangeMe { get; set; }
}

If you follow the reference objLocal to the object it points to, you can change its properties:

void DoSomething(MutableThing objLocal)
{
     objLocal.ChangeMe = 0;
} 

There's still only one MutableThing in memory, and both the copied reference and the original reference still point to it. The properties of the MutableThing itself have changed:

void Main()
{
    var objMain = new MutableThing();
    objMain.ChangeMe = 5; 
    Console.Write(objMain.ChangeMe);  //it's 5 on objMain

    DoSomething(objMain);             //now it's 0 on objLocal
    Console.Write(objMain.ChangeMe);  //it's also 0 on objMain   
}
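Mutating the object is not the same as re-pointing the parameter, even for a mutable type. A rough sketch of the difference, where Mutate and Reassign are illustrative names rather than anything from the original answer:

```csharp
using System;

class MutableThing
{
    public int ChangeMe { get; set; }
}

static class MutateVersusReassign
{
    // Follows the copied reference to the shared object and changes it.
    public static void Mutate(MutableThing objLocal)
    {
        objLocal.ChangeMe = 0;
    }

    // Re-points only the local copy; the caller's object is untouched.
    public static void Reassign(MutableThing objLocal)
    {
        objLocal = new MutableThing { ChangeMe = 99 };
    }

    public static void Main()
    {
        var objMain = new MutableThing { ChangeMe = 5 };

        Reassign(objMain);
        Console.WriteLine(objMain.ChangeMe); // 5

        Mutate(objMain);
        Console.WriteLine(objMain.ChangeMe); // 0
    }
}
```

Reassign behaves exactly like strLocal = "local" did earlier: the caller never sees it.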

Ah, but...

...strings are immutable! There's no ChangeMe property to set. You can't do strLocal[3] = 'H'; like you could with a C-style char array; you have to construct a whole new string instead. The only way to change strLocal is to point the reference at another string, and that means nothing you do to strLocal can affect strMain. The value is immutable, and the reference is a copy.

So even though strings are reference types, passing them by value means whatever goes on in the callee won't affect the string in the caller. But since they are reference types, you don't have to copy the entire string in memory when you want to pass it around.
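As a sketch of what "construct a whole new string" can look like for the strLocal[3] = 'H' case, the helper below copies the characters into an array, changes one, and builds a new string from the result. WithCharAt is a hypothetical helper written for this example, not a framework method.

```csharp
using System;

static class ReplaceCharDemo
{
    // Returns a new string with the character at index replaced.
    public static string WithCharAt(string source, int index, char replacement)
    {
        char[] chars = source.ToCharArray(); // mutable copy of the characters
        chars[index] = replacement;
        return new string(chars);            // a brand new string object
    }

    public static void Main()
    {
        string strMain = "main";
        string changed = WithCharAt(strMain, 3, 'H');
        Console.WriteLine(changed); // maiH
        Console.WriteLine(strMain); // main
    }
}
```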



Answer 2:

Strings in C# are immutable reference types. This means that references to them are passed around (by value), and once a string is created, you cannot modify it. Methods that appear to modify a string (Substring, Trim, and so on) return a new string and leave the original untouched.
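A small sketch of that behavior with the built-in Trim and ToUpperInvariant methods. The Shout helper and the sample text are assumptions made for the example.

```csharp
using System;

static class ImmutableStringDemo
{
    // Hypothetical helper: builds a trimmed, upper-cased string from its input.
    public static string Shout(string s)
    {
        return s.Trim().ToUpperInvariant();
    }

    public static void Main()
    {
        string original = "  Hello  ";
        string shouted = Shout(original);

        Console.WriteLine(shouted);              // HELLO
        Console.WriteLine("[" + original + "]"); // [  Hello  ]
        Console.WriteLine(object.ReferenceEquals(original, shouted)); // False
    }
}
```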



Answer 3:

Strings are a special case. Each instance is immutable, so when you "change" the value of a string you are actually allocating a new string in memory.

Only the reference is passed to your function, and any edit made there produces a new instance instead of modifying the old one.


