I am trying to understand the following excerpt from an official blog post about new features in C# 7.0 concerned with ref returns.
You can only return refs that are “safe to return”: Ones that were passed to you, and ones that point into fields in objects.
Ref locals are initialized to a certain storage location, and cannot be mutated to point to another.
Unfortunately, the blog post does not give any code examples. I would greatly appreciate it if someone could shed more light on the two restrictions quoted above, with practical examples and an explanation.
Thanks in advance.
To pass something by reference, it must be classified as a variable. The C# specification (§5 Variables) defines seven categories of variables: static variables, instance variables, array elements, value parameters, reference parameters, output parameters and local variables.
The first point is actually saying that you can ref return variables classified as reference parameters, output parameters, static variables and instance variables.
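A minimal sketch of those "safe to return" sources, assuming a hypothetical `Counter` class (none of these names come from the blog post). Output parameters are left out here because newer compilers treat them differently:

```csharp
using System;

class Counter
{
    public int Value;          // instance field of a reference type
    public static int Total;   // static field

    // OK: ref to a field of an object that lives on the heap.
    public ref int GetValue() => ref Value;

    // OK: ref to a static field.
    public static ref int GetTotal() => ref Total;

    // OK: ref that was passed in by the caller.
    public static ref int Same(ref int x) => ref x;
}

static class Program
{
    static void Main()
    {
        var c = new Counter();
        c.GetValue() = 5;              // assigns through the returned ref
        Counter.GetTotal() += 2;

        int local = 1;
        Counter.Same(ref local) = 10;  // writes to the caller's own variable

        Console.WriteLine($"{c.Value} {Counter.Total} {local}"); // 5 2 10
    }
}
```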
Note that for instance fields of value types, you must consider the "safe to return" status of the enclosing variable. Returning such a field is not always allowed, unlike instance fields of reference types, which can always be returned:
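To illustrate the difference, here is a small sketch with a hypothetical `Point` struct. The field is returnable when the struct itself came in by `ref`, but not when the struct is a local:

```csharp
using System;

struct Point
{
    public int X;
}

static class Program
{
    // OK: 'p' is a ref parameter, so its field 'X' is safe to return.
    static ref int GetX(ref Point p) => ref p.X;

    static int ReadLocalX()
    {
        Point p = new Point { X = 3 };
        // In a ref-returning method, 'return ref p.X;' would be rejected:
        // the field lives inside a local that dies when the method returns.
        // Returning the value itself is fine.
        return p.X;
    }

    static void Main()
    {
        var pt = new Point { X = 1 };
        GetX(ref pt) = 42;                // writes into the caller's struct
        Console.WriteLine(pt.X);          // 42
        Console.WriteLine(ReadLocalX());  // 3
    }
}
```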
The result of a ref-returning method is considered "safe to return" if the method is not passed any arguments that are themselves not "safe to return":
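A sketch of this rule, using the `ReturnFirst` and `Test4` names from the discussion. The method bodies and `Test1` are my own assumptions about what such code might look like, and the rejected line is commented out so the sample compiles:

```csharp
using System;

static class Program
{
    static ref int ReturnFirst(ref int first, ref int ignored) => ref first;

    // OK: both arguments are safe to return, so the result is too.
    static ref int Test1(int[] array) =>
        ref ReturnFirst(ref array[0], ref array[1]);

    static ref int Test4(int[] array)
    {
        int localVariable = 0;
        // Rejected, because one argument refers to a local:
        //   return ref ReturnFirst(ref array[0], ref localVariable);

        // Using the result inside the method is still fine.
        ReturnFirst(ref array[0], ref localVariable) = 7;
        return ref array[0];
    }

    static void Main()
    {
        var a = new[] { 1, 2 };
        Test1(a) = 5;
        Console.WriteLine(a[0]);  // 5
        Test4(a) += 1;            // Test4 stores 7, then the caller adds 1
        Console.WriteLine(a[0]);  // 8
    }
}
```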
Although we know that `ReturnFirst(ref array[0], ref localVariable)` will return a "safe to return" reference (`ref array[0]`), the compiler cannot infer this by analyzing the `Test4` method in isolation. So the result of the `ReturnFirst` method is, in that case, considered not "safe to return".

The second point says that a ref local variable declaration must include an initializer:
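A minimal sketch of the initializer requirement (the variable names are illustrative only):

```csharp
using System;

static class Program
{
    static void Main()
    {
        int[] data = { 10, 20, 30 };

        // ref int missing;            // rejected: a ref local needs an initializer
        ref int second = ref data[1];  // bound to data[1] at declaration

        second = 99;                   // writes through to the array element
        Console.WriteLine(data[1]);    // 99
    }
}
```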
Also, a ref local variable cannot be reassigned to point to another storage location:
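As a rough illustration, an ordinary assignment to a ref local never rebinds it; it only copies a value into the location it already refers to. The names are hypothetical:

```csharp
using System;

static class Program
{
    static void Main()
    {
        int a = 1, b = 2;
        ref int r = ref a;   // r is an alias for a

        r = b;               // copies the value of b into a; r still aliases a
        b = 50;              // does not affect r

        // 'r = ref b;' is rejected by a C# 7.0 compiler.
        Console.WriteLine($"{a} {b} {r}"); // 2 50 2
    }
}
```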
In C# 7.0 there is actually no valid syntax for reassigning a ref local variable (the `x = ref y;` form was only added later, in C# 7.3).
You've got some answers that clarify the restriction, but not the reasoning behind the restriction.
The reasoning behind the restriction is that we must never allow an alias to a dead variable. If you have an ordinary local in an ordinary method, and you return a ref to it, then the local is dead by the time the ref is used.
Now, one might point out that a local that is returned by ref could be hoisted to a field of a closure class. Yes, that would solve the problem. But the point of the feature is to allow developers to write high-performance close-to-the-machine low-cost mechanisms, and automatically hoisting to a closure -- and then taking on the burdens of collection pressure and so on -- works against that goal.
Things can get slightly tricky. Consider:
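A sketch of the kind of code in question, assuming methods named `X` and `Z` with a local `z` (the bodies are my own guess at the shape). The problematic return is described in a comment so the sample compiles:

```csharp
using System;

static class Program
{
    // Perfectly legal on its own: returns the ref it was given.
    static ref int X(ref int y) => ref y;

    static int Z()
    {
        int z = 123;
        // If Z were declared 'ref int Z()', then 'return ref X(ref z);'
        // would hand the caller an alias to 'z' after 'z' is gone.
        return X(ref z);   // returning the value is fine
    }

    static void Main() => Console.WriteLine(Z()); // 123
}
```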
Here we are returning a ref to local z in a sneaky manner! This also has to be illegal. But now consider this:
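A hypothetical variant where the returned ref has a different type from the ref passed in, so it cannot possibly alias the local. The `Shared` field is an assumption for illustration. As far as I know, the compilers that shipped do not reason about types here and would still reject `return ref X(ref z);` in a ref-returning `Z`:

```csharp
using System;

static class Program
{
    static double Shared = 2.0;

    // Takes a ref int but returns a ref to an unrelated double.
    static ref double X(ref int y)
    {
        y++;
        return ref Shared;
    }

    static double Z()
    {
        int z = 123;
        // 'return ref X(ref z);' in a 'ref double Z()' would in fact be
        // harmless here, since the result cannot refer to 'z'.
        return X(ref z);
    }

    static void Main() => Console.WriteLine(Z());
}
```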
Now we can know that the returned ref is not the ref to `z`. So can we say that this is legal if the types of the refs passed in are all different from the types of the refs returned?

What about this?
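A sketch of how differing types can still hide an alias, assuming a hypothetical struct `S` with an `int` field. The ref passed in is a `ref S`, the ref returned is a `ref int`, yet it points inside the local:

```csharp
using System;

struct S
{
    public int s;
}

static class Program
{
    // Types differ (ref S in, ref int out), but the result aliases part of 'y'.
    static ref int X(ref S y) => ref y.s;

    static int Z()
    {
        S z = default(S);
        z.s = 7;
        // 'return ref X(ref z);' in a 'ref int Z()' would return a ref
        // into the local 'z', so a type-based rule is not enough.
        return X(ref z);
    }

    static void Main() => Console.WriteLine(Z()); // 7
}
```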
Now once again we have returned a ref to a dead variable.
When we designed this feature for the first time (in 2010 IIRC) there were a number of complicated proposals to deal with these situations, but my favourite proposal was simply "make all of them illegal". You don't get to return a reference that you got returned by a ref-returning method, even if there is no way it could be dead.
I don't know what rule the C# 7 team ended up implementing.
You can find a great discussion about this feature at GitHub - Proposal: Ref Returns and Locals.
The following example shows the return of a safe reference, because it comes from the caller:
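A minimal sketch of that case, with a hypothetical `Larger` method. Both refs come from the caller, so whichever one is returned still refers to live storage:

```csharp
using System;

static class Program
{
    // Both parameters are refs supplied by the caller, so either is safe to return.
    static ref int Larger(ref int a, ref int b)
    {
        if (a >= b) return ref a;
        return ref b;
    }

    static void Main()
    {
        int x = 3, y = 8;
        Larger(ref x, ref y) = 0;          // overwrites the larger of the two
        Console.WriteLine($"{x} {y}");     // 3 0
    }
}
```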
Conversely, a non-safe version of this example would be returning a reference to a local (this code would not compile):
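For comparison, a sketch of the unsafe shape with the offending line commented out, plus one way to make it legal by copying the value into storage that outlives the call. The `Store` array is an assumption for illustration:

```csharp
using System;

static class Program
{
    static readonly int[] Store = new int[1];

    static ref int Make()
    {
        int local = 42;
        // return ref local;   // rejected: 'local' dies when Make returns

        Store[0] = local;      // copy the value to heap storage instead
        return ref Store[0];   // array elements are safe to return
    }

    static void Main()
    {
        ref int r = ref Make();
        r++;
        Console.WriteLine(Store[0]); // 43
    }
}
```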
The restriction means that a ref local must always be initialized at its declaration. A declaration without an initializer, such as `ref int r;`, would not compile. In C# 7.0 you also cannot assign a new reference to an already existing ref local, as in `r = ref other;`.
The other answers on this page are complete and useful, but I wanted to add an additional point, which is that `out` parameters, which your function is required to fully initialize, count as "safe to return" for the purposes of ref return in C# 7.

Interestingly, combining this fact with another new C# 7 feature, inline declaration of 'out' variables, allows for the simulation of a general-purpose inline declaration of local variables capability:
helper function:
With this helper, the caller specifies the initialization of the "inline local variable" by assigning to the ref-return of the helper.
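A sketch of what such a helper and one use of it might look like, reconstructed from the description rather than copied from the original answer. The `Helpers` class and the `MyObj` layout are assumptions. Note that this relies on C# 7.x-era rules: as far as I know, from C# 11 on `out` parameters are implicitly scoped and can no longer be returned by ref without extra annotation.

```csharp
using System;

static class Helpers
{
    // Initializes the out parameter and returns a ref to it, so the call
    // can sit on the left-hand side of an assignment.
    // Assumption: compiled with a C# 7.x through C# 10 language version.
    public static ref T local<T>(out T t)
    {
        t = default(T);
        return ref t;
    }
}

class MyObj : IComparable<MyObj>
{
    public int offs, size;

    // 'd' is declared inline, receives the first-level result,
    // and is reused if that result is conclusive.
    public int CompareTo(MyObj other) =>
        (Helpers.local(out int d) = offs - other.offs) == 0 ? size - other.size : d;
}

static class Program
{
    static void Main()
    {
        var a = new MyObj { offs = 1, size = 5 };
        var b = new MyObj { offs = 1, size = 9 };
        var c = new MyObj { offs = 3, size = 0 };
        Console.WriteLine(a.CompareTo(b)); // -4 (falls back to size)
        Console.WriteLine(a.CompareTo(c)); // -2 (decided by offs)
    }
}
```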
To demonstrate the helper, next is an example of a simple two-level comparison function which would be typical for an (e.g.) `IComparable<MyObj>.CompareTo` implementation on `MyObj`. Although very simple, this type of expression can't get around needing a single local variable (without duplicating work, that is). Now normally, needing a local would block using an expression-bodied member, which is what we'd like to do here, but the problem is easily solved with the above helper:

Walkthrough: Local variable `d` is "inline-declared," and initialized with the result of computing the first-level compare, based on the offs fields. If this result is inconclusive, we fall back to returning a second-level sort (based on the size fields). But in the alternative, we do still have the first-level result available to return, since it was saved in local `d`.

Note that the above can also be done without the helper function, via C# 7 pattern matching:
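One way the pattern-matching variant could be written, as a sketch under the same assumed `MyObj` layout. The `is int d` pattern always matches an `int` expression, so it serves purely to name the intermediate result:

```csharp
using System;

class MyObj : IComparable<MyObj>
{
    public int offs, size;

    // 'd' holds the first-level result; it is only read in the branch
    // where the pattern is known to have matched.
    public int CompareTo(MyObj other) =>
        (offs - other.offs) is int d && d != 0 ? d : size - other.size;
}

static class Program
{
    static void Main()
    {
        var a = new MyObj { offs = 1, size = 5 };
        var b = new MyObj { offs = 1, size = 9 };
        var c = new MyObj { offs = 3, size = 0 };
        Console.WriteLine(a.CompareTo(b)); // -4
        Console.WriteLine(a.CompareTo(c)); // -2
    }
}
```

The conditional is arranged so that `d` is used only where the compiler can prove it was assigned.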
include at the top of your source files:
The following examples show declaring a local variable inline with its initialization in C# 7. If initialization is not provided, it obtains `default(T)`, as assigned by the `local<T>(out T t)` helper function. This is only now possible with the `ref return` feature, since `ref return` methods are the only methods that can be used as an ℓ-value.

example 1:
example 2:
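A hedged sketch of both kinds of usage, assuming the `local<T>` helper has the shape described earlier and using a hypothetical `Dialog` class as a stand-in for a real dialog type. As before, this assumes a C# 7.x through C# 10 language version, since newer compilers restrict returning `out` parameters by ref:

```csharp
using System;

// Hypothetical stand-in for a UI dialog class.
class Dialog
{
    public string FileName = "report.txt";
    public bool ShowDialog() => true;
}

static class Program
{
    // Assumed shape of the helper: initialize the out parameter, return a ref to it.
    static ref T local<T>(out T t)
    {
        t = default(T);
        return ref t;
    }

    static void Main()
    {
        // An integer literal assigned to an inline-declared local, then reused.
        int sum = (local(out int i) = 5) + i;
        Console.WriteLine(sum);                // 10

        // The value of the assignment expression is used to call an instance
        // method; 'dlg' stays available by name afterwards.
        if ((local(out Dialog dlg) = new Dialog()).ShowDialog())
            Console.WriteLine(dlg.FileName);   // report.txt
    }
}
```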
The first example trivially assigns from an integer literal. In the second example, the inline local `dlg` is assigned from a constructor (`new` expression), and then the entire assignment expression is used for its resolved value to call an instance method (`ShowDialog`) on the newly created instance. For precise clarity as a standalone example, it finishes by showing that the referred instance of `dlg` did indeed need to be named as a variable, in order to fetch one of its properties.

[edit:] Regarding...
...it would certainly be nice to have a `ref` variable with a mutable referent, since this would help avoid expensive indexing bounds checks within loop bodies. Of course, that's also precisely why it's not allowed. You probably can't get around this (i.e. `ref` to an array access expression with indexing containing `ref` won't work; it gets permanently resolved to the element at the referenced position when initialized) but if it helps, note that you can take a `ref` to a pointer, and this includes ref local:

The point of this admittedly pointless example code is that, although we didn't alter ref local variable `rpi` itself in any way (since 'ya can't), it does now have a different (ultimate) referent.

More seriously, what ref local does now allow for, as far as tightening up the IL in array-indexing loop bodies, is a technique I call value-type stamping. This allows for efficient IL in loop bodies which need to access multiple fields of each element in an array of value-types. Typically, this has been a trade-off between external initialization (`newobj`/`initobj`) followed by a single indexing access versus in-situ non-initialization but with the expense of redundant multiple runtime indexing.

With value-type stamping however, now we can entirely avoid per-element `initobj`/`newobj` IL instructions and also have just a single indexing computation at runtime. I'll show the example first, and then describe the technique in general below.

The example shows a concise yet extreme use of the value-type stamping technique; you can discern its twist (given away in a comment) on your own if you're interested. In what follows, I'll instead discuss the value-type stamping technique in more general terms.
First, prepare ref locals with references directly to the relevant fields in a staging instance of the value-type. This can be either on the stack, or, as shown in the example, co-opted from the last-to-be-processed element of the target array itself. It may be valuable to have a `ref` to the entire staging instance as well, especially if using the co-opting technique.

Each iteration of the loop body can then prepare the staging instance very efficiently, and as a final step when ready, "stamp" it wholesale into the array with only a single indexing operation. Of course, if the final element of the array was co-opted as the staging instance, then you can also leave the loop slightly earlier.
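The general steps above can be sketched as follows. This is my own minimal illustration of the described technique, not the original example; the `Entry` struct and `Build` method are hypothetical:

```csharp
using System;

struct Entry
{
    public int Id;
    public int Square;
}

static class Program
{
    static Entry[] Build(int n)
    {
        var arr = new Entry[n];
        if (n == 0) return arr;

        // Co-opt the last element as the staging instance and take
        // ref locals directly to the fields that each iteration fills in.
        ref Entry stage = ref arr[n - 1];
        ref int id = ref stage.Id;
        ref int square = ref stage.Square;

        // The loop stops one element early because the staging slot
        // is itself the final element.
        for (int i = 0; i < n - 1; i++)
        {
            id = i;
            square = i * i;
            arr[i] = stage;   // "stamp": one indexing operation per element
        }

        // Leave the final values in place in the staging slot.
        id = n - 1;
        square = (n - 1) * (n - 1);
        return arr;
    }

    static void Main()
    {
        foreach (var e in Build(3))
            Console.WriteLine($"{e.Id} {e.Square}");
        // 0 0
        // 1 1
        // 2 4
    }
}
```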