Example:
// Potentially large struct.
struct Foo
{
public int A;
public int B;
// etc.
}
Foo[] arr = new Foo[100];
If Foo is a 100 byte structure, how many bytes will be copied in memory during execution of the following statement:
int x = arr[0].A;
That is, is arr[0] evaluated to some temporary variable (a 100 byte copy of an instance of Foo), followed by the copying of .A into variable x (a 4 byte copy)?
Or is some combination of the compiler, JITer and CLR able to optimise this statement such that the 4 bytes of A are copied directly into x?
If an optimisation is performed, does it still hold when the items are held in a List<Foo>, or when an array is passed as an IList<Foo> or an ArraySegment<Foo>?
The entire struct is already in memory. When you access arr[0].A, you aren't copying the struct, and no new memory is needed. You're looking up an object reference (which might be on the call stack, though a struct can also be stored inside a reference type on the heap) to find the location of arr[0], adjusting for the offset of the A field, and then accessing only that integer. There is no need to read the full struct just to get A.
Neither List<Foo> nor ArraySegment<Foo> really changes anything important here so far.
However, if you were to pass arr[0] to a function or assign it to a new variable, that would result in copying the Foo object. This is one difference between a struct (value type) and a class (reference type) in .NET; a class would only copy the reference. List<Foo> is a reference type; ArraySegment<Foo> is itself a struct, but it only holds a reference to the underlying array plus an offset and a count, so copying it is cheap.
In .NET, especially as a newcomer to the platform, you should strongly prefer class over struct most of the time, and it's not just about copying the full object vs copying the reference. There are some other subtle semantic differences that even I admittedly don't fully understand. Just remember that class > struct until you have a good empirical reason to change your mind.
Value types are copied by value, hence the name. So then we must consider at what times a copy must be made of a value. This comes down to analyzing correctly when a particular entity refers to a variable, or a value. If it refers to a value then that value was copied from somewhere. If it refers to a variable then it's just a variable, and can be treated like any other variable.
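The difference between a variable and a copied value can be sketched in a few lines. This is only an illustration: the two-field Foo and the CopyDemo class are assumptions made for the example, not code from the question.

```csharp
using System;

// Assumed two-field version of Foo, for illustration only.
struct Foo
{
    public int A;
    public int B;
}

static class CopyDemo
{
    // Assigning a struct to a new variable copies the whole value,
    // so mutating the copy leaves the original untouched.
    public static int OriginalAfterMutatingCopy()
    {
        Foo original = new Foo();
        original.A = 1;       // 'original' is a variable: the field is written in place

        Foo copy = original;  // the full value is copied here
        copy.A = 2;           // changes only the copy

        return original.A;    // still 1
    }

    static void Main()
    {
        Console.WriteLine(OriginalAfterMutatingCopy()); // prints 1
    }
}
```

Had Foo been a class, the assignment would have copied only the reference and the method would return 2.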
Suppose we have a struct Foo with two public int fields, A and B.
Ignore for the moment the design flaws here; public fields are a bad code smell, as are mutable structs.
If you say
Foo f = new Foo();
what happens? The spec says:
- f is created.
- temp is created.
- temp is filled in with eight bytes of zeros.
- temp is copied to f.
But that is not what actually happens; the compiler and runtime are smart enough to notice that there is no observable difference between the required workflow and the workflow "create f and fill it with zeros", so that happens. This is a copy elision optimization.
EXERCISE: devise a program in which the compiler cannot copy-elide, and the output makes it clear that the compiler does not perform a copy elision when initializing a variable of struct type.
Now if you say
f.A = 123;
then f is evaluated to produce a variable (not a value), and then from that A is evaluated to produce a variable, and four bytes are written to that variable.
If you say
int x = f.A;
then f is evaluated as a variable, A is evaluated as a variable, and the value of A is written to x.
If you declare an array variable fs and initialize it with a newly created array of Foo, then variable fs is allocated, the array is allocated and initialized with zeros, and the reference to the array is copied to fs.
When you say
fs[0].A = 123;
same as before: fs[0] is evaluated as a variable, so A is a variable, so 123 is copied to that variable.
When you say
int x = fs[0].A;
same as before: we evaluate fs[0] as a variable, fetch from that variable the value of A, and copy it.
But if you say
list[0].A = 123;
where list is a List<Foo>, then you will get a compiler error, because list[0] is a value, not a variable. You can't change it.
If you say
int x = list[0].A;
then list[0] is evaluated as a value (a copy of the value stored in the list is made), and then a copy of A is made in x. So there is an extra copy here.
EXERCISE: Write a program that illustrates that list[0] is a copy of the value stored in the list.
It is for this reason that you should (1) not make big structs, and (2) make them immutable. Structs get copied by value, which can be expensive, and values are not variables, so it is hard to mutate them.
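One possible way to approach the second exercise is sketched here. It contrasts an array element, which behaves as a variable, with a List<Foo> element, which is handed back as a copy. The class and method names are made up for the example, and Foo is assumed to have just the two int fields.

```csharp
using System;
using System.Collections.Generic;

// Assumed two-field version of Foo, for illustration only.
struct Foo
{
    public int A;
    public int B;
}

static class ListCopyDemo
{
    // fs[0] is a variable, so the write lands in the array itself.
    public static int ArrayElementAfterWrite()
    {
        Foo[] fs = new Foo[1];
        fs[0].A = 123;
        return fs[0].A;          // 123
    }

    // list[0] is a value: mutating the copy does not touch the list.
    public static int ListElementAfterMutatingCopy()
    {
        var list = new List<Foo> { new Foo() };
        // list[0].A = 123;      // does not compile: error CS1612
        Foo copy = list[0];      // copy of the stored value
        copy.A = 123;
        return list[0].A;        // still 0
    }

    // To change the stored value, the whole struct must be written back.
    public static int ListElementAfterWriteBack()
    {
        var list = new List<Foo> { new Foo() };
        Foo copy = list[0];
        copy.A = 123;
        list[0] = copy;          // calls the indexer's setter
        return list[0].A;        // 123
    }

    static void Main()
    {
        Console.WriteLine(ArrayElementAfterWrite());        // prints 123
        Console.WriteLine(ListElementAfterMutatingCopy());  // prints 0
        Console.WriteLine(ListElementAfterWriteBack());     // prints 123
    }
}
```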
Yes. Arrays are very special types that are built deeply into the runtime and have been since version 1.
The key feature here is that an array indexer logically produces an alias to the variable contained in the array; that alias can then be used as the variable itself.
All other indexers are actually pairs of get/set methods, where the get returns a value, not a variable.
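The same distinction applies when an array is viewed through IList<Foo> or wrapped in an ArraySegment<Foo>. The sketch below is an illustration under assumptions (a two-field Foo, made-up class and method names): the interface indexer returns a copy, whereas indexing the segment's underlying array through its Array and Offset properties still reaches the element as a variable.

```csharp
using System;
using System.Collections.Generic;

// Assumed two-field version of Foo, for illustration only.
struct Foo
{
    public int A;
    public int B;
}

static class InterfaceViewDemo
{
    // Through IList<Foo>, the indexer is a get method that returns a value.
    public static int ThroughIList()
    {
        Foo[] arr = new Foo[1];
        IList<Foo> view = arr;   // same array, seen through the interface
        Foo copy = view[0];      // a copy of the element
        copy.A = 5;
        return arr[0].A;         // still 0
    }

    // ArraySegment<Foo> exposes the underlying array, whose indexer
    // produces an alias to the element.
    public static int ThroughArraySegment()
    {
        Foo[] arr = new Foo[1];
        var segment = new ArraySegment<Foo>(arr);
        segment.Array[segment.Offset].A = 7;  // writes into arr[0] directly
        return arr[0].A;         // 7
    }

    static void Main()
    {
        Console.WriteLine(ThroughIList());         // prints 0
        Console.WriteLine(ThroughArraySegment());  // prints 7
    }
}
```

With nullable reference types enabled, segment.Array is annotated as possibly null and the compiler may warn on that line; the behaviour is unchanged.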
Before C# 7, not in C#. You could do it in IL, but of course then C# wouldn't know what to do with the returned alias.
C# 7 adds the ability for methods to return aliases to variables: ref returns. Remember, ref (and out) parameters take variables as their operands and cause the callee to have an alias to that variable. C# 7 adds the ability to do this to locals and returns as well.
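A minimal sketch of ref returns and ref locals, assuming C# 7.0 or later. FooBuffer is a hypothetical wrapper type invented for this example, and Foo is assumed to have just the two int fields.

```csharp
using System;

// Assumed two-field version of Foo, for illustration only.
struct Foo
{
    public int A;
    public int B;
}

// Hypothetical collection whose indexer returns an alias to the stored
// element rather than a copy of it.
class FooBuffer
{
    private readonly Foo[] items;

    public FooBuffer(int length)
    {
        items = new Foo[length];
    }

    // ref return: the caller gets the variable items[index] itself.
    public ref Foo this[int index] => ref items[index];
}

static class RefReturnDemo
{
    public static string Run()
    {
        var buffer = new FooBuffer(4);

        buffer[0].A = 123;               // legal: buffer[0] is a variable here

        ref Foo first = ref buffer[0];   // ref local: another alias, no copy
        first.B = 456;

        return buffer[0].A + " " + buffer[0].B;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints 123 456
    }
}
```

Unlike the List<Foo> case, the assignment to buffer[0].A compiles, because the indexer produces a variable instead of a value.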