When writing a switch statement, there appears to be two limitations on what you can switch on in case statements.
For example (and yes, I know, if you're doing this sort of thing it probably means your object-oriented (OO) architecture is iffy - this is just a contrived example!),
Type t = typeof(int);
switch (t) {
case typeof(int):
Console.WriteLine("int!");
break;
case typeof(string):
Console.WriteLine("string!");
break;
default:
Console.WriteLine("unknown!");
break;
}
Here the switch() statement fails with 'A value of an integral type expected' and the case statements fail with 'A constant value is expected'.
Why are these restrictions in place, and what is the underlying justification? I don't see any reason why the switch statement has to succumb to static analysis only, and why the value being switched on has to be integral (that is, primitive). What is the justification?
I suppose there is no fundamental reason why the compiler couldn't automatically translate your switch statement into:
But there isn't much gained by that.
A case statement on integral types allows the compiler to make a number of optimizations:
There is no duplication (unless you duplicate case labels, which the compiler detects). In your example t could match multiple types due to inheritance. Should the first match be executed? All of them?
The compiler can choose to implement a switch statement over an integral type by a jump table to avoid all the comparisons. If you are switching on an enumeration that has integer values 0 to 100 then it creates an array with 100 pointers in it, one for each switch statement. At runtime it simply looks up the address from the array based on the integer value being switched on. This makes for much better runtime performance than performing 100 comparisons.
Since the language allows the string type to be used in a switch statement I presume the compiler is unable to generate code for a constant time branch implementation for this type and needs to generate an if-then style.
@mweerden - Ah I see. Thanks.
I do not have a lot of experience in C# and .NET but it seems the language designers do not allow static access to the type system except in narrow circumstances. The typeof keyword returns an object so this is accessible at run-time only.
Mostly, those restrictions are in place because of language designers. The underlying justification may be compatibility with languange history, ideals, or simplification of compiler design.
The compiler may (and does) choose to:
The switch statement IS NOT a constant time branch. The compiler may find short-cuts (using hash buckets, etc), but more complicated cases will generate more complicated MSIL code with some cases branching out earlier than others.
To handle the String case, the compiler will end up (at some point) using a.Equals(b) (and possibly a.GetHashCode() ). I think it would be trival for the compiler to use any object that satisfies these constraints.
As for the need for static case expressions... some of those optimisations (hashing, caching, etc) would not be available if the case expressions weren't deterministic. But we've already seen that sometimes the compiler just picks the simplistic if-else-if-else road anyway...
Edit: lomaxx - Your understanding of the "typeof" operator is not correct. The "typeof" operator is used to obtain the System.Type object for a type (nothing to do with its supertypes or interfaces). Checking run-time compatibility of an object with a given type is the "is" operator's job. The use of "typeof" here to express an object is irrelevant.
While on the topic, according to Jeff Atwood, the switch statement is a programming atrocity. Use them sparingly.
You can often accomplish the same task using a table. For example:
I have virtually no knowledge of C#, but I suspect that either switch was simply taken as it occurs in other languages without thinking about making it more general or the developer decided that extending it was not worth it.
Strictly speaking you are absolutely right that there is no reason to put these restrictions on it. One might suspect that the reason is that for the allowed cases the implementation is very efficient (as suggested by Brian Ensink (44921)), but I doubt the implementation is very efficient (w.r.t. if-statements) if I use integers and some random cases (e.g. 345, -4574 and 1234203). And in any case, what is the harm in allowing it for everything (or at least more) and saying that it is only efficient for specific cases (such as (almost) consecutive numbers).
I can, however, imagine that one might want to exclude types because of reasons such as the one given by lomaxx (44918).
Edit: @Henk (44970): If Strings are maximally shared, strings with equal content will be pointers to the same memory location as well. Then, if you can make sure that the strings used in the cases are stored consecutively in memory, you can very efficiently implement the switch (i.e. with execution in the order of 2 compares, an addition and two jumps).
By the way, VB, having the same underlying architecture, allows much more flexible
Select Case
statements (the above code would work in VB) and still produces efficient code where this is possible so the argument by techical constraint has to be considered carefully.