I would like to gather as much information as possible regarding API versioning in .NET/CLR, and specifically how API changes do or do not break client applications. First, let's define some terms:
API change - a change in the publicly visible definition of a type, including any of its public members. This includes changing type and member names, changing base type of a type, adding/removing interfaces from list of implemented interfaces of a type, adding/removing members (including overloads), changing member visibility, renaming method and type parameters, adding default values for method parameters, adding/removing attributes on types and members, and adding/removing generic type parameters on types and members (did I miss anything?). This does not include any changes in member bodies, or any changes to private members (i.e. we do not take into account Reflection).
Binary-level break - an API change that results in client assemblies compiled against an older version of the API potentially not loading with the new version. Example: changing a method signature, even if the method can still be called the same way from source (e.g. changing a void return to a non-void return type, or replacing an overload with a parameter that has a default value).
Source-level break - an API change that results in existing code written to compile against an older version of the API potentially not compiling with the new version. Already compiled client assemblies work as before, however. Example: adding a new overload that can result in ambiguity in method calls that were previously unambiguous.
Source-level quiet semantics change - an API change that results in existing code written to compile against an older version of the API quietly changing its semantics, e.g. by calling a different method. The code should however continue to compile with no warnings/errors, and previously compiled assemblies should work as before. Example: implementing a new interface on an existing class that results in a different overload being chosen during overload resolution.
The ultimate goal is to catalog as many breaking and quiet semantics API changes as possible, and describe the exact effect of the breakage, and which languages are and are not affected by it. To expand on the latter: while some changes affect all languages universally (e.g. adding a new member to an interface will break implementations of that interface in any language), some require very specific language semantics to come into play to get a break. This most typically involves method overloading, and, in general, anything having to do with implicit type conversions. There doesn't seem to be any way to define the "least common denominator" here even for CLS-conformant languages (i.e. those conforming at least to the rules of "CLS consumer" as defined in the CLI spec) - though I'd appreciate it if someone corrects me here - so this will have to go language by language. Those of most interest are naturally the ones that come with .NET out of the box: C#, VB and F#; but others, such as IronPython, IronRuby, Delphi Prism etc. are also relevant. The more of a corner case it is, the more interesting it will be - things like removing members are pretty self-evident, but subtle interactions between e.g. method overloading, optional/default parameters, lambda type inference, and conversion operators can be very surprising at times.
A few examples to kickstart this:
Adding new method overloads
Kind: source-level break
Languages affected: C#, VB, F#
API before change:
public class Foo
{
    public void Bar(IEnumerable x);
}
API after change:
public class Foo
{
    public void Bar(IEnumerable x);
    public void Bar(ICloneable x);
}
Sample client code working before change and broken after it:
new Foo().Bar(new int[0]);
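To make the ambiguity concrete, here is a minimal compilable sketch of the "before" state, with the new overload left commented out. The Program class and the method bodies are my additions for illustration only; the listings above are signatures. Since int[] implements both IEnumerable and ICloneable, and neither interface converts to the other, uncommenting the second overload should make the C# compiler reject the call as ambiguous (error CS0121).

```csharp
using System;
using System.Collections;

public class Foo
{
    public void Bar(IEnumerable x) => Console.WriteLine("Bar(IEnumerable)");

    // Uncommenting this overload makes the call in Main ambiguous.
    // public void Bar(ICloneable x) => Console.WriteLine("Bar(ICloneable)");
}

public static class Program
{
    public static void Main()
    {
        // int[] converts implicitly to both IEnumerable and ICloneable.
        new Foo().Bar(new int[0]);
    }
}
```

A client can protect itself by casting the argument to the intended parameter type, e.g. Bar((IEnumerable)new int[0]).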
Adding new implicit conversion operator overloads
Kind: source-level break.
Languages affected: C#, VB
Languages not affected: F#
API before change:
public class Foo
{
    public static implicit operator int(Foo foo);
}
API after change:
public class Foo
{
    public static implicit operator int(Foo foo);
    public static implicit operator float(Foo foo);
}
Sample client code working before change and broken after it:
void Bar(int x);
void Bar(float x);
Bar(new Foo());
Notes: F# is not broken, because it does not have any language-level support for overloaded operators, neither explicit nor implicit - both have to be called directly as op_Explicit and op_Implicit methods.
Adding new instance methods
Kind: source-level quiet semantics change.
Languages affected: C#, VB
Languages not affected: F#
API before change:
public class Foo
{
}
API after change:
public class Foo
{
    public void Bar();
}
Sample client code that suffers a quiet semantics change:
public static class FooExtensions
{
    public static void Bar(this Foo foo);
}
new Foo().Bar();
Notes: F# is not broken, because it does not have language-level support for ExtensionMethodAttribute, and requires CLS extension methods to be called as static methods.
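A runnable sketch of the same situation in C#, in its "after" state. The method bodies and the Program class are assumptions added so the effect can be observed; only Foo, FooExtensions and Bar come from the example above.

```csharp
using System;

public class Foo
{
    // Added in the new API version; comment it out to get the old behavior.
    public void Bar() => Console.WriteLine("Foo.Bar (instance)");
}

public static class FooExtensions
{
    public static void Bar(this Foo foo) =>
        Console.WriteLine("FooExtensions.Bar (extension)");
}

public static class Program
{
    public static void Main()
    {
        // Instance methods are considered before extension methods,
        // so this call binds to Foo.Bar once that method exists.
        new Foo().Bar();
    }
}
```

The client compiles cleanly against both versions, which is what makes the change quiet: with the instance method commented out the same call binds to the extension method instead.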
API change:
Binary-level break:
Adding a new member (even protected) that uses a type from another assembly (Class2) as a template argument constraint.
Changing a child class (Class3) to derive from a type in another assembly when the class is used as a template argument for this class.
Source-level quiet semantics change:
(not sure where these fit)
Deployment changes:
Bootstrap/Configuration changes:
Update:
Sorry, I didn't realize that the only reason this was breaking for me was that I used them in template constraints.
This one was very non-obvious when I discovered it, especially in light of the difference with the same situation for interfaces. It's not a break at all, but it's surprising enough that I decided to include it:
Refactoring class members into a base class
Kind: not a break!
Languages affected: none (i.e. none are broken)
API before change:
API after change:
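As a rough sketch of the shape of this change in C#: only Foo, FooBase and Bar are named in the notes below, so the second member Baz and the Derived client are hypothetical. The C++/CLI explicit override discussed in the notes has no C# equivalent, so this only illustrates the API before and after, plus an ordinary override that should compile against either version.

```csharp
// Before the change (kept as a comment so the file compiles as one unit):
//
// public class Foo
// {
//     public virtual void Bar() { }
//     public virtual void Baz() { }   // Baz is a hypothetical second member
// }

// After the change: Bar moves up into a new base class.
public class FooBase
{
    public virtual void Bar() { }
}

public class Foo : FooBase
{
    public virtual void Baz() { }
}

// An ordinary client; the override is written the same way in both versions.
public class Derived : Foo
{
    public override void Bar() { }
}
```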
Sample code that keeps working throughout the change (even though I expected it to break):
Notes:
C++/CLI is the only .NET language that has a construct analogous to explicit interface implementation for virtual base class members - "explicit override". I fully expected that to result in the same kind of breakage as when moving interface members to a base interface (since IL generated for explicit override is the same as for explicit implementation). To my surprise, this is not the case - even though generated IL still specifies that BarOverride overrides Foo::Bar rather than FooBase::Bar, assembly loader is smart enough to substitute one for another correctly without any complaints - apparently, the fact that Foo is a class is what makes the difference. Go figure...
This one is a perhaps not-so-obvious special case of "adding/removing interface members", and I figured it deserves its own entry in light of another case which I'm going to post next. So:
Refactoring interface members into a base interface
Kind: breaks at both source and binary levels
Languages affected: C#, VB, C++/CLI, F# (for source break; binary one naturally affects any language)
API before change:
API after change:
Sample client code that is broken by change at source level:
Sample client code that is broken by change at binary level:
Notes:
For source level break, the problem is that C#, VB and C++/CLI all require exact interface name in the declaration of interface member implementation; thus, if the member gets moved to a base interface, the code will no longer compile.
Binary break is due to the fact that interface methods are fully qualified in generated IL for explicit implementations, and interface name there must also be exact.
Implicit implementation where available (i.e. C# and C++/CLI, but not VB) will work fine on both source and binary level. Method calls do not break either.
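A minimal C# sketch of this refactoring, written in its "after" state so that it compiles. The names IFoo, IFooBase, Bar, Baz, Client and ImplicitClient are all assumptions chosen for illustration.

```csharp
// Before the change (kept as a comment so the file compiles as one unit):
//
// public interface IFoo
// {
//     void Bar();
//     void Baz();
// }

// After the change: Bar moves into a base interface.
public interface IFooBase
{
    void Bar();
}

public interface IFoo : IFooBase
{
    void Baz();
}

public class Client : IFoo
{
    // Against the old API this line read "void IFoo.Bar() { }".
    // That no longer compiles, because Bar is now declared on IFooBase
    // and an explicit implementation must name the declaring interface.
    void IFooBase.Bar() { }

    void IFoo.Baz() { }
}

public class ImplicitClient : IFoo
{
    // Implicit implementations compile unchanged against both versions.
    public void Bar() { }
    public void Baz() { }
}
```

In C# the stale explicit implementation is typically reported as error CS0539 (the member is not found in the named interface).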
Convert an explicit interface implementation into an implicit one.
Kind of Break: Source
Languages Affected: All
The refactoring of an explicit interface implementation into an implicit one is more subtle in how it can break an API. On the surface, it would seem that this should be relatively safe; however, when combined with inheritance it can cause problems.
API Before Change:
API After Change:
Sample Client code that works before change and is broken afterwards:
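One scenario that fits this description, sketched in C#. This is an assumption on my part rather than a definitive reproduction: the interface IResettable, its Reset method and the Client class are hypothetical names. The file is in its "after" state and compiles, but with a warning that the old version did not produce.

```csharp
using System;

public interface IResettable   // hypothetical interface
{
    void Reset();
}

public class Foo : IResettable
{
    // Before the change: explicit, so Reset is not a public member of Foo.
    // void IResettable.Reset() => Console.WriteLine("Foo.Reset");

    // After the change: implicit, so Reset is now a public member of Foo.
    public void Reset() => Console.WriteLine("Foo.Reset");
}

public class Client : Foo
{
    // Unrelated to the interface when it was written. After the change
    // it hides the inherited Foo.Reset and the compiler warns (CS0108).
    public void Reset() => Console.WriteLine("Client.Reset");
}
```

Under warnings-as-errors the new warning becomes a build failure. Even without that, which Reset runs now depends on the static type of the reference the call is made through.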
Renaming an interface
Kind of Break: Source and Binary
Languages Affected: Most likely all, tested in C#.
API Before Change:
API After Change:
Sample client code that works but is broken afterwards:
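A small C# sketch of the rename, in its "after" state. IFoo, IRenamedFoo, Bar and Client are hypothetical names used only for illustration.

```csharp
// Before the change:
//
// public interface IFoo
// {
//     void Bar();
// }

// After the change: same members, new name.
public interface IRenamedFoo
{
    void Bar();
}

// A client written against the old API looked like this:
//
//     public class Client : IFoo { public void Bar() { } }
//
// It no longer compiles, and an already compiled copy still refers
// to IFoo in its metadata, so it should fail when the type is loaded.
public class Client : IRenamedFoo
{
    public void Bar() { }
}
```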
Adding a parameter with a default value.
Kind of Break: Binary-level break
Even if the calling source code doesn't need to change, it still needs to be recompiled (just like when adding a regular parameter).
That is because C# compiles the default values of the parameters directly into the calling assembly. It means that if you don't recompile, you will get a MissingMethodException because the old assembly tries to call a method with fewer arguments.
API Before Change
API After Change
Sample client code that is broken afterwards
The client code needs to be recompiled into Foo(5, null) at the bytecode level. The called assembly will only contain Foo(int, string), not Foo(int). That's because default parameter values are purely a language feature; the .NET runtime does not know anything about them. (This also explains why default values have to be compile-time constants in C#.)
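A sketch of what this looks like in C#, based on the Foo(int) and Foo(int, string) signatures mentioned above. The containing class Api, the parameter names and the Program class are assumptions added to make it runnable.

```csharp
using System;

public static class Api   // hypothetical containing class
{
    // Before the change:
    // public static void Foo(int a) => Console.WriteLine(a);

    // After the change: one method with an optional parameter.
    public static void Foo(int a, string b = null) =>
        Console.WriteLine($"{a}, {b ?? "null"}");
}

public static class Program
{
    public static void Main()
    {
        // The source is unchanged, but the compiler now emits a call to
        // Foo(int, string) with the constant null filled in at the call site.
        Api.Foo(5);
    }
}
```

If binary compatibility matters, keeping the original Foo(int) and adding Foo(int, string) as a separate overload avoids the problem, since the old call target still exists.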