A definitive guide to API-breaking changes in .NET

Posted 2019-01-01 14:07

Question:

I would like to gather as much information as possible regarding API versioning in .NET/CLR, and specifically how API changes do or do not break client applications. First, let's define some terms:

API change - a change in the publicly visible definition of a type, including any of its public members. This includes changing type and member names, changing base type of a type, adding/removing interfaces from list of implemented interfaces of a type, adding/removing members (including overloads), changing member visibility, renaming method and type parameters, adding default values for method parameters, adding/removing attributes on types and members, and adding/removing generic type parameters on types and members (did I miss anything?). This does not include any changes in member bodies, or any changes to private members (i.e. we do not take into account Reflection).

Binary-level break - an API change that results in client assemblies compiled against an older version of the API potentially not loading with the new version. Example: changing a method signature, even if the method can still be called in the same way as before (e.g. changing void to a return type, or replacing overloads with parameter default values).

Source-level break - an API change that results in existing code written to compile against an older version of the API potentially not compiling with the new version. Already compiled client assemblies work as before, however. Example: adding a new overload that can result in ambiguity in method calls that were previously unambiguous.

Source-level quiet semantics change - an API change that results in existing code written to compile against an older version of the API quietly changing its semantics, e.g. by calling a different method. The code should however continue to compile with no warnings/errors, and previously compiled assemblies should work as before. Example: implementing a new interface on an existing class that results in a different overload being chosen during overload resolution.

The ultimate goal is to catalog as many breaking and quiet semantics API changes as possible, and describe the exact effect of the breakage, and which languages are and are not affected by it. To expand on the latter: while some changes affect all languages universally (e.g. adding a new member to an interface will break implementations of that interface in any language), some require very specific language semantics to enter into play to get a break. This most typically involves method overloading, and, in general, anything having to do with implicit type conversions. There doesn't seem to be any way to define the "least common denominator" here even for CLS-conformant languages (i.e. those conforming at least to the rules of "CLS consumer" as defined in the CLI spec) - though I'll appreciate it if someone corrects me as being wrong here - so this will have to go language by language. Those of most interest are naturally the ones that come with .NET out of the box: C#, VB and F#; but others, such as IronPython, IronRuby, Delphi Prism etc. are also relevant. The more of a corner case it is, the more interesting it will be - things like removing members are pretty self-evident, but subtle interactions between e.g. method overloading, optional/default parameters, lambda type inference, and conversion operators can be very surprising at times.

A few examples to kickstart this:

Adding new method overloads

Kind: source-level break

Languages affected: C#, VB, F#

API before change:

public class Foo
{
    public void Bar(IEnumerable x);
}

API after change:

public class Foo
{
    public void Bar(IEnumerable x);
    public void Bar(ICloneable x);
}

Sample client code working before change and broken after it:

new Foo().Bar(new int[0]);
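
A minimal sketch of the ambiguity and of one way a client can resolve it, assuming the library is already in its post-change state. The string return values and the OverloadDemo class are illustrative additions so that the chosen overload can be observed; they are not part of the API above.

```csharp
using System;
using System.Collections;

// Library after the change (return values added so the chosen overload is observable).
public class Foo
{
    public string Bar(IEnumerable x) => "IEnumerable";
    public string Bar(ICloneable x) => "ICloneable";
}

public static class OverloadDemo
{
    // new Foo().Bar(new int[0]) no longer compiles: int[] converts to both
    // IEnumerable and ICloneable, and neither conversion is better (error CS0121).
    // An explicit cast tells the compiler which overload the client means.
    public static string CallEnumerableOverload() => new Foo().Bar((IEnumerable)new int[0]);

    public static string CallCloneableOverload() => new Foo().Bar((ICloneable)new int[0]);

    public static void Main()
    {
        Console.WriteLine(CallEnumerableOverload()); // IEnumerable
        Console.WriteLine(CallCloneableOverload());  // ICloneable
    }
}
```

The cast is a source-level fix only; assemblies compiled before the change keep calling Bar(IEnumerable) without modification.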

Adding new implicit conversion operator overloads

Kind: source-level break.

Languages affected: C#, VB

Languages not affected: F#

API before change:

public class Foo
{
    public static implicit operator int(Foo foo);
}

API after change:

public class Foo
{
    public static implicit operator int(Foo foo);
    public static implicit operator float(Foo foo);
}

Sample client code working before change and broken after it:

void Bar(int x);
void Bar(float x);
Bar(new Foo());

Notes: F# is not broken, because it does not have any language-level support for user-defined conversion operators, neither explicit nor implicit - both have to be called directly as op_Explicit and op_Implicit methods.

Adding new instance methods

Kind: source-level quiet semantics change.

Languages affected: C#, VB

Languages not affected: F#

API before change:

public class Foo
{
}

API after change:

public class Foo
{
    public void Bar();
}

Sample client code that suffers a quiet semantics change:

public static class FooExtensions
{
    public static void Bar(this Foo foo);
}

new Foo().Bar();

Notes: F# is not broken, because it does not have language level support for ExtensionMethodAttribute, and requires CLS extension methods to be called as static methods.
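
The quiet switch can be reproduced in a single program by modelling the two library versions as two types. This is only a sketch: FooV1 and FooV2 are hypothetical stand-ins for Foo before and after the change, and the methods return strings (instead of void) so that the binding is visible.

```csharp
using System;

public class FooV1 { }                                      // before: no Bar()
public class FooV2 { public string Bar() => "instance"; }   // after: Bar() added

// Client-side extension methods, identical for both library versions.
public static class FooExtensions
{
    public static string Bar(this FooV1 foo) => "extension";
    public static string Bar(this FooV2 foo) => "extension";
}

public static class ExtensionDemo
{
    // Only the extension method is applicable, so it is called.
    public static string Before() => new FooV1().Bar();

    // An applicable instance method always wins over an extension method,
    // so the same call expression now binds to FooV2.Bar() without any warning.
    public static string After() => new FooV2().Bar();

    public static void Main()
    {
        Console.WriteLine(Before()); // extension
        Console.WriteLine(After());  // instance
    }
}
```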

Answer 1:

Changing a method signature

Kind: Binary-level Break

Languages affected: C# (VB and F# most likely, but untested)

API before change

public static class Foo
{
    public static void bar(int i);
}

API after change

public static class Foo
{
    public static bool bar(int i);
}

Sample client code working before change

Foo.bar(13);


Answer 2:

Adding a parameter with a default value.

Kind of Break: Binary-level break

Even if the calling source code doesn't need to change, it still needs to be recompiled (just like when adding a regular parameter).

That is because C# compiles the default values of the parameters directly into the calling assembly. It means that if you don't recompile, you will get a MissingMethodException because the old assembly tries to call a method with fewer arguments.

API Before Change

public void Foo(int a) { }

API After Change

public void Foo(int a, string b = null) { }

Sample client code that is broken afterwards

Foo(5);

The client code needs to be recompiled into Foo(5, null) at the bytecode level. The called assembly will only contain Foo(int, string), not Foo(int). That's because default parameter values are essentially a compiler feature: the default is recorded in metadata, but the .NET runtime does not apply it at a call site, so the compiler has to emit the argument explicitly. (This also explains why default values have to be compile-time constants in C#.)
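
A small reflection sketch that makes this visible, assuming the post-change method lives in a hypothetical Api class. It only inspects metadata; it does not reproduce the MissingMethodException itself, which needs two separately compiled assemblies.

```csharp
using System;
using System.Reflection;

// Library after the change (Api is a hypothetical host class).
public class Api
{
    public void Foo(int a, string b = null) { }
}

public static class DefaultParameterDemo
{
    // An old caller was compiled against Foo(int); that exact signature is gone.
    public static bool OldSignatureExists() =>
        typeof(Api).GetMethod("Foo", new[] { typeof(int) }) != null;

    // The default value is stored as metadata on the parameter for compilers to read.
    public static bool DefaultIsInMetadata()
    {
        MethodInfo method = typeof(Api).GetMethod("Foo", new[] { typeof(int), typeof(string) });
        ParameterInfo b = method.GetParameters()[1];
        return b.HasDefaultValue && b.DefaultValue == null;
    }

    public static void Main()
    {
        Console.WriteLine(OldSignatureExists());  // False
        Console.WriteLine(DefaultIsInMetadata()); // True
    }
}
```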



Answer 3:

This one was very non-obvious when I discovered it, especially in light of the difference with the same situation for interfaces. It's not a break at all, but it's surprising enough that I decided to include it:

Refactoring class members into a base class

Kind: not a break!

Languages affected: none (i.e. none are broken)

API before change:

class Foo
{
    public virtual void Bar() {}
    public virtual void Baz() {}
}

API after change:

class FooBase
{
    public virtual void Bar() {}
}

class Foo : FooBase
{
    public virtual void Baz() {}
}

Sample code that keeps working throughout the change (even though I expected it to break):

// C++/CLI
ref class Derived : Foo
{
public:
   virtual void Baz() override {}

   // Explicit override
   virtual void BarOverride() = Foo::Bar {}
};

Notes:

C++/CLI is the only .NET language that has a construct analogous to explicit interface implementation for virtual base class members - "explicit override". I fully expected that to result in the same kind of breakage as when moving interface members to a base interface (since IL generated for explicit override is the same as for explicit implementation). To my surprise, this is not the case - even though generated IL still specifies that BarOverride overrides Foo::Bar rather than FooBase::Bar, assembly loader is smart enough to substitute one for another correctly without any complaints - apparently, the fact that Foo is a class is what makes the difference. Go figure...



Answer 4:

This one is a perhaps not-so-obvious special case of "adding/removing interface members", and I figured it deserves its own entry in light of another case which I'm going to post next. So:

Refactoring interface members into a base interface

Kind: breaks at both source and binary levels

Languages affected: C#, VB, C++/CLI, F# (for source break; binary one naturally affects any language)

API before change:

interface IFoo
{
    void Bar();
    void Baz();
}

API after change:

interface IFooBase 
{
    void Bar();
}

interface IFoo : IFooBase
{
    void Baz();
}

Sample client code that is broken by change at source level:

class Foo : IFoo
{
   void IFoo.Bar() { ... }
   void IFoo.Baz() { ... }
}

Sample client code that is broken by change at binary level:

(new Foo()).Bar();

Notes:

For source level break, the problem is that C#, VB and C++/CLI all require exact interface name in the declaration of interface member implementation; thus, if the member gets moved to a base interface, the code will no longer compile.

Binary break is due to the fact that interface methods are fully qualified in generated IL for explicit implementations, and interface name there must also be exact.

Implicit implementation where available (i.e. C# and C++/CLI, but not VB) will work fine on both source and binary level. Method calls do not break either.



Answer 5:

Reordering enumerated values

Kind of break: Source-level/Binary-level quiet semantics change

Languages affected: all

Reordering enumerated values will keep source-level compatibility as literals have the same name, but their ordinal indices will be updated, which can cause some kinds of silent source-level breaks.

Even worse are the silent binary-level breaks that can be introduced if client code is not recompiled against the new API version. Enum values are compile-time constants and as such any uses of them are baked into the client assembly's IL. This case can be particularly hard to spot at times.

API Before Change

public enum Foo
{
   Bar,
   Baz
}

API After Change

public enum Foo
{
   Baz,
   Bar
}

Sample client code that works but is broken afterwards:

Foo.Bar < Foo.Baz
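
A sketch of both effects in one program, with the two API versions modelled as hypothetical enums FooV1 and FooV2. The BakedBar constant stands in for the integer that an old, non-recompiled client carries in its IL for Foo.Bar.

```csharp
using System;

public enum FooV1 { Bar, Baz } // before: Bar = 0, Baz = 1
public enum FooV2 { Baz, Bar } // after:  Baz = 0, Bar = 1

public static class EnumReorderDemo
{
    // What an old client effectively contains wherever it wrote Foo.Bar: the literal 0.
    public const int BakedBar = (int)FooV1.Bar;

    // The new library interprets that same integer as a different member.
    public static string BakedBarSeenByNewApi() => ((FooV2)BakedBar).ToString();

    // The same source expression changes its result after recompilation.
    public static bool BarLessThanBazBefore() => FooV1.Bar < FooV1.Baz;
    public static bool BarLessThanBazAfter() => FooV2.Bar < FooV2.Baz;

    public static void Main()
    {
        Console.WriteLine(BakedBarSeenByNewApi()); // Baz
        Console.WriteLine(BarLessThanBazBefore()); // True
        Console.WriteLine(BarLessThanBazAfter());  // False
    }
}
```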


Answer 6:

This one is really a very rare thing in practice, but nonetheless a surprising one when it happens.

Adding new non-overloaded members

Kind: source level break or quiet semantics change.

Languages affected: C#, VB

Languages not affected: F#, C++/CLI

API before change:

public class Foo
{
}

API after change:

public class Foo
{
    public void Frob() {}
}

Sample client code that is broken by change:

class Bar
{
    public void Frob() {}
}

class Program
{
    static void Qux(Action<Foo> a)
    {
    }

    static void Qux(Action<Bar> a)
    {
    }

    static void Main()
    {
        Qux(x => x.Frob());        
    }
}

Notes:

The problem here is caused by lambda type inference in C# and VB in presence of overload resolution. A limited form of duck typing is employed here to break ties where more than one type matches, by checking whether the body of the lambda makes sense for a given type - if only one type results in compilable body, that one is chosen.

The danger here is that client code may have an overloaded method group where some methods take arguments of the client's own types, and others take arguments of types exposed by your library. If any of that code then relies on the type inference algorithm to determine the correct method based solely on presence or absence of members, then adding a new member to one of your types with the same name as in one of the client's types can potentially throw inference off, resulting in ambiguity during overload resolution.

Note that types Foo and Bar in this example are not related in any way, not by inheritance nor otherwise. Mere use of them in a single method group is enough to trigger this, and if this occurs in client code, you have no control over it.

The sample code above demonstrates a simpler situation where this is a source-level break (i.e. a compiler error results). However, this can also be a silent semantics change, if the overload that was chosen via inference had other arguments which would otherwise cause it to be ranked below (e.g. optional arguments with default values, or a type mismatch between declared and actual argument requiring an implicit conversion). In such a scenario, the overload resolution will no longer fail, but a different overload will be quietly selected by the compiler. In practice, however, it is very hard to run into this case without carefully constructing method signatures to deliberately cause it.
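
For reference, here is a sketch of the pre-change state that compiles and shows which overload inference picks. The Qux overloads return a string (an addition for illustration) so the selected overload can be printed.

```csharp
using System;

public class Foo { }                            // library type before the change: no Frob()

public class Bar { public void Frob() { } }     // client type

public static class LambdaInferenceDemo
{
    static string Qux(Action<Foo> a) => "Action<Foo>";
    static string Qux(Action<Bar> a) => "Action<Bar>";

    // With x : Foo the lambda body does not compile (Foo has no Frob), so only
    // the Action<Bar> overload is applicable and the call is unambiguous.
    // Once the library adds Foo.Frob(), both overloads apply and this line
    // fails with an ambiguity error (CS0121).
    public static string Resolve() => Qux(x => x.Frob());

    public static void Main()
    {
        Console.WriteLine(Resolve()); // Action<Bar>
    }
}
```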



Answer 7:

Convert an implicit interface implementation into an explicit one.

Kind of Break: Source and Binary

Languages Affected: All

This is really just a variation of changing a method's accessibility - it's just a little more subtle since it's easy to overlook the fact that not all access to an interface's methods is necessarily through a reference to the type of the interface.

API Before Change:

public class Foo : IEnumerable
{
    public IEnumerator GetEnumerator();
}

API After Change:

public class Foo : IEnumerable
{
    IEnumerator IEnumerable.GetEnumerator();
}

Sample Client code that works before change and is broken afterwards:

new Foo().GetEnumerator(); // fails because GetEnumerator() is no longer public


Answer 8:

Convert an explicit interface implementation into an implicit one.

Kind of Break: Source

Languages Affected: All

The refactoring of an explicit interface implementation into an implicit one is more subtle in how it can break an API. On the surface, it would seem that this should be relatively safe, however, when combined with inheritance it can cause problems.

API Before Change:

public class Foo : IEnumerable
{
    IEnumerator IEnumerable.GetEnumerator() { yield return "Foo"; }
}

API After Change:

public class Foo : IEnumerable
{
    public IEnumerator GetEnumerator() { yield return "Foo"; }
}

Sample Client code that works before change and is broken afterwards:

class Bar : Foo, IEnumerable
{
    IEnumerator IEnumerable.GetEnumerator() // silently hides base instance
    { yield return "Bar"; }
}

foreach( var x in new Bar() )
    Console.WriteLine(x);    // originally output "Bar", now outputs "Foo"
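
The reason is that C# foreach prefers a public GetEnumerator method on the static type over the IEnumerable interface. A sketch that puts both versions side by side, using hypothetical FooV1/FooV2 and BarV1/BarV2 pairs in place of the two library versions:

```csharp
using System;
using System.Collections;

public class FooV1 : IEnumerable   // before: explicit implementation
{
    IEnumerator IEnumerable.GetEnumerator() { yield return "Foo"; }
}

public class FooV2 : IEnumerable   // after: implicit (public) implementation
{
    public IEnumerator GetEnumerator() { yield return "Foo"; }
}

// Identical client code against each version.
public class BarV1 : FooV1, IEnumerable
{
    IEnumerator IEnumerable.GetEnumerator() { yield return "Bar"; }
}

public class BarV2 : FooV2, IEnumerable
{
    IEnumerator IEnumerable.GetEnumerator() { yield return "Bar"; }
}

public static class ForeachDemo
{
    static string First(IEnumerable items) { foreach (var x in items) return (string)x; return null; }

    // No public GetEnumerator on BarV1, so foreach goes through IEnumerable.
    public static string Before() { foreach (var x in new BarV1()) return (string)x; return null; }

    // foreach finds the inherited public FooV2.GetEnumerator and uses it instead.
    public static string After() { foreach (var x in new BarV2()) return (string)x; return null; }

    // Through an IEnumerable reference, the client's implementation is still used.
    public static string AfterViaInterface() => First(new BarV2());

    public static void Main()
    {
        Console.WriteLine(Before());            // Bar
        Console.WriteLine(After());             // Foo
        Console.WriteLine(AfterViaInterface()); // Bar
    }
}
```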


Answer 9:

Changing a field to a property

Kind of Break: Binary-level break

Languages Affected: Visual Basic and C#*

Info: When you change a normal field or variable into a property in Visual Basic, any outside code referencing that member in any way will need to be recompiled.

API Before Change:

Public Class Foo    
    Public Shared Bar As String = ""
End Class

API After Change:

Public Class Foo
    Private Shared _Bar As String = ""
    Public Shared Property Bar As String
        Get
            Return _Bar
        End Get
        Set(value As String)
            _Bar = value
        End Set
    End Property
End Class    

Sample client code that works but is broken afterwards :

Foo.Bar = "foobar"


Answer 10:

Namespace Addition

Source-level break / Source-level quiet semantics change

Due to the way namespace resolution works in VB.NET, adding a namespace to a library can cause Visual Basic code that compiled with a previous version of the API to not compile with a new version.

Sample client code:

Imports System
Imports Api.SomeNamespace

Public Class Foo
    Public Sub Bar()
        Dim dr As Data.DataRow
    End Sub
End Class

If a new version of the API adds the namespace Api.SomeNamespace.Data, then the above code will not compile.

It becomes more complicated with project-level namespace imports. If Imports System is omitted from the above code, but the System namespace is imported at the project level, then the code may still result in an error.

However, if the Api includes a class DataRow in its Api.SomeNamespace.Data namespace, then the code will compile but dr will be an instance of System.Data.DataRow when compiled with the old version of the API and Api.SomeNamespace.Data.DataRow when compiled with the new version of the API.

Argument Renaming

Source-level break

Changing the names of arguments is a breaking change in VB.NET from version 7(?) (.NET version 1?) and in C# from version 4 (.NET version 4), because callers can pass arguments by name.

API before change:

namespace SomeNamespace {
    public class Foo {
        public static void Bar(string x) {
           ...
        }
    }
}

API after change:

namespace SomeNamespace {
    public class Foo {
        public static void Bar(string y) {
           ...
        }
    }
}

Sample client code:

Api.SomeNamespace.Foo.Bar(x:"hi"); //C#
Api.SomeNamespace.Foo.Bar(x:="hi") 'VB

Ref Parameters

Source-level break

Adding a method overload with the same signature, except that one parameter is passed by reference instead of by value, will cause VB source that references the API to be unable to resolve the function. Visual Basic has no way(?) to differentiate these methods at the call point unless they have different argument names, so such a change could cause both members to be unusable from VB code.

API before change:

namespace SomeNamespace {
    public class Foo {
        public static void Bar(string x) {
           ...
        }
    }
}

API after change:

namespace SomeNamespace {
    public class Foo {
        public static void Bar(string x) {
           ...
        }
        public static void Bar(ref string x) {
           ...
        }
    }
}

Sample client code:

Api.SomeNamespace.Foo.Bar(str)

Field to Property Change

Binary-level break/Source-level break

Besides the obvious binary-level break, this can cause a source-level break if the member is passed to a method by reference.

API before change:

namespace SomeNamespace {
    public class Foo {
        public int Bar;
    }
}

API after change:

namespace SomeNamespace {
    public class Foo {
        public int Bar { get; set; }
    }
}

Sample client code:

var foo = new Api.SomeNamespace.Foo();
FooBar(ref foo.Bar);
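
A sketch of the source-level break and the usual client-side workaround. FooWithField and FooWithProperty are hypothetical stand-ins for Foo before and after the change, and Increment plays the role of FooBar.

```csharp
using System;

public class FooWithField { public int Bar; }                  // before
public class FooWithProperty { public int Bar { get; set; } }  // after

public static class RefDemo
{
    static void Increment(ref int value) => value++;

    public static int WithField()
    {
        var foo = new FooWithField();
        Increment(ref foo.Bar);      // a field is a variable, so ref is allowed
        return foo.Bar;
    }

    public static int WithProperty()
    {
        var foo = new FooWithProperty();
        // Increment(ref foo.Bar);   // error CS0206: a property cannot be passed by ref
        int copy = foo.Bar;          // workaround: copy out, pass the local, copy back
        Increment(ref copy);
        foo.Bar = copy;
        return foo.Bar;
    }

    public static void Main()
    {
        Console.WriteLine(WithField());    // 1
        Console.WriteLine(WithProperty()); // 1
    }
}
```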


Answer 11:

API change:

  1. Adding the [Obsolete] attribute (you kinda covered this with mentioning attributes; however, this can be a breaking change when using warning-as-error.)

Binary-level break:

  1. Moving a type from one assembly to another
  2. Changing the namespace of a type
  3. Adding a base class type from another assembly.
  4. Adding a new member (even protected) that uses a type from another assembly (Class2) as a generic type argument constraint.

    protected void Something<T>() where T : Class2 { }
    
  5. Changing a child class (Class3) to derive from a type in another assembly when the class is used as a generic type argument constraint for this class.

    protected class Class3 : Class2 { }
    protected void Something<T>() where T : Class3 { }
    

Source-level quiet semantics change:

  1. Adding/removing/changing overrides of Equals(), GetHashCode(), or ToString()

(not sure where these fit)

Deployment changes:

  1. Adding/removing dependencies/references
  2. Updating dependencies to newer versions
  3. Changing the 'target platform' between x86, Itanium, x64, or AnyCPU
  4. Building/testing on a different framework install (i.e. installing 3.5 on a .Net 2.0 box allows API calls that then require .Net 2.0 SP2)

Bootstrap/Configuration changes:

  1. Adding/Removing/Changing custom configuration options (i.e. App.config settings)
  2. With the heavy use of IoC/DI in today's applications, it's sometimes necessary to reconfigure and/or change bootstrapping code for DI-dependent code.

Update:

Sorry, I didn't realize that the only reason this was breaking for me was that I used them in generic type constraints.



Answer 12:

Adding overloads to replace default parameter usage

Kind of break: Source-level quiet semantics change

Because the compiler transforms method calls that omit optional arguments into explicit calls with the default values filled in on the calling side, existing compiled code stays compatible: a method with the correct signature will be found for all previously compiled code.

On the other hand, calls that omit the optional argument are now compiled as calls to the new overload that lacks the optional parameter. Everything still works, but if the called code resides in another assembly, newly compiled callers now depend on the new version of that assembly. Deploying assemblies that call the refactored code without also deploying the assembly the refactored code resides in results in "method not found" exceptions.

API before change

  public int MyMethod(int mandatoryParameter, int optionalParameter = 0)
  {
     return mandatoryParameter + optionalParameter;
  }    

API after change

  public int MyMethod(int mandatoryParameter, int optionalParameter)
  {
     return mandatoryParameter + optionalParameter;
  }

  public int MyMethod(int mandatoryParameter)
  {
     return MyMethod(mandatoryParameter, 0);
  }

Sample code that will still be working

  public int CodeNotDependentToNewVersion()
  {
     return MyMethod(5, 6); 
  }

Sample code that is now dependent on the new version when compiling

  public int CodeDependentToNewVersion()
  {
     return MyMethod(5); 
  }


Answer 13:

Renaming an interface

Kind of Break: Source and Binary

Languages Affected: Most likely all, tested in C#.

API Before Change:

public interface IFoo
{
    void Test();
}

public class Bar
{
    public IFoo GetFoo() { return new Foo(); }
}

API After Change:

public interface IFooNew // Of the exact same definition as the (old) IFoo
{
    void Test();
}

public class Bar
{
    public IFooNew GetFoo() { return new Foo(); }
}

Sample client code that works but is broken afterwards:

new Bar().GetFoo().Test(); // Binary only break
IFoo foo = new Bar().GetFoo(); // Source and binary break