ReadOnlyCollection vs Liskov - How to correctly mo

Posted 2019-04-18 12:41

Question:

The Liskov substitution principle requires that subtypes satisfy the contracts of their supertypes. In my understanding, this would entail that ReadOnlyCollection<T> violates Liskov. ICollection<T>'s contract exposes Add and Remove operations, but the read-only subtype does not satisfy this contract. For example,

IList<object> collection = new List<object>();
collection = new System.Collections.ObjectModel.ReadOnlyCollection<object>(collection);
collection.Add(new object());

    -- throws NotSupportedException

There is clearly a need for immutable collections. Is there something broken about .NET's way of modeling them? What would be a better way to do it? IEnumerable<T> does a good job of exposing a collection while, at least, appearing to be immutable. However, the semantics are very different, primarily because IEnumerable<T> doesn't explicitly expose any of the collection's state.

In my particular case, I am trying to build an immutable DAG class to support an FSM. I will obviously need AddNode / AddEdge methods at the beginning but I don't want it to be possible to change the state machine once it is already running. I'm having difficulty representing the similarity between the immutable and mutable representations of the DAG.

Right now, my design involves using a DAG Builder up front, and then creating the immutable graph once, at which point it is no longer editable. The only common interface between the Builder and the concrete immutable DAG is an Accept(IVisitor visitor). I'm concerned that this may be over-engineered / too abstract in the face of possibly simpler options. At the same time, I'm having trouble accepting that I can expose methods on my graph interface that may throw NotSupportedException if the client gets a particular implementation. What is the right way to handle this?

Answer 1:

You could always have a (read-only) graph interface, and extend it with a read/write modifiable-graph interface:

public interface IDirectedAcyclicGraph
{
    int GetNodeCount();
    bool GetConnected(int from, int to);
}

public interface IModifiableDAG : IDirectedAcyclicGraph
{
    void SetNodeCount(int nodeCount);
    void SetConnected(int from, int to, bool connected);
}

(I can't figure out how to split these methods into get/set halves of a property.)

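// Rubbish implementation
using System.Collections.Generic;

public class ConcreteModifiableDAG : IModifiableDAG
{
    private int nodeCount;
    // Adjacency map: connections[from][to] is true when an edge exists.
    private readonly Dictionary<int, Dictionary<int, bool>> connections =
        new Dictionary<int, Dictionary<int, bool>>();

    public void SetNodeCount(int nodeCount) {
        this.nodeCount = nodeCount;
    }

    public void SetConnected(int from, int to, bool connected) {
        // Create the inner map on first use so the indexer does not throw.
        if (!connections.TryGetValue(from, out var targets)) {
            targets = new Dictionary<int, bool>();
            connections[from] = targets;
        }
        targets[to] = connected;
    }

    public int GetNodeCount() {
        return nodeCount;
    }

    public bool GetConnected(int from, int to) {
        // A missing entry means "not connected".
        return connections.TryGetValue(from, out var targets)
            && targets.TryGetValue(to, out var connected)
            && connected;
    }
}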

// Create graph
IModifiableDAG mdag = new ConcreteModifiableDAG();
mdag.SetNodeCount(5);
mdag.SetConnected(1, 5, true);

// Pass fixed graph
IDirectedAcyclicGraph dag = mdag; // implicit upcast, no explicit cast needed
dag.SetNodeCount(5);          // Compile error: not a member of IDirectedAcyclicGraph
dag.SetConnected(1, 5, true); // Compile error: not a member of IDirectedAcyclicGraph

This is what I wish Microsoft had done with their read-only collection classes - made one interface for get-count, get-by-index behaviour etc., and extend it with an interface to add, change values etc.
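For what it's worth, .NET 4.5 and later do ship interfaces along these lines: IReadOnlyCollection<T> and IReadOnlyList<T>, both of which List<T> implements. A minimal sketch of the same read/write split using those built-in types (the ReadOnlyDemo class and its Sum and Run methods are made up for illustration):

```csharp
using System.Collections.Generic;

public static class ReadOnlyDemo
{
    // Accepts only the read side: Count and the indexer getter are available.
    // Add and Remove are not members of IReadOnlyList<T>, so calling them
    // on `values` would not compile.
    public static int Sum(IReadOnlyList<int> values)
    {
        int total = 0;
        for (int i = 0; i < values.Count; i++)
        {
            total += values[i];
        }
        return total;
    }

    public static int Run()
    {
        var numbers = new List<int> { 1, 2, 3 };
        numbers.Add(4);      // the owner still holds the mutable view
        return Sum(numbers); // List<T> converts implicitly to IReadOnlyList<T>
    }
}
```

As with the DAG interfaces above, this is only a read-only view: the caller can still cast back to List<int>, so it restricts the contract rather than guaranteeing immutability.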



Answer 2:

I don't think that your current solution with the builder is overengineered.

It solves two problems:

  1. Violation of LSP
    You have an editable interface whose implementations will never throw NotSupportedExceptions on AddNode / AddEdge and you have a non-editable interface that doesn't have these methods at all.

  2. Temporal coupling
    If you went with one interface instead of two, that one interface would need to somehow support both the "initialization phase" and the "immutable phase", most likely through methods that mark the start and possibly the end of those phases.
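To make the two points concrete, here is a rough sketch of the builder split. The names DagBuilder, ImmutableDag, and HasEdge, and the use of string node names, are assumptions for illustration and not taken from the question's code; cycle checking is left out to keep it short.

```csharp
using System.Collections.Generic;

// Read-only graph: it has no AddNode/AddEdge at all, so there is
// nothing that could throw NotSupportedException.
public sealed class ImmutableDag
{
    private readonly Dictionary<string, HashSet<string>> edges;

    internal ImmutableDag(Dictionary<string, HashSet<string>> edges)
    {
        this.edges = edges;
    }

    public int NodeCount { get { return edges.Count; } }

    public bool HasEdge(string from, string to)
    {
        HashSet<string> targets;
        return edges.TryGetValue(from, out targets) && targets.Contains(to);
    }
}

// Mutable phase: all editing happens here, before Build() is called.
public sealed class DagBuilder
{
    private readonly Dictionary<string, HashSet<string>> edges =
        new Dictionary<string, HashSet<string>>();

    public DagBuilder AddNode(string node)
    {
        if (!edges.ContainsKey(node)) edges[node] = new HashSet<string>();
        return this;
    }

    public DagBuilder AddEdge(string from, string to)
    {
        AddNode(from);
        AddNode(to);
        edges[from].Add(to);
        return this;
    }

    // Copies the state so later builder calls cannot affect the built graph.
    public ImmutableDag Build()
    {
        var copy = new Dictionary<string, HashSet<string>>();
        foreach (var pair in edges)
        {
            copy[pair.Key] = new HashSet<string>(pair.Value);
        }
        return new ImmutableDag(copy);
    }
}
```

Because Build() hands out a separate type, the "initialization phase" ends at a point the compiler can see, rather than at a runtime flag the caller has to remember.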



Answer 3:

Read-only collections in .NET do not go against LSP.

You seem bothered by the read only collection throwing a not supported exception if the add method is called, but there is nothing exceptional about it.

A lot of classes represent domain objects that can be in one of several states, and not every operation is valid in all states: streams can only be opened once, windows cannot be shown after they are disposed, etc.

Throwing exceptions in those cases is valid as long as there is a way to test the current state and avoid the exceptions.

The .NET collections were engineered to support two states: read-only and read/write. That is why the IsReadOnly property is present. It allows callers to test the state of the collection and avoid exceptions.

LSP requires subtypes to honor the contract of the super type, but a contract is more than just a list of methods; it is a list of inputs and expected behavior based on the state of the object:

"If you give me this input, when I'm in this state expect this to happen."

ReadOnlyCollection fully honors the contract of ICollection by throwing a not supported exception when the state of the collection is read only. See the exceptions section in the ICollection documentation.
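A small sketch of that "test the state first" pattern, using ICollection<T>.IsReadOnly. The CollectionStateDemo class and its TryAdd and Run helpers are hypothetical names, not part of the framework:

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class CollectionStateDemo
{
    // Adds the item only when the collection reports that it is writable.
    public static bool TryAdd<T>(ICollection<T> collection, T item)
    {
        if (collection.IsReadOnly)
        {
            return false; // calling Add here would throw NotSupportedException
        }
        collection.Add(item);
        return true;
    }

    // Tries the helper against a writable list and a read-only wrapper.
    public static bool[] Run()
    {
        var list = new List<int>();
        var readOnly = new ReadOnlyCollection<int>(list);
        return new[] { TryAdd(list, 1), TryAdd(readOnly, 2) };
    }
}
```

The first call succeeds and the second is refused without any exception being thrown, which is the state check the contract expects callers to make.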



Answer 4:

You can use explicit interface implementations to separate your modification methods from the operations needed in the read-only version. In addition, give the read-only interface a method that takes a delegate as an argument and passes it the modifiable view. This allows you to isolate the building of the DAC from the navigation and querying. See the code below and its comments:

// your read-only operations and the
// method that allows for building
public interface IDac<T>
{
    IDac<T> Build(Action<IModifiableDac<T>> f);
    // other navigation methods
    void DoStuff();
}

// modifiable operations, its still an IDac<T>
public interface IModifiableDac<T> : IDac<T>
{
    void AddEdge(T item);
    IModifiableDac<T> CreateChildNode();
}

// implementation explicitly implements IModifiableDac<T> so
// accidental calling of modification methods won't happen
// (an explicit cast to IModifiableDac<T> is required)
public class Dac<T> : IDac<T>, IModifiableDac<T>
{
    public IDac<T> Build(Action<IModifiableDac<T>> f)
    {
        f(this);
        return this;
    }

    void IModifiableDac<T>.AddEdge(T item)
    {
        throw new NotImplementedException();
    }

    // explicit as well, so it is only reachable through IModifiableDac<T>
    IModifiableDac<T> IModifiableDac<T>.CreateChildNode() {
        // create the child, add it, and return it
        throw new NotImplementedException();
    }

    public void DoStuff() { }
}

public class DacConsumer
{
    public void Foo()
    {
        var dac = new Dac<int>();
        // build your graph
        var newDac = dac.Build(m => {
            m.AddEdge(1);
            var node = m.CreateChildNode();
            node.AddEdge(2);
            //etc.
        });

        // now do what ever you want, IDac<T> does not have modification methods
        newDac.DoStuff();
    }
}

With this code, the only way for the user to get a modifiable version is to call Build(Action<IModifiableDac<T>> f), and that call returns the read-only IDac<T>. There is no way to access the object as IModifiableDac<T> without an intentional explicit cast, which isn't part of the contract for your object.



Answer 5:

The way I like it (but maybe that's just me) is to have the reading methods in an interface and the editing methods in the class itself. For your DAG, it is highly unlikely that you will have multiple implementations of the data structure, so having an interface to edit the graph is overkill and usually not very pretty.

I find having the class representing the data-structure and the interface being the reading structure pretty clean.

For instance:

public interface IDAG<out T>
{
    // interface members take no access modifier; they are implicitly public
    int NodeCount { get; }
    bool AreConnected(int from, int to);
    T GetItem(int node);
}

public class DAG<T> : IDAG<T>
{
    public void SetCount(...) {...}
    public void SetEdge(...) {...}
    public int NodeCount { get {...} }
    public bool AreConnected(...) {...}
    public T GetItem(...) {...}
}

Then, when you require editing the structure, you pass the class, if you just need the readonly structure, pass the interface. It's a fake 'read-only' because you can always cast as the class, but read-only is never real anyway...

This allows you to have more complex reading structure. As in Linq, you can then extend your reading structure with extension methods defined on the interface. For instance:

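public static class IDAGExtensions
{
    // Extension methods must declare their own type parameters.
    public static List<T> FindPathBetween<T>(this IDAG<T> dag, int from, int to)
    {
        // Use backtracking to determine if a path exists between `from` and `to`
        throw new System.NotImplementedException();
    }

    public static IDAG<U> Cast<T, U>(this IDAG<T> dag)
    {
        // Create a wrapper for the DAG class that casts all T outputs as U
        throw new System.NotImplementedException();
    }
}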

This is extremely useful to separate the definition of the datastructure from 'what you can do with it'.

The other thing that this structure allows is declaring the generic type as out T. That makes the interface covariant: for reference types, an IDAG<string> can be passed wherever an IDAG<object> is expected.
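A small sketch of what out T buys you. The types here (IReadOnlyGraph<out T>, Graph<T>, VarianceDemo) are a cut-down, hypothetical variant of the interface and class above, kept separate so the snippet compiles on its own:

```csharp
using System.Collections.Generic;

public interface IReadOnlyGraph<out T>
{
    int NodeCount { get; }
    T GetItem(int node); // T appears only in output position, so "out T" is legal
}

public class Graph<T> : IReadOnlyGraph<T>
{
    private readonly List<T> items = new List<T>();

    // Editing lives on the class only, not on the interface.
    public void AddNode(T item) { items.Add(item); }

    public int NodeCount { get { return items.Count; } }
    public T GetItem(int node) { return items[node]; }
}

public static class VarianceDemo
{
    // Written once against object, usable with any reference-type graph.
    public static object FirstItem(IReadOnlyGraph<object> graph)
    {
        return graph.GetItem(0);
    }

    public static object Run()
    {
        var graph = new Graph<string>();
        graph.AddNode("start");
        // Graph<string> is accepted where IReadOnlyGraph<object> is expected.
        return FirstItem(graph);
    }
}
```

The conversion only works because the editing methods stay off the interface: a method taking T as a parameter would make the out annotation a compile error.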



Answer 6:

I like the idea of designing my data structures immutable in the first place. Sometimes it's just not feasible but there's a way to accomplish this quite often.

For your DAG you most probably have some data structure in a file or a user interface and you could pass all the nodes and edges as IEnumerables to your immutable DAG class' constructor. Then you can use the Linq methods to transform your source data to nodes and edges.

The constructor (or a factory method) can then build the class's private structures in a way that's efficient for your algorithm and do upfront data validation, such as checking for acyclicity.

This solution differs from the builder pattern in that iterative construction of the data structure is not possible, but often that's not really required.
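A minimal sketch of such a constructor, assuming integer node ids and edges given as Tuple<int, int> pairs. The ImmutableGraph name and its members are made up for illustration, and the cycle check uses Kahn's algorithm, which is just one way to validate acyclicity:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class ImmutableGraph
{
    private readonly Dictionary<int, int[]> successors;

    public ImmutableGraph(IEnumerable<int> nodes, IEnumerable<Tuple<int, int>> edges)
    {
        var edgeList = edges.ToList();
        var nodeSet = new HashSet<int>(nodes);
        foreach (var e in edgeList)
        {
            if (!nodeSet.Contains(e.Item1) || !nodeSet.Contains(e.Item2))
                throw new ArgumentException("Edge refers to an unknown node.");
        }

        // Private copy of the data; the caller's sequences are never stored.
        successors = nodeSet.ToDictionary(
            n => n,
            n => edgeList.Where(e => e.Item1 == n).Select(e => e.Item2).Distinct().ToArray());

        if (HasCycle())
            throw new ArgumentException("The edges contain a cycle.");
    }

    public int NodeCount { get { return successors.Count; } }

    public bool AreConnected(int from, int to)
    {
        int[] targets;
        return successors.TryGetValue(from, out targets) && targets.Contains(to);
    }

    // Kahn's algorithm: repeatedly remove nodes with no incoming edges.
    // If some nodes can never be removed, they lie on a cycle.
    private bool HasCycle()
    {
        var inDegree = successors.Keys.ToDictionary(n => n, n => 0);
        foreach (var target in successors.Values.SelectMany(t => t)) inDegree[target]++;

        var ready = new Queue<int>(inDegree.Where(p => p.Value == 0).Select(p => p.Key));
        int removed = 0;
        while (ready.Count > 0)
        {
            int node = ready.Dequeue();
            removed++;
            foreach (int next in successors[node])
            {
                if (--inDegree[next] == 0) ready.Enqueue(next);
            }
        }
        return removed != successors.Count;
    }
}
```

Since the class has no mutating members and validates in the constructor, an instance that exists is known to be acyclic for its whole lifetime.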

Personally, I don't like the solutions with separate interfaces for read and read/write access implemented by the same class, because the write functionality is not really hidden: casting the instance to the read/write interface exposes the mutating methods. The better solution in such a scenario is to have an AsReadOnly method that creates a truly immutable data structure by copying the data.