Does an IO monad make sense in a language like C#?

Posted 2019-02-08 08:08

Question:

After spending a lot of time reading and thinking, I think I have finally grasped what monads are, how they work, and what they're useful for. My main goal was to figure out if monads were something I could apply to my daily work in C#.

When I started learning about monads, I got the impression that they are magical, and that they somehow make IO and other non-pure functions pure.

I understand the importance of monads for things like LINQ in .Net, and Maybe is very useful for dealing with functions that don't return valid values. And I also appreciate the need to limit statefulness in code and to isolate external dependencies, and I was hoping monads would help with those too.

But I've finally come to the conclusion that monads for IO and handling state are a necessity for Haskell, because Haskell has no other way to do it (otherwise, you couldn't guarantee sequencing, and some calls would be optimized away). But for more mainstream languages, monads are not a good fit for these needs, since most languages already handle state and IO easily.

So, my question is, is it fair to say that the IO monad is really only useful in Haskell? Is there a good reason to implement an IO monad in, say, C#?

Answer 1:

I use Haskell and F# regularly and I've never really felt like using an IO or state monad in F#.

The main reason for me is that in Haskell, you can tell from the type of something that it doesn't use IO or state, and that's a really valuable piece of information.

In F# (and C#) there's no such general expectation on other people's code, and so you won't benefit much from adding that discipline to your own code, and you'll pay some general overhead (mainly syntactic) for sticking to it.

Monads also don't work too well on the .NET platform because of the lack of higher-kinded types: while you can write monadic code in F# with workflow syntax, and in C# with a bit more pain, you can't easily write code that abstracts over multiple different monads.



Answer 2:

At work, we use monads to control IO in our C# code on our most important pieces of business logic. Two examples are our financial code and code that finds solutions to an optimization problem for our customers.

In our financial code, we use a monad to control IO writing to and reading from our database. It essentially consists of a small set of operations and an abstract syntax tree for the monad operations. You could imagine it's something like this (not actual code):

interface IFinancialOperationVisitor<T, out R> : IMonadicActionVisitor<T, R> {
    R GetTransactions(GetTransactions op);
    R PostTransaction(PostTransaction op);
}

interface IFinancialOperation<T> {
    R Accept<R>(IFinancialOperationVisitor<T, R> visitor);
}

class GetTransactions : IFinancialOperation<IError<IEnumerable<Transaction>>> {
    public Account Account { get; set; }

    public R Accept<R>(IFinancialOperationVisitor<IError<IEnumerable<Transaction>>, R> visitor) {
        return visitor.GetTransactions(this);
    }
}

class PostTransaction : IFinancialOperation<IError<Unit>> {
    public Transaction Transaction { get; set; }

    public R Accept<R>(IFinancialOperationVisitor<IError<Unit>, R> visitor) {
        return visitor.PostTransaction(this);
    }
}

which is essentially the Haskell code

data FinancialOperation a where
     GetTransactions :: Account -> FinancialOperation (Either Error [Transaction])
     PostTransaction :: Transaction -> FinancialOperation (Either Error ())

along with an abstract syntax tree for the construction of actions in a monad, essentially the free monad:

interface IMonadicActionVisitor<in T, out R> {
    R Return(T value);
    R Bind<TIn>(IMonadicAction<TIn> input, Func<TIn, IMonadicAction<T>> projection);
    R Fail(Errors errors);
}    

// Objects to remember the arguments, and pass them to the visitor, just like above

/*
Hopefully I got the variance right on everything for doing this without higher-kinded types,
which is how we used to do this. We now use higher-kinded types in C#; more on that below.
Here, to avoid a higher-kinded type, the AST for monadic actions is included by inheritance
in 
*/

In the real code, there are more of these so that we can remember that something was built by .Select() instead of .SelectMany(), for efficiency. A financial operation, including intermediary computations, still has type IFinancialOperation<T>. The operations are actually performed by an interpreter, which wraps all the database operations in a transaction and handles rolling that transaction back if any component fails. We also use an interpreter for unit testing the code.
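
To make the "description plus interpreter" idea concrete, here is a minimal sketch. It is not the visitor-based design above: each node simply knows how to run itself against an interpreter, and every name in it (Op, IInterpreter, InMemoryInterpreter, the string account keys, decimal amounts, int standing in for Unit) is an assumption made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical interpreter interface: the only place where effects happen.
public interface IInterpreter
{
    List<decimal> GetTransactions(string account);
    void PostTransaction(string account, decimal amount);
}

// A description of a computation; building one performs no IO.
public abstract class Op<T>
{
    public abstract T RunWith(IInterpreter interpreter);

    // Bind: sequence this operation with one that depends on its result.
    public Op<U> SelectMany<U>(Func<T, Op<U>> next) => new Bind<T, U>(this, next);
}

public sealed class Return<T> : Op<T>
{
    private readonly T _value;
    public Return(T value) { _value = value; }
    public override T RunWith(IInterpreter interpreter) => _value;
}

public sealed class Bind<T, U> : Op<U>
{
    private readonly Op<T> _input;
    private readonly Func<T, Op<U>> _next;
    public Bind(Op<T> input, Func<T, Op<U>> next) { _input = input; _next = next; }
    public override U RunWith(IInterpreter interpreter) =>
        _next(_input.RunWith(interpreter)).RunWith(interpreter);
}

public sealed class GetTransactionsOp : Op<List<decimal>>
{
    private readonly string _account;
    public GetTransactionsOp(string account) { _account = account; }
    public override List<decimal> RunWith(IInterpreter interpreter) =>
        interpreter.GetTransactions(_account);
}

public sealed class PostTransactionOp : Op<int> // int stands in for Unit
{
    private readonly string _account;
    private readonly decimal _amount;
    public PostTransactionOp(string account, decimal amount) { _account = account; _amount = amount; }
    public override int RunWith(IInterpreter interpreter)
    {
        interpreter.PostTransaction(_account, _amount);
        return 0;
    }
}

// Interpreter for unit tests: an in-memory ledger instead of a database.
public sealed class InMemoryInterpreter : IInterpreter
{
    private readonly Dictionary<string, List<decimal>> _ledger = new Dictionary<string, List<decimal>>();

    public List<decimal> GetTransactions(string account) =>
        _ledger.TryGetValue(account, out var txs) ? txs : new List<decimal>();

    public void PostTransaction(string account, decimal amount)
    {
        if (!_ledger.ContainsKey(account)) _ledger[account] = new List<decimal>();
        _ledger[account].Add(amount);
    }
}

public static class FinancialPrograms
{
    // Business logic as a value: post an amount, then sum the account.
    public static Op<decimal> DepositAndBalance(string account, decimal amount) =>
        new PostTransactionOp(account, amount)
            .SelectMany(_ => new GetTransactionsOp(account))
            .SelectMany(txs => new Return<decimal>(txs.Sum()));
}
```

A production interpreter would implement the same IInterpreter against the database and could open and roll back a transaction around RunWith; the business logic in FinancialPrograms would not change.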

In our optimization code, we use a monad for controlling IO to get external data for optimization. This allows us to write code that is ignorant of how computations are composed, which lets us use exactly the same business code in multiple settings:

  • synchronous IO and computations for computations done on demand
  • asynchronous IO and computations for many computations done in parallel
  • mocked IO for unit tests

Since the code needs to be passed which monad to use, we need an explicit definition of a monad. Here's one. IEncapsulated<TClass,T> essentially means TClass<T>. This lets the C# compiler keep track of all three pieces of the type of monads simultaneously, overcoming the need to cast when dealing with monads themselves.

public interface IEncapsulated<TClass,out T>
{
    TClass Class { get; }
}

public interface IFunctor<F> where F : IFunctor<F>
{
    // Map
    IEncapsulated<F, B> Select<A, B>(IEncapsulated<F, A> initial, Func<A, B> projection);
}

public interface IApplicativeFunctor<F> : IFunctor<F> where F : IApplicativeFunctor<F>
{
    // Return / Pure
    IEncapsulated<F, A> Return<A>(A value);
    IEncapsulated<F, B> Apply<A, B>(IEncapsulated<F, Func<A, B>> projection, IEncapsulated<F, A> initial);
}

public interface IMonad<M> : IApplicativeFunctor<M> where M : IMonad<M>
{
    // Bind
    IEncapsulated<M, B> SelectMany<A, B>(IEncapsulated<M, A> initial, Func<A, IEncapsulated<M, B>> binding);
    // Bind and project 
    IEncapsulated<M, C> SelectMany<A, B, C>(IEncapsulated<M, A> initial, Func<A, IEncapsulated<M, B>> binding, Func<A, B, C> projection);
}

public interface IMonadFail<M, TError> : IMonad<M> where M : IMonadFail<M, TError> {
    // Fail
    IEncapsulated<M, A> Fail<A>(TError error);
}

Now we could imagine making another class of monad for the portion of IO our computations need to be able to see:

public interface IMonadGetSomething<M> : IMonadFail<M, Error> where M : IMonadGetSomething<M> {
    IEncapsulated<M, Something> GetSomething();
}

Then we can write code that doesn't know how computations are put together:

public class Computations {

    public IEncapsulated<M, IEnumerable<Something>> GetSomethings<M>(IMonadGetSomething<M> monad, int number)
        where M : IMonadGetSomething<M> {
        var result = monad.Return(Enumerable.Empty<Something>());
        // Our developers might still like writing imperative code.
        // The query syntax assumes Select/SelectMany extension methods over IEncapsulated (not shown).
        for (int i = 0; i < number; i++) {
            result = from existing in result
                     from something in monad.GetSomething()
                     select existing.Concat(new []{something});
        }
        return result.Select(x => x.ToList());
    }
}

This can be reused with both a synchronous and an asynchronous implementation of IMonadGetSomething<>. Note that in this code, the GetSomething() calls happen one after another until there's an error, even in an asynchronous setting. (No, this is not how we build lists in real life.)



Answer 3:

You ask "Do we need an IO monad in C#?" but you should ask instead "Do we need a way to reliably obtain purity and immutability in C#?".

The key benefit would be controlling side-effects. Whether you do that using monads or some other mechanism doesn't matter. For example, C# could allow you to mark methods as pure and classes as immutable. That would go a long way towards taming side-effects.

In such a hypothetical version of C# you'd try to make 90% of the computation pure, and have unrestricted, eager IO and side-effects in the remaining 10%. In such a world I do not see so much of a need for absolute purity and an IO monad.

Note that by just mechanically converting side-effecting code to a monadic style you gain nothing. The code does not improve in quality at all. You improve the code quality by being 90% pure, and concentrating the IO into small, easily reviewable places.
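
As a rough sketch of that "mostly pure, with IO in a few small places" shape in today's C#: the Invoicing class and its members are hypothetical names. The [Pure] attribute from System.Diagnostics.Contracts does exist, but it only documents intent; the compiler does not enforce it, which is the gap this answer is pointing at.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Contracts;
using System.Linq;

public static class Invoicing
{
    // Pure core: the result depends only on the arguments.
    // [Pure] is documentation here; nothing stops this method from doing IO.
    [Pure]
    public static decimal Total(IEnumerable<decimal> prices, decimal taxRate) =>
        prices.Sum() * (1 + taxRate);

    // Thin impure shell: all effects are passed in as delegates and used only here,
    // so this is the one place a reviewer needs to inspect for side effects.
    public static void Run(Func<IEnumerable<decimal>> readPrices, Action<string> write)
    {
        var total = Total(readPrices(), 0.25m);
        write($"Total: {total}");
    }
}
```

In a test, Run can be given an in-memory list and a delegate that records the output, while Total can be tested with no setup at all.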



Answer 4:

The ability to know if a function has side effects just by looking at its signature is very useful when trying to understand what the function does. The less a function can do, the less you have to understand! (Polymorphism is another thing that helps restrict what a function can do with its arguments.)

In many languages that implement Software Transactional Memory, the documentation has warnings like the following:

I/O and other activities with side-effects should be avoided in transactions, since transactions will be retried.

Having that warning become a prohibition enforced by the type system can make the language safer.

There are optimizations that can only be performed on code that is free of side effects. But the absence of side effects may be difficult to determine if you "allow anything" in the first place.

Another benefit of the IO monad is that, since IO actions are "inert" unless they lie in the path of the main function, it's easy to manipulate them as data, put them in containers, compose them at runtime, and so on.
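
A small C# sketch of the "inert until run" idea, assuming a hypothetical IO<T> wrapper around Func<T> (this is not a BCL type, and unlike Haskell, nothing prevents code from doing IO outside it):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical IO<T>: a description of an effect that does nothing until Run() is called.
public sealed class IO<T>
{
    private readonly Func<T> _effect;
    public IO(Func<T> effect) { _effect = effect; }

    public T Run() => _effect();

    // Map over the eventual result without running anything.
    public IO<U> Select<U>(Func<T, U> map) =>
        new IO<U>(() => map(_effect()));

    // Bind: sequence this action with one that depends on its result.
    public IO<U> SelectMany<U>(Func<T, IO<U>> next) =>
        new IO<U>(() => next(_effect()).Run());
}

public static class IODemo
{
    // Stands in for a real side effect such as writing to the console.
    public static readonly List<string> Log = new List<string>();

    // Returns an action that, when run, appends to the log and yields the new length.
    public static IO<int> Write(string message) =>
        new IO<int>(() => { Log.Add(message); return Log.Count; });

    // Actions are ordinary values: they can be put in a list and chosen at runtime.
    public static List<IO<int>> BuildActions() =>
        new List<IO<int>> { Write("first"), Write("second") };
}
```

Calling IODemo.BuildActions() leaves Log empty; only the actions that are actually Run() have any effect, and in whatever order the caller chooses.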

Of course, the monadic approach to IO has its disadvantages. But it does have advantages besides "being one of the few ways of doing I/O in a pure lazy language in a flexible and principled manner".



Answer 5:

As always, the IO monad is special and difficult to reason about. It's well known in the Haskell community that while IO is useful, it does not share many of the benefits other monads do. Its use is, as you've remarked, motivated largely by its privileged position rather than by its being a good modeling tool.

With that, I'd say it's not so useful in C# or, really, any language that isn't trying to completely contain side effects with type annotations.

But it's just one monad. As you've mentioned, Failure shows up in LINQ, but more sophisticated monads are useful even in a side-effecting language.

For instance, even with arbitrary global and local state environments, the state monad will indicate both the beginning and end of a regime of actions which work on some privileged kind of state. You don't get the side-effect elimination guarantees Haskell enjoys, but you still get good documentation.
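
A minimal sketch of what such a state monad could look like in C#, assuming a hypothetical State<S, T> delegate (none of these names come from a library). The type signature is what documents that a method works on the threaded state; as noted above, nothing stops the code from touching other state as well.

```csharp
using System;

// Hypothetical State<S, T>: a function from a state to a result plus the next state.
public delegate (T Value, S Next) State<S, T>(S state);

public static class State
{
    public static State<S, U> Select<S, T, U>(this State<S, T> m, Func<T, U> f) =>
        s =>
        {
            var (t, s1) = m(s);
            return (f(t), s1);
        };

    // The overload C# query syntax uses for consecutive "from" clauses.
    public static State<S, V> SelectMany<S, T, U, V>(
        this State<S, T> m, Func<T, State<S, U>> f, Func<T, U, V> project) =>
        s =>
        {
            var (t, s1) = m(s);      // run the first step
            var (u, s2) = f(t)(s1);  // feed its state into the second step
            return (project(t, u), s2);
        };
}

public static class Labels
{
    // Yields the current counter and increments it.
    public static State<int, int> Fresh() => s => (s, s + 1);

    // The return type says: this works on an int counter and produces a string.
    public static State<int, string> Pair() =>
        from a in Fresh()
        from b in Fresh()
        select $"item{a}, item{b}";
}
```

Running Labels.Pair()(10) threads the counter through both steps and returns the label together with the final counter value.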

To go further, introducing something like a Parser monad is a favorite example of mine. Having that monad, even in C#, is a great way to localize things like non-deterministic, backtracking failure performed while consuming a string. You can obviously do that with particular kinds of mutability, but monads express that a particular expression performs a useful action in that effectful regime without regard to any global state you might also be involving.

So, I'd say yes, they're useful in any typed language. But IO as Haskell does it? Maybe not so much.



Answer 6:

In a language like C# where you can do IO anywhere, an IO monad doesn't really have any practical use. The only thing you'd want to use it for is controlling side effects, and since there's nothing stopping you from performing side effects outside the monad, there's not really much point.

As for the Maybe monad, while it seems potentially useful, it only really works in a language with lazy evaluation. In the following Haskell expression, the second lookup isn't evaluated if the first returns Nothing:

doSomething :: String -> Maybe Int
doSomething name = do
    x <- lookup name mapA
    y <- lookup name mapB
    return (x+y)

This allows the expression to "short circuit" when a Nothing is encountered. An implementation in C# would have to perform both lookups (I think; I'd be interested to see a counter-example). You're probably better off with if statements.
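
On the counter-example asked for above, here is a minimal sketch suggesting that laziness is not required. It assumes a hand-written Maybe<T> (not a BCL type) with the SelectMany overload that C# query syntax expects. The second lookup sits inside a lambda, and that lambda is simply never invoked when the first lookup returns Nothing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal Maybe<T>; default(Maybe<T>) represents Nothing.
public readonly struct Maybe<T>
{
    public bool HasValue { get; }
    public T Value { get; }
    private Maybe(T value) { HasValue = true; Value = value; }

    public static Maybe<T> Just(T value) => new Maybe<T>(value);
    public static Maybe<T> Nothing => default;

    // Overload used by query syntax: "from x in a from y in b select ...".
    public Maybe<V> SelectMany<U, V>(Func<T, Maybe<U>> bind, Func<T, U, V> project)
    {
        if (!HasValue) return Maybe<V>.Nothing;   // bind is never called here
        var next = bind(Value);
        return next.HasValue ? Maybe<V>.Just(project(Value, next.Value)) : Maybe<V>.Nothing;
    }
}

public static class MaybeDemo
{
    // Counts how many lookups were actually performed.
    public static int Lookups = 0;

    public static Maybe<int> Lookup(string name, Dictionary<string, int> map)
    {
        Lookups++;
        return map.TryGetValue(name, out var value) ? Maybe<int>.Just(value) : Maybe<int>.Nothing;
    }

    // Mirrors the Haskell doSomething above.
    public static Maybe<int> DoSomething(
        string name, Dictionary<string, int> mapA, Dictionary<string, int> mapB) =>
        from x in Lookup(name, mapA)
        from y in Lookup(name, mapB)
        select x + y;
}
```

When name is missing from mapA, MaybeDemo.Lookups increases by one rather than two, because the compiler turns the second from clause into a lambda passed to SelectMany.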

Another issue is the loss of abstraction. While it's certainly possible to implement monads in C# (or things which look a little bit like monads), you can't really generalise like you can in Haskell because C# doesn't have higher-kinded types. For example, a function like mapM :: Monad m => (a -> m b) -> [a] -> m [b] (which works for any monad) can't really be represented in C#. You could certainly have something like this:

public Maybe<List<b>> mapM<a,b>(Func<a, Maybe<b>> f, List<a> xs);

which would work for a specific monad (Maybe in this case), but it's not possible to abstract away the Maybe from that function. You'd have to be able to do something like this:

public m<List<b>> mapM<m,a,b>(Func<a, m<b>> f, List<a> xs);

which isn't possible in C#.
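
For comparison, the Maybe-specific version can be written out in full. This is only a sketch with a hand-rolled Maybe<T> (a hypothetical type, not from the BCL); an equivalent method would have to be written again for every other monad, which is exactly the lost abstraction described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal Maybe<T>; default(Maybe<T>) represents Nothing.
public readonly struct Maybe<T>
{
    public bool HasValue { get; }
    public T Value { get; }
    private Maybe(T value) { HasValue = true; Value = value; }
    public static Maybe<T> Just(T value) => new Maybe<T>(value);
    public static Maybe<T> Nothing => default;
}

public static class MaybeTraverse
{
    // mapM specialised to Maybe: Just of all results, or Nothing at the first failure.
    public static Maybe<List<B>> MapM<A, B>(Func<A, Maybe<B>> f, IEnumerable<A> xs)
    {
        var results = new List<B>();
        foreach (var x in xs)
        {
            var r = f(x);
            if (!r.HasValue) return Maybe<List<B>>.Nothing; // stop early
            results.Add(r.Value);
        }
        return Maybe<List<B>>.Just(results);
    }

    // Example function to map: parse a string as an int, or Nothing.
    public static Maybe<int> ParseInt(string s) =>
        int.TryParse(s, out var n) ? Maybe<int>.Just(n) : Maybe<int>.Nothing;
}
```

MaybeTraverse.MapM(MaybeTraverse.ParseInt, new[] { "1", "2" }) gives Just of the list 1, 2, while any unparsable element makes the whole result Nothing.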