Why are lambda expressions not “interned”?

2019-02-08 10:48发布

Strings are reference types, but they are immutable. This allows for them to be interned by the compiler; everywhere the same string literal appears, the same object may be referenced.

Delegates are also immutable reference types. (Adding a method to a multicast delegate using the += operator constitutes assignment; that's not mutability.) And, like, strings, there is a "literal" way to represent a delegate in code, using a lambda expression, e.g.:

Func<int> func = () => 5;

The right-hand side of that statement is an expression whose type is Func<int>; but nowhere am I explicitly invoking the Func<int> constructor (nor is an implicit conversion happening). So I view this as essentially a literal. Am I mistaken about my definition of "literal" here?

Regardless, here's my question. If I have two variables for, say, the Func<int> type, and I assign identical lambda expressions to both:

Func<int> x = () => 5;
Func<int> y = () => 5;

...what's preventing the compiler from treating these as the same Func<int> object?

I ask because section 6.5.1 of the C# 4.0 language specification clearly states:

Conversions of semantically identical anonymous functions with the same (possibly empty) set of captured outer variable instances to the same delegate types are permitted (but not required) to return the same delegate instance. The term semantically identical is used here to mean that execution of the anonymous functions will, in all cases, produce the same effects given the same arguments.

This surprised me when I read it; if this behavior is explicitly allowed, I would have expected for it to be implemented. But it appears not to be. This has in fact gotten a lot of developers into trouble, esp. when lambda expressions have been used to attach event handlers successfully without being able to remove them. For example:

class EventSender
{
    public event EventHandler Event;
    public void Send()
    {
        EventHandler handler = this.Event;
        if (handler != null) { handler(this, EventArgs.Empty); }
    }
}

class Program
{
    static string _message = "Hello, world!";

    static void Main()
    {
        var sender = new EventSender();
        sender.Event += (obj, args) => Console.WriteLine(_message);
        sender.Send();

        // Unless I'm mistaken, this lambda expression is semantically identical
        // to the one above. However, the handler is not removed, indicating
        // that a different delegate instance is constructed.
        sender.Event -= (obj, args) => Console.WriteLine(_message);

        // This prints "Hello, world!" again.
        sender.Send();
    }
}

Is there any reason why this behavior—one delegate instance for semantically identical anonymous methods—is not implemented?

6条回答
看我几分像从前
2楼-- · 2019-02-08 11:14

Lambdas can't be interned because they use an object to contain the captured local variables. And this instance is different every time you you construct the delegate.

查看更多
Emotional °昔
3楼-- · 2019-02-08 11:18

In general, there is no distinction between string variables which refer to the same String instance, versus two variables which refer to different strings that happen to contain the same sequence of characters. The same is not true of delegates.

Suppose I have two delegates, assigned to two different lambda expressions. I then subscribe both delegates to an event handler, and unsubscribe one. What should be the result?

It would be useful if there were a way in vb or C# to designate that an anonymous method or lambda which does not make reference to Me/this should be regarded as a static method, producing a single delegate which could be reused throughout the life of the application. There is no syntax to indicate that, however, and for the compiler to decide to have different lambda expressions return the same instance would be a potentially breaking change.

EDIT I guess the spec allows it, even though it could be a potentially breaking change if any code were to rely upon the instances being distinct.

查看更多
你好瞎i
4楼-- · 2019-02-08 11:19

The other answers bring up good points. Mine really doesn't have to do with anything technical - Every feature starts out with -100 points.

查看更多
放我归山
5楼-- · 2019-02-08 11:20

You're mistaken to call it a literal, IMO. It's just an expression which is convertible to a delegate type.

Now as for the "interning" part - some lambda expressions are cached , in that for one single lambda expression, sometimes a single instance can be created and reused however often that line of code is encountered. Some are not treated that way: it usually depends on whether the lambda expression captures any non-static variables (whether that's via "this" or local to the method).

Here's an example of this caching:

using System;

class Program
{
    static void Main()
    {
        Action first = GetFirstAction();
        first -= GetFirstAction();
        Console.WriteLine(first == null); // Prints True

        Action second = GetSecondAction();
        second -= GetSecondAction();
        Console.WriteLine(second == null); // Prints False
    }

    static Action GetFirstAction()
    {
        return () => Console.WriteLine("First");
    }

    static Action GetSecondAction()
    {
        int i = 0;
        return () => Console.WriteLine("Second " + i);
    }
}

In this case we can see that the first action was cached (or at least, two equal delegates were produced, and in fact Reflector shows that it really is cached in a static field). The second action created two unequal instances of Action for the two calls to GetSecondAction, which is why "second" is non-null at the end.

Interning lambdas which appear in different places in the code but with the same source code is a different matter. I suspect it would be quite complex to do this properly (after all, the same source code can mean different things in different places) and I would certainly not want to rely on it taking place. If it's not going to be worth relying on, and it's a lot of work to get right for the compiler team, I don't think it's the best way they could be spending their time.

查看更多
小情绪 Triste *
6楼-- · 2019-02-08 11:21

This is allowed because the C# team cannot control this. They heavily rely on the implementation details of delegates (CLR + BCL) and the JIT compiler's optimizer. There already is quite a proliferation of CLR and jitter implementations right now and there is little reason to assume that's going to end. The CLI spec is very light on rules about delegates, not nearly strong enough to ensure all these different teams will end up with an implementation that guarantees that delegate object equality is consistent. Not in the least because that would hamper future innovation. There's lots to optimize here.

查看更多
小情绪 Triste *
7楼-- · 2019-02-08 11:26

Is there any reason why this behavior—one delegate instance for semantically identical anonymous methods—is not implemented?

Yes. Because spending time on a tricky optimization that benefits almost no one takes time away from designing, implementing, testing and maintaining features that do benefit people.

查看更多
登录 后发表回答