Scope of Linq Expressions defined in a loop - At i

2019-07-23 14:33发布

I have a scope question regarding Linq expressions that are defined in a loop. The following LinqPad C# Program demonstrates the behaviour:

void Main()
{
    string[] data=new string[] {"A1", "B1", "A2", "B2" };
    string[] keys=new string[] {"A", "B" };

    List<Result> results=new List<Result>();

    foreach (string key in keys) {
        IEnumerable<string> myData=data.Where (x => x.StartsWith(key));     
        results.Add(new Result() { Key=key, Data=myData});          
    }   
    results.Dump();
}

// Define other methods and classes here
class Result {
    public string Key { get; set; }
    public IEnumerable<string> Data { get; set; }
}

Basically, "A" should have data [A1, A2] and "B" should have data [B1, B2].

However, when you run this "A" gets data [B1, B2] as does B. I.e. the last expression is evaluated for all instances of Result.

Given that I declared "myData" inside the loop, why is it behaving as if I declared it outside the loop? EG it is acting like I would expect if I did this:

void Main()
{
    string[] data=new string[] {"A1", "B1", "A2", "B2" };
    string[] keys=new string[] {"A", "B" };

    List<Result> results=new List<Result>();

    IEnumerable<string> myData;                 
    foreach (string key in keys) {
        myData=data.Where (x => x.StartsWith(key));     
        results.Add(new Result() { Key=key, Data=myData});          
    }   
    results.Dump();
}

// Define other methods and classes here
class Result {
    public string Key { get; set; }
    public IEnumerable<string> Data { get; set; }
}

I get the desired result if I force the evaluation inside the iteration, that is not my question.

I'm asking why "myData" is seemingly shared across iterations given that I declared it within scope of a single iteration?

Somebody call Jon Skeet... ;^)

1条回答
我命由我不由天
2楼-- · 2019-07-23 14:57

It's not myData which is being shared - it's key. And as the values within myData are evaluated lazily, they depend on the current value of key.

It's behaving that way because the scope of the iteration variable is the whole loop, not each iteration of the loop. You've got a single key variable whose value changes, and it's the variable which is captured by the lambda expression.

The correct fix is just to copy the iteration variable into a variable within the loop:

foreach (string key in keys) {
    String keyCopy = key;
    IEnumerable<string> myData = data.Where (x => x.StartsWith(keyCopy));     
    results.Add(new Result() { Key = key, Data = myData});
}

For more information about this problem, see Eric Lippert's blog post "Closing over the loop variable considered harmful": part one, part two.

It's an unfortunate artifact of the way the language has been designed, but changing it now would be a bad idea IMO. Although any code which changed behaviour would basically have been broken beforehand, it would mean that correct code in (say) C# 6 would be valid but incorrect code in C# 5, and that's a dangerous position to be in.

查看更多
登录 后发表回答