I have a scope question regarding LINQ expressions that are defined in a loop. The following LINQPad C# program demonstrates the behaviour:
    void Main()
    {
        string[] data = new string[] { "A1", "B1", "A2", "B2" };
        string[] keys = new string[] { "A", "B" };
        List<Result> results = new List<Result>();
        foreach (string key in keys) {
            IEnumerable<string> myData = data.Where(x => x.StartsWith(key));
            results.Add(new Result() { Key = key, Data = myData });
        }
        results.Dump();
    }
    // Define other methods and classes here
    class Result {
        public string Key { get; set; }
        public IEnumerable<string> Data { get; set; }
    }
Basically, "A" should have data [A1, A2] and "B" should have data [B1, B2].
However, when you run this, "A" gets data [B1, B2], as does "B". I.e. the last expression is evaluated for all instances of Result.
Given that I declared "myData" inside the loop, why is it behaving as if I declared it outside the loop? E.g. it is acting as I would expect if I did this:
    void Main()
    {
        string[] data = new string[] { "A1", "B1", "A2", "B2" };
        string[] keys = new string[] { "A", "B" };
        List<Result> results = new List<Result>();
        IEnumerable<string> myData;
        foreach (string key in keys) {
            myData = data.Where(x => x.StartsWith(key));
            results.Add(new Result() { Key = key, Data = myData });
        }
        results.Dump();
    }
    // Define other methods and classes here
    class Result {
        public string Key { get; set; }
        public IEnumerable<string> Data { get; set; }
    }
I get the desired result if I force the evaluation inside the iteration, but that is not my question.
I'm asking why "myData" is seemingly shared across iterations, given that I declared it within the scope of a single iteration?
Somebody call Jon Skeet... ;^)
It's not `myData` which is being shared - it's `key`. And as the values within `myData` are evaluated lazily, they depend on the current value of `key`.
It's behaving that way because the scope of the iteration variable is the whole loop, not each iteration of the loop. You've got a single `key` variable whose value changes, and it's the variable which is captured by the lambda expression.
The correct fix is just to copy the iteration variable into a variable within the loop:
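A minimal sketch of that fix, written as a standalone console program rather than a LINQPad script. The names `keyCopy`, `BuildResults` and `LoopCaptureFix` are made up for illustration and are not from the original post. Because `keyCopy` is declared inside the loop body, each iteration gets its own variable, and each lambda captures a different one. Note that from C# 5 onward the compiler gives `foreach` a fresh iteration variable per iteration, so the copy only changes the outcome on older compilers; it is harmless on newer ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Result
{
    public string Key { get; set; }
    public IEnumerable<string> Data { get; set; }
}

static class LoopCaptureFix
{
    // Hypothetical helper: same logic as the question's loop, plus the copy.
    public static List<Result> BuildResults(string[] data, string[] keys)
    {
        var results = new List<Result>();
        foreach (string key in keys)
        {
            // Declared inside the loop body, so every iteration has its own
            // variable and every lambda captures a different one.
            string keyCopy = key;
            IEnumerable<string> myData = data.Where(x => x.StartsWith(keyCopy));
            results.Add(new Result { Key = key, Data = myData });
        }
        return results;
    }

    static void Main()
    {
        string[] data = { "A1", "B1", "A2", "B2" };
        string[] keys = { "A", "B" };
        foreach (Result r in BuildResults(data, keys))
        {
            // The query is still lazy: it only runs here, after the loop
            // that built it has finished.
            Console.WriteLine(r.Key + ": " + string.Join(", ", r.Data));
        }
        // Prints:
        // A: A1, A2
        // B: B1, B2
    }
}
```

The queries stay deferred with this approach; only the captured variable changes, so nothing is evaluated earlier than before.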
For more information about this problem, see Eric Lippert's blog post "Closing over the loop variable considered harmful": part one, part two.
It's an unfortunate artifact of the way the language has been designed, but changing it now would be a bad idea IMO. Although any code which changed behaviour would basically have been broken beforehand, it would mean that correct code in (say) C# 6 would be valid but incorrect code in C# 5, and that's a dangerous position to be in.