Is there a reason for C#'s reuse of the variab

2018-12-31 03:05发布

When using lambda expressions or anonymous methods in C#, we have to be wary of the access to modified closure pitfall. For example:

foreach (var s in strings)
{
   query = query.Where(i => i.Prop == s); // access to modified closure
   ...
}

Due to the modified closure, the above code will cause all of the Where clauses on the query to be based on the final value of s.

As explained here, this happens because the s variable declared in foreach loop above is translated like this in the compiler:

string s;
while (enumerator.MoveNext())
{
   s = enumerator.Current;
   ...
}

instead of like this:

while (enumerator.MoveNext())
{
   string s;
   s = enumerator.Current;
   ...
}

As pointed out here, there are no performance advantages to declaring a variable outside the loop, and under normal circumstances the only reason I can think of for doing this is if you plan to use the variable outside the scope of the loop:

string s;
while (enumerator.MoveNext())
{
   s = enumerator.Current;
   ...
}
var finalString = s;

However variables defined in a foreach loop cannot be used outside the loop:

foreach(string s in strings)
{
}
var finalString = s; // won't work: you're outside the scope.

So the compiler declares the variable in a way that makes it highly prone to an error that is often difficult to find and debug, while producing no perceivable benefits.

Is there something you can do with foreach loops this way that you couldn't if they were compiled with an inner-scoped variable, or is this just an arbitrary choice that was made before anonymous methods and lambda expressions were available or common, and which hasn't been revised since then?

5条回答
无色无味的生活
2楼-- · 2018-12-31 03:23

The compiler declares the variable in a way that makes it highly prone to an error that is often difficult to find and debug, while producing no perceivable benefits.

Your criticism is entirely justified.

I discuss this problem in detail here:

Closing over the loop variable considered harmful

Is there something you can do with foreach loops this way that you couldn't if they were compiled with an inner-scoped variable? or is this just an arbitrary choice that was made before anonymous methods and lambda expressions were available or common, and which hasn't been revised since then?

The latter. The C# 1.0 specification actually did not say whether the loop variable was inside or outside the loop body, as it made no observable difference. When closure semantics were introduced in C# 2.0, the choice was made to put the loop variable outside the loop, consistent with the "for" loop.

I think it is fair to say that all regret that decision. This is one of the worst "gotchas" in C#, and we are going to take the breaking change to fix it. In C# 5 the foreach loop variable will be logically inside the body of the loop, and therefore closures will get a fresh copy every time.

The for loop will not be changed, and the change will not be "back ported" to previous versions of C#. You should therefore continue to be careful when using this idiom.

查看更多
残风、尘缘若梦
3楼-- · 2018-12-31 03:34

What you are asking is thoroughly covered by Eric Lippert in his blog post Closing over the loop variable considered harmful and its sequel.

For me, the most convincing argument is that having new variable in each iteration would be inconsistent with for(;;) style loop. Would you expect to have a new int i in each iteration of for (int i = 0; i < 10; i++)?

The most common problem with this behavior is making a closure over iteration variable and it has an easy workaround:

foreach (var s in strings)
{
    var s_for_closure = s;
    query = query.Where(i => i.Prop == s_for_closure); // access to modified closure

My blog post about this issue: Closure over foreach variable in C#.

查看更多
刘海飞了
4楼-- · 2018-12-31 03:35

Having been bitten by this, I have a habit of including locally defined variables in the innermost scope which I use to transfer to any closure. In your example:

foreach (var s in strings)
{
    query = query.Where(i => i.Prop == s); // access to modified closure

I do:

foreach (var s in strings)
{
    string search = s;
    query = query.Where(i => i.Prop == search); // New definition ensures unique per iteration.

Once you have that habit, you can avoid it in the very rare case you actually intended to bind to the outer scopes. To be honest, I don't think I have ever done so.

查看更多
高级女魔头
5楼-- · 2018-12-31 03:36

In C# 5.0, this problem is fixed and you can close over loop variables and get the results you expect.

The language specification says:

8.8.4 The foreach statement

(...)

A foreach statement of the form

foreach (V v in x) embedded-statement

is then expanded to:

{
  E e = ((C)(x)).GetEnumerator();
  try {
      while (e.MoveNext()) {
          V v = (V)(T)e.Current;
          embedded-statement
      }
  }
  finally {
      … // Dispose e
  }
}

(...)

The placement of v inside the while loop is important for how it is captured by any anonymous function occurring in the embedded-statement. For example:

int[] values = { 7, 9, 13 };
Action f = null;
foreach (var value in values)
{
    if (f == null) f = () => Console.WriteLine("First value: " + value);
}
f();

If v was declared outside of the while loop, it would be shared among all iterations, and its value after the for loop would be the final value, 13, which is what the invocation of f would print. Instead, because each iteration has its own variable v, the one captured by f in the first iteration will continue to hold the value 7, which is what will be printed. (Note: earlier versions of C# declared v outside of the while loop.)

查看更多
只靠听说
6楼-- · 2018-12-31 03:36

In my opinion it's strange question. It's good to know how compiler works but that only "good to know".

If you write code that depends on algorithm of compiler it's bad practice.And it's better to rewrite code to exclude this dependency.

This is good question for job interview. But in real life I'v not faced with any problems that I solved on job interview.

The 90% of foreach uses for process each element of collection (not for select or calculate some values). sometimes you need to compute some values inside the loop but it's not good practice to create BIG loop.

It's better to use LINQ expressions for computing the values. Because when you calculate a lot of thing inside the loop, after 2-3 month when you (or anybody else) will read this code person will not understand what is this and how it should works.

查看更多
登录 后发表回答