Breaking out of a nested loop

Posted 2019-01-01 02:20

怪性笑人.

Question:

If I have a for loop which is nested within another, how can I efficiently come out of both loops (inner and outer) in the quickest possible way?

I don't want to use a boolean flag and then jump to another method; I just want execution to continue at the first line of code after the outer loop.

What is a quick and nice way of going about this?

Thanks

I was thinking that exceptions aren't cheap and should only be thrown in a truly exceptional condition. Hence I don't think that solution would be good from a performance perspective.

I don't feel it is right to take advantage of the newer features in .NET (anon methods) to do something which is pretty fundamental.

Because of that, tvon (sorry, can't spell the full username!) has a nice solution.

Marc: Nice use of anon methods, and this too is great, but because I could be in a job where we don't use a version of .NET/C# that supports anon methods, I need to know a traditional approach too.

Answer 1:

Well, goto, but that is ugly, and not always possible. You can also place the loops into a method (or an anon-method) and use return to exit back to the main code.

    // goto
    for (int i = 0; i < 100; i++)
    {
        for (int j = 0; j < 100; j++)
        {
            goto Foo; // yeuck!
        }
    }
Foo:
    Console.WriteLine("Hi");

vs:

// anon-method
Action work = delegate
{
    for (int x = 0; x < 100; x++)
    {
        for (int y = 0; y < 100; y++)
        {
            return; // exits anon-method
        }
    }
};
work(); // execute anon-method
Console.WriteLine("Hi");

Note that C# 7 added "local functions", so the same idea can also be written like this:

// local function (declared **inside** another method)
void Work()
{
    for (int x = 0; x < 100; x++)
    {
        for (int y = 0; y < 100; y++)
        {
            return; // exits local function
        }
    }
}
Work(); // execute local function
Console.WriteLine("Hi");

Answer 2:

A C# adaptation of an approach often used in C: set the outer loop's variable to a value outside the loop condition (for a for loop over an int variable, int.MaxValue - 1 is often a good choice):

for (int i = 0; i < 100; i++)
{
    for (int j = 0; j < 100; j++)
    {
        if (exit_condition)
        {
            // cause the outer loop to break:
            // use int.MaxValue - 1; with int.MaxValue, i++ wraps to int.MinValue (< 100) and the loop continues
            i = int.MaxValue - 1;
            Console.WriteLine("Hi");
            // break the inner loop
            break;
        }
    }
    // if you have code in outer loop it will execute after break from inner loop    
}

As the note in the code says, break does not skip the rest of the current outer iteration, so any code after the inner loop still runs once. If you have such code, this approach needs extra checks; consider other solutions in that case.

This approach works with for and while loops but not with foreach. With foreach you have no access to the hidden enumerator, so you can't change it (and even if you could, IEnumerator has no "MoveToEnd" method).
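The wrap-around mentioned in the code comment can be checked with a small sketch. OverflowDemo, CountIterations and the safety cap of three iterations are hypothetical, added only for illustration; unchecked is spelled out so the result does not depend on the project's overflow settings.

```csharp
using System;

static class OverflowDemo
{
    // Sets the loop variable to 'sentinel' inside the body and reports how many
    // iterations ran. The cap of 3 is only there so the demo always ends quickly.
    public static int CountIterations(int sentinel)
    {
        int iterations = 0;
        for (int i = 0; i < 100; i = unchecked(i + 1))
        {
            iterations++;
            if (iterations == 3) break; // safety cap (hypothetical, demo only)
            i = sentinel;               // try to force the loop to end
        }
        return iterations;
    }
}
```

With int.MaxValue - 1 the increment lands on int.MaxValue and the loop stops after one pass; with int.MaxValue the increment wraps to int.MinValue, the condition still holds, and only the cap ends the loop.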

Acknowledgments to the authors of the inlined comments:

i = INT_MAX - 1 suggestion by Meta

for/foreach comment by ygoe

Proper IntMax by jmbpiano

remark about code after inner loop by blizpasta

Answer 3:

This solution does not apply to C#.

For people who found this question via other languages: JavaScript, Java, and D allow labeled breaks and continues:

outer: while(fn1())
{
   while(fn2())
   {
     if(fn3()) continue outer;
     if(fn4()) break outer;
   }
}
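C# has no labeled break or continue, but the same two jumps can be approximated with goto. The following is only a sketch: LabeledJumps, CountPositiveRowsBeforeZero and the rule it implements (skip rows containing a negative value, stop entirely at the first zero) are made up for illustration.

```csharp
using System;

static class LabeledJumps
{
    // Counts rows that contain only positive values, stopping at the first zero.
    public static int CountPositiveRowsBeforeZero(int[][] rows)
    {
        int count = 0;
        for (int r = 0; r < rows.Length; r++)
        {
            foreach (int value in rows[r])
            {
                if (value == 0) goto Done;    // plays the role of "break outer"
                if (value < 0) goto NextRow;  // plays the role of "continue outer"
            }
            count++;                          // reached only for an all-positive row
        NextRow: ;                            // a label must be followed by a statement
        }
    Done:
        return count;
    }
}
```

The NextRow label sits at the end of the outer loop body, so jumping to it falls through to the loop's increment, which is what continue outer would do.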

Answer 4:

Use a suitable guard in the outer loop. Set the guard in the inner loop before you break.

bool exitedInner = false;

for (int i = 0; i < N && !exitedInner; ++i) {

    // ... some outer loop stuff

    for (int j = 0; j < M; ++j) {

        if (sometest) {
            exitedInner = true;
            break;
        }
    }
    if (!exitedInner) {
       // ... more outer loop stuff
    }
}

Or better yet, abstract the inner loop into a method and exit the outer loop when it returns false.

for (int i = 0; i < N; ++i) {

    // ... some outer loop stuff

    if (!doInner(i, N, M)) {
       break;
    }

    // ... more outer loop stuff
}

Answer 5:

Don't quote me on this, but you could use goto, as suggested on MSDN. There are other solutions, such as a flag that is checked in each iteration of both loops. Finally, you could use an exception as a really heavyweight solution to your problem.

GOTO:

for ( int i = 0; i < 10; ++i ) {
   for ( int j = 0; j < 10; ++j ) {
      // code
      if ( break_condition ) goto End;
      // more code
   }
}
End: ;

Condition:

bool exit = false;
for ( int i = 0; i < 10 && !exit; ++i ) {
   for ( int j = 0; j < 10 && !exit; ++j ) {
      // code
      if ( break_condition ) {
         exit = true;
         break; // or continue
      }
      // more code
   }
}

Exception:

try {
    for ( int i = 0; i < 10; ++i ) {
       for ( int j = 0; j < 10; ++j ) {
          // code
          if ( break_condition ) {
             throw new Exception();
          }
          // more code
       }
    }
}
catch ( Exception ) {}

Answer 6:

Is it possible to refactor the nested for loop into a private method? That way you could simply 'return' out of the method to exit the loop.
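A minimal sketch of that refactoring, assuming the loops search a two-dimensional array. GridSearch and TryFind are hypothetical names, not taken from the question.

```csharp
using System;

static class GridSearch
{
    // Hypothetical helper: reports the position of the first cell equal to target.
    public static bool TryFind(int[,] grid, int target, out int row, out int col)
    {
        for (int r = 0; r < grid.GetLength(0); r++)
        {
            for (int c = 0; c < grid.GetLength(1); c++)
            {
                if (grid[r, c] == target)
                {
                    row = r;
                    col = c;
                    return true; // leaves both loops at once
                }
            }
        }
        row = -1;
        col = -1;
        return false;
    }
}
```

The caller continues at the line after the call, which is the "first line of code after the outer loop" the question asks for.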

Answer 7:

Factor the loops into a function/method and use an early return, or rearrange your loops into a while-clause. goto/exceptions/whatever are certainly not appropriate here.

def do_until_equal():
  foreach a:
    foreach b:
      if a==b: return

Answer 8:

You asked for a combination of quick, nice, no use of a boolean, no use of goto, and C#. You've ruled out all possible ways of doing what you want.

The most quick and least ugly way is to use a goto.

Answer 9:

It seems to me that people dislike the goto statement a lot, so I felt the need to straighten this out a bit.

I believe the 'emotions' people have about goto eventually boil down to how well they understand the code and to (mis)conceptions about possible performance implications. Before answering the question, I will therefore first go into some of the details of how it's compiled.

As we all know, C# is compiled to IL, which is then compiled to assembler using an SSA compiler. I'll give a bit of insight into how this all works, and then try to answer the question itself.

From C# to IL

First we need a piece of C# code. Let's start simple:

foreach (var item in array)
{
    // ... 
    break;
    // ...
}

I'll do this step by step to give you a good idea of what happens under the hood.

First translation: from foreach to the equivalent for loop (Note: I'm using an array here, because I don't want to get into details of IDisposable -- in which case I'd also have to use an IEnumerable):

for (int i=0; i<array.Length; ++i)
{
    var item = array[i];
    // ...
    break;
    // ...
}

Second translation: the for and break are translated into an easier equivalent:

int i=0;
while (i < array.Length)
{
    var item = array[i];
    // ...
    break;
    // ...
    ++i;
}

And third translation (this is the equivalent of the IL code): we change break and while into a branch:

    int i=0; // for initialization

startLoop:
    if (i >= array.Length) // for condition
    {
        goto exitLoop;
    }
    var item = array[i];
    // ...
    goto exitLoop; // break
    // ...
    ++i;           // for post-expression
    goto startLoop;

exitLoop:
    // ...

While the compiler does these things in a single step, it gives you insight into the process. The IL code that evolves from the C# program is the literal translation of the last C# code. You can see for yourself here: https://dotnetfiddle.net/QaiLRz (click 'view IL')

Now, one thing you may have observed here is that during the process, the code becomes more complex. The easiest way to observe this is by the fact that we needed more and more code to accomplish the same thing. You might also argue that foreach, for, while and break are actually shorthands for goto, which is partly true.

From IL to Assembler

The .NET JIT compiler is an SSA compiler. I won't go into all the details of SSA form here and how to create an optimizing compiler, it's just too much, but I can give a basic understanding of what will happen. For a deeper understanding, it's best to start reading up on optimizing compilers (I do like this book for a brief introduction: http://ssabook.gforge.inria.fr/latest/book.pdf ) and LLVM (llvm.org).

Every optimizing compiler relies on the fact that code is easy and follows predictable patterns. In the case of for loops, we use graph theory to analyze branches, and then optimize things like cycles in our branches (e.g. backward branches).

However, we now have forward branches to implement our loops. As you might have guessed, this is actually one of the first steps the JIT is going to fix, like this:

    int i=0; // for initialization

    if (i >= array.Length) // for condition
    {
        goto endOfLoop;
    }

startLoop:
    var item = array[i];
    // ...
    goto endOfLoop; // break
    // ...
    ++i;           // for post-expression

    if (i < array.Length) // for condition
    {
        goto startLoop;
    }

endOfLoop:
    // ...

As you can see, we now have a backward branch, which is our little loop. The only thing that's still nasty here is the branch that we ended up with due to our break statement. In some cases, we can move this in the same way, but in others it's there to stay.

So why does the compiler do this? Well, if we can unroll the loop, we might be able to vectorize it. We might even be able to prove that there are just constants being added, which means our whole loop could vanish into thin air. To summarize: by making the patterns predictable (by making the branches predictable), we can prove that certain conditions hold in our loop, which means we can do magic during the JIT optimization.

However, branches tend to break those nice predictable patterns, which is why optimizers kind of dislike them. Break, continue, goto: they all tend to break these predictable patterns, and are therefore not really 'nice'.

You should also realize at this point that a simple foreach is more predictable than a bunch of goto statements that go all over the place. Both in terms of (1) readability and (2) from an optimizer's perspective, it is the better solution.

Another thing worth mentioning is that it's very relevant for optimizing compilers to assign registers to variables (a process called register allocation). As you might know, there's only a finite number of registers in your CPU and they are by far the fastest pieces of memory in your hardware. Variables used in code that's in the innermost loop are more likely to get a register assigned, while variables outside of your loop are less important (because this code is probably hit less).

Help, too much complexity... what should I do?

The bottom line is that you should always use the language constructs you have at your disposal, which will usually (implicitly) build predictable patterns for your compiler. Try to avoid strange branches if possible (specifically: break, continue, goto or a return in the middle of nothing).

The good news here is that these predictable patterns are both easy to read (for humans) and easy to spot (for compilers).

One of those patterns is called SESE, which stands for Single Entry Single Exit.

And now we get to the real question.

Imagine that you have something like this:

// a is a variable.

for (int i=0; i<100; ++i) 
{
  for (int j=0; j<100; ++j)
  {
     // ...

     if (i*j > a) 
     {
        // break everything
     }
  }
}

The easiest way to make this a predictable pattern is to simply eliminate the if completely:

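int i, j = 0; // j must be assigned before the outer condition reads it
for (i=0; i<100 && i*j <= a; ++i) 
{
  for (j=0; j<100 && i*j <= a; ++j)
  {
     // ...
  }
}
// Caveat: after a full inner pass j is 100, so the outer test (i*100 <= a)
// is stricter than the original if and can end the loops a little earlier.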

In other cases you can also split the method into 2 methods:

// Outer loop in method 1:

for (int i=0; i<100 && processInner(i); ++i) 
{
}

private bool processInner(int i)
{
  int j;
  for (j=0; j<100 && i*j <= a; ++j)
  {
     // ...
  }
  return j == 100; // true only if the inner loop ran to completion
}

Temporary variables? Good, bad or ugly?

You might even decide to return a boolean from within the loop (but I personally prefer the SESE form because that's how the compiler will see it and I think it's cleaner to read).

Some people think it's cleaner to use a temporary variable, and propose a solution like this:

bool more = true;
for (int i=0; i<100; ++i) 
{
  for (int j=0; j<100; ++j) 
  {
     // ...
     if (i*j > a) { more = false; break; } // yuck.
     // ...
  }
  if (!more) { break; } // yuck.
  // ...
}
// ...

I personally am opposed to this approach. Look again at how the code is compiled. Now think about what this will do to these nice, predictable patterns. Get the picture?

Right, let me spell it out. What will happen is that:

The compiler will write out everything as branches.
As an optimization step, the compiler will do data flow analysis in an attempt to remove the strange more variable that only happens to be used in control flow.
If successful, the variable more will be eliminated from the program, and only branches remain. These branches will be optimized, so you will get only a single branch out of the inner loop.
If unsuccessful, the variable more is definitely used in the innermost loop, so if the compiler won't optimize it away, it has a high chance of being allocated to a register (which eats up valuable register memory).

So, to summarize: the optimizer in your compiler will go to a hell of a lot of trouble to figure out that more is only used for the control flow, and in the best case scenario will translate it to a single branch outside of the outer for loop.

In other words, the best case scenario is that it will end up with the equivalent of this:

for (int i=0; i<100; ++i) 
{
  for (int j=0; j<100; ++j)
  {
     // ...
     if (i*j > a) { goto exitLoop; } // perhaps add a comment
     // ...
  }
  // ...
}
exitLoop:

// ...

My personal opinion on this is quite simple: if this is what we intended all along, let's make the world easier for both the compiler and readability, and write that right away.

tl;dr:

Bottom line:

Use a simple condition in your for loop if possible. Stick to the high-level language constructs you have at your disposal as much as possible.
If everything fails and you're left with either goto or bool more, prefer the former.

Answer 10:

It's sometimes nice to abstract the code into its own function and then use an early return - early returns are evil though : )

public void GetIndexOf(Transform transform, out int outX, out int outY)
{
    outX = -1;
    outY = -1;

    for (int x = 0; x < Columns.Length; x++)
    {
        var column = Columns[x];

        for (int y = 0; y < column.Transforms.Length; y++)
        {
            if(column.Transforms[y] == transform)
            {
                outX = x;
                outY = y;

                return;
            }
        }
    }
}

Answer 11:

Depending on your situation, you may be able to do this, but only if you're not executing code AFTER the inner loop.

for (int i = 0; i < 100; i++)
{
    for (int j = 0; j < 100; j++)
    {
        i = 100;
        break;
    }
}

It's not elegant, but it may be the easiest solution depending on your problem.

Answer 12:

I've seen a lot of examples that use "break" but none that use "continue".

It still would require a flag of some sort in the inner loop:

while( some_condition )
{
    // outer loop stuff
    ...

    bool get_out = false;
    for(...)
    {
        // inner loop stuff
        ...

        get_out = true;
        break;
    }

    if( get_out )
    {
        some_condition=false;
        continue;
    }

    // more outer loop stuff
    ...

}

Answer 13:

Since I first saw break in C a couple of decades back, this problem has vexed me. I was hoping some language enhancement would have an extension to break which would work thus:

break; // our trusty friend, breaks out of current looping construct.
break 2; // breaks out of the current and its parent looping construct.
break 3; // breaks out of 3 looping constructs.
break all; // totally decimates any looping constructs in force.

Answer 14:

I remember from my student days that it was said to be mathematically provable that you can do anything in code without a goto (i.e. there is no situation where goto is the only answer). So, I never use gotos (just my personal preference, not suggesting that I'm right or wrong).

Anyways, to break out of nested loops I do something like this:

var isDone = false;
foreach (var x in collectionX) {
    foreach (var y in collectionY) {
        foreach (var z in collectionZ) {
            if (conditionMet) {
                // some code
                isDone = true;
            }
            if (isDone)
                break;
        }
        if (isDone) 
            break;
    }
    if (isDone)
        break;
}

... I hope that helps those who, like me, are anti-goto "fanboys" :)

Answer 15:

That's how I did it. Still a workaround.

foreach (var substring in substrings) {
  //To be used to break from 1st loop.
  int breaker=1;
  foreach (char c in substring) {
    if (char.IsLetter(c)) {
      Console.WriteLine(line.IndexOf(c));
      //setting condition to break from 1st loop.
      breaker=9;
      break;
    }
  }
  if (breaker==9) {
    break;
  }
}

Answer 16:

Throw a custom exception that takes you out of the outer loop.

It works for for, foreach, while or any other kind of loop, and in any language that has try/catch exception blocks.

try 
{
   foreach (object o in list)
   {
      foreach (object another in otherList)
      {
         // ... some stuff here
         if (condition)
         {
            throw new CustomException();
         }
      }
   }
}
catch (CustomException)
{
   // log 
}
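The snippet above does not define its exception type. A self-contained sketch might look like the following; LoopExitException, ExceptionExit and CountPairsUntilMatch are hypothetical names chosen for illustration. As the question itself notes, exceptions are not cheap, so this is shown for completeness rather than as a recommendation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception type, used only as a signal to unwind out of the loops.
sealed class LoopExitException : Exception { }

static class ExceptionExit
{
    // Returns how many pairs were visited before the first equal pair was found.
    public static int CountPairsUntilMatch(IEnumerable<int> list, IEnumerable<int> otherList)
    {
        int visited = 0;
        try
        {
            foreach (int o in list)
            {
                foreach (int another in otherList)
                {
                    if (o == another)
                    {
                        throw new LoopExitException(); // leaves both loops
                    }
                    visited++;
                }
            }
        }
        catch (LoopExitException)
        {
            // Expected: the exception is only the exit signal, nothing to handle.
        }
        return visited;
    }
}
```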

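Answer 17:

        bool breakInnerLoop = false;
        for (int i = 0; i <= 10; i++)
        {
            for (int j = 0; j <= 10; j++)
            {
                if (i <= j)
                {
                    breakInnerLoop = true;
                    break;
                }
            }
            if (breakInnerLoop)
            {
                break; // leave the outer loop too (continue would only start its next iteration)
            }
        }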

Answer 18:

I see you accepted the answer that points you to the goto statement. In modern programming, and in expert opinion, goto is considered a killer, for reasons I will not discuss here. The solution to your question is very simple: you can use a Boolean flag in this kind of scenario, as I demonstrate in my example:

            for (; j < 10; j++)
            {
                //solution
                bool breakme = false;
                for (int k = 1; k < 10; k++)
                {
                   //place the condition where you want to stop it
                    if (stopCondition) // stopCondition is a placeholder for your own test
                    {
                        breakme = true;
                        break;
                    }
                }

                if(breakme)
                    break;
               }

simple and plain. :)

Answer 19:

Did you even look at the break keyword? O.o

This is just pseudo-code, but you should be able to see what I mean:

<?php
for(...) {
    while(...) {
        foreach(...) {
            break 3;
        }
    }
}

If you think of break as a function like break(), then its parameter would be the number of loops to break out of. As we are in the third loop in the code here, we can break out of all three.

Manual: http://php.net/break

Answer 20:

I think unless you want to do the "boolean thing" the only solution is actually to throw. Which you obviously shouldn't do..!

Tags: c# for-loop nested-loops