Code blocks Ver.16.01 crashing during run cycle of

Posted 2019-09-16 00:56

Question:

I have a program which is known to run on an older version of Codeblocks (ver 13.12) but does not seem to work when I try it on the newer version (ver 16.01). The purpose of the program is to read two integers, which are then added, subtracted, multiplied and divided. It uses asm code, which I am new to. My question is: why does Windows say the program has stopped responding after I type 2 integers and press Enter?

Here is the code:

//Program 16

#include <stdio.h>
#include <iostream>
using namespace std;

int main() {

    int arg1, arg2, add, sub, mul, quo, rem;

    cout << "Enter two integer numbers : ";
    cin >> arg1 >> arg2;
    cout << endl;

    asm ( "addl %%ebx, %%eax;" : "=a" (add) : "a" (arg1) , "b" (arg2) );
    asm ( "subl %%ebx, %%eax;" : "=a" (sub) : "a" (arg1) , "b" (arg2) );
    asm ( "imull %%ebx, %%eax;" : "=a" (mul) : "a" (arg1) , "b" (arg2) );

    asm ( "movl $0x0, %%edx;"
          "movl %2, %%eax;"
          "movl %3, %%ebx;"
          "idivl %%ebx;" : "=a" (quo), "=d" (rem) : "g" (arg1), "g" (arg2) );

    cout << arg1 << "+" << arg2 << " = " << add << endl;
    cout << arg1 << "-" << arg2 << " = " << sub << endl;
    cout << arg1 << "x" << arg2 << " = " << mul << endl;
    cout << arg1 << "/" << arg2 << " = " << quo << "  ";
    cout << "remainder " << rem << endl;

    return 0;
}

Answer 1:

As Michael has said, your problem probably comes from your 4th asm statement being written incorrectly.

The first thing you need to understand when writing inline asm is what registers are and how they are used. Registers are a fundamental concept in x86 assembler programming, so if you don't know what they are, it's time for you to find an x86 assembly language primer.

Once you've got that, you need to understand that when the compiler runs, it is using those registers in the code it generates. For example, if you do for (int x=0; x<10; x++), x is (probably) going to end up in a register. So what happens if gcc decides to use ebx to hold the value of 'x', and then your asm statement stomps on ebx, putting some other value in it? gcc doesn't 'parse' your asm to figure out what you are doing. The only clues it has about what your asm does are the constraints listed after the asm instructions.

That's what Michael means when he says "the 4th ASM block doesn't list "EBX" in the clobber list (but its contents are destroyed)". If we look at your asm:

asm ("movl $0x0, %%edx;"
     "movl %2, %%eax;"
     "movl %3, %%ebx;"
     "idivl %%ebx;" 
  : "=a" (quo), "=d" (rem) 
  : "g" (arg1), "g" (arg2));

You see that the 3rd line is moving a value into ebx, but there's nothing in the constraints that follow to say that it is going to be changed. The fact that your program is crashing is probably due to gcc using that register for something else. The simplest fix might be to "list EBX in the clobber list":

asm ("movl $0x0, %%edx;"
     "movl %2, %%eax;"
     "movl %3, %%ebx;"
     "idivl %%ebx;" 
  : "=a" (quo), "=d" (rem) 
  : "g" (arg1), "g" (arg2)
  : "ebx");

This tells gcc that ebx may be changed by the asm (aka it 'clobbers' it), and that it doesn't need to have any particular value when the asm statement begins, and won't have any particular value in it when the asm exits.

However, while that may be 'simplest,' it isn't necessarily the best. For example instead of using the "g" constraint for arg2, we can use the "b" constraint:

asm ("movl $0x0, %%edx;"
     "movl %2, %%eax;"
     "idivl %%ebx;" 
  : "=a" (quo), "=d" (rem) 
  : "g" (arg1), "b" (arg2));

This lets us get rid of the movl %3, %%ebx statement, since gcc will ensure the value is in ebx before calling the asm, and we don't need to clobber it anymore.

But why use ebx? idiv doesn't require any particular register there, and maybe gcc is already using ebx for something else. How about letting gcc just pick some register it isn't using? We do this using the "r" constraint:

asm ("movl $0x0, %%edx;"
     "movl %2, %%eax;"
     "idivl %3;" 
  : "=a" (quo), "=d" (rem) 
  : "g" (arg1), "r" (arg2));

Notice that the idiv now uses %3, which means "use the thing that is in the (zero-based) parameter #3." In this case, that's the register that contains arg2.

However, we can still do better. As you have already seen in your previous asm statements, you can use the "a" constraint to tell gcc to put a particular variable into the eax register. Which means we can do this:

asm ("movl $0x0, %%edx;"
     "idivl %3;" 
  : "=a" (quo), "=d" (rem) 
  : "a" (arg1), "r" (arg2));

Again, 1 fewer instruction since we don't need to move the value into eax anymore. So how about that movl $0x0, %%edx thing? Well, we can get rid of that too:

asm ("idivl %3"
  : "=a" (quo), "=d" (rem) 
  : "a" (arg1), "r" (arg2), "d" (0));

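This uses the "d" constraint to put 0 into edx before executing the asm. (Like the original code, this zeroes edx rather than sign-extending eax into it, so the result is only correct when arg1 is non-negative.) That brings us to my final version: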

asm ("idivl %3"
  : "=a" (quo), "=d" (rem) 
  : "a" (arg1), "r" (arg2), "d" (0)
  : "cc");

This says:

  • On input, put arg1 into eax, arg2 into some register (that we'll refer to using %3), and 0 into edx.
  • On output, eax will contain the quotient, edx will contain the remainder. This is how the idiv instruction works.
  • The "cc" clobber tells gcc that your asm modifies the flags register (eflags), which idiv does as a side effect.
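For reference, here is a minimal sketch of that final statement wrapped in a function so it can be tried in isolation. The names DivResult and divide_nonneg are made up for this example, and it assumes GCC or Clang targeting x86 (other targets fall back to plain C++). It also uses a symbolic operand name, %[divisor], which GCC accepts as an alternative to counting positions like %3. Because edx is loaded with 0 rather than the sign of arg1, treat it as valid only for a non-negative arg1.

```cpp
// Hypothetical wrapper around the final idivl statement (not from the post).
struct DivResult {
    int quo;
    int rem;
};

// Assumption: arg1 >= 0 and arg2 != 0. edx is zeroed, not sign-extended.
DivResult divide_nonneg(int arg1, int arg2) {
    DivResult r;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    asm ("idivl %[divisor]"
      : "=a" (r.quo), "=d" (r.rem)
      : "a" (arg1), [divisor] "r" (arg2), "d" (0)
      : "cc");
#else
    // Fallback so the sketch still compiles on non-x86 targets.
    r.quo = arg1 / arg2;
    r.rem = arg1 % arg2;
#endif
    return r;
}
```

Calling divide_nonneg(17, 5) should give a quotient of 3 and a remainder of 2.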

Now, despite having described all this, I usually think using inline asm is a bad idea. It's cool, it's powerful, it gives interesting insight into how the gcc compiler works. But look at all the weird things you "just have to know" in order to work with this. And as you have noticed, if you get any of them wrong, weird things can happen.

It's true all these things are documented in gcc's docs. The simple constraints (like "r" and "g") are doc'ed here. The specific register constraints for the x86 are in the 'x86 family' here. And the detailed description of all the asm features is here. So if you must use this stuff (for example if you are supporting some existing code that uses this), the information is out there.

But there's a much shorter read here that gives you a whole list of reasons not to use inline asm. That's the read I'd recommend. Stick with C, and let the compiler handle all that register junk for you.
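As a sketch of what sticking with the language looks like for the program in the question, here are the same five results with no asm at all. The Results struct and the arithmetic function are illustrative names, not something from the original code.

```cpp
// Plain C++ equivalent of the four asm statements; names are illustrative.
struct Results {
    int add;
    int sub;
    int mul;
    int quo;
    int rem;
};

// Assumption: arg2 != 0 (integer division by zero is undefined behaviour).
Results arithmetic(int arg1, int arg2) {
    Results r;
    r.add = arg1 + arg2;
    r.sub = arg1 - arg2;
    r.mul = arg1 * arg2;
    r.quo = arg1 / arg2;  // truncates toward zero
    r.rem = arg1 % arg2;  // takes the sign of arg1
    return r;
}
```

The compiler picks the registers, handles negative operands correctly, and is free to optimize around the code, none of which is true of the asm versions.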

PS: While I'm at it, the first three statements could be written like this:

asm ( "addl %2, %0" : "=r" (add) : "0" (arg1) , "r" (arg2) : "cc");
asm ( "subl %2, %0" : "=r" (sub) : "0" (arg1) , "r" (arg2) : "cc");
asm ( "imull %2, %0" : "=r" (mul) : "0" (arg1) , "r" (arg2) : "cc");

Check out the gcc docs to see what it means to use a digit in an input operand.
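In short, a digit such as "0" is a matching constraint: it asks gcc to put that input in the same place as operand 0, so the instruction can update it in place and gcc gets to choose the register. A minimal sketch of the three statements wrapped in functions, assuming GCC or Clang on x86 (the function names are invented for this example, and other targets fall back to plain C++):

```cpp
// Hypothetical wrappers around the three statements above.
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define SKETCH_HAS_X86_ASM 1
#else
#define SKETCH_HAS_X86_ASM 0
#endif

int add_asm(int arg1, int arg2) {
    int result;
#if SKETCH_HAS_X86_ASM
    // "0" ties arg1 to the same register as output operand %0.
    asm ("addl %2, %0" : "=r" (result) : "0" (arg1), "r" (arg2) : "cc");
#else
    result = arg1 + arg2;
#endif
    return result;
}

int sub_asm(int arg1, int arg2) {
    int result;
#if SKETCH_HAS_X86_ASM
    // AT&T operand order: %0 = %0 - %2, i.e. arg1 - arg2.
    asm ("subl %2, %0" : "=r" (result) : "0" (arg1), "r" (arg2) : "cc");
#else
    result = arg1 - arg2;
#endif
    return result;
}

int mul_asm(int arg1, int arg2) {
    int result;
#if SKETCH_HAS_X86_ASM
    asm ("imull %2, %0" : "=r" (result) : "0" (arg1), "r" (arg2) : "cc");
#else
    result = arg1 * arg2;
#endif
    return result;
}
```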



Answer 2:

David Wohlferd has given a very good answer on how to better work with GCC extended assembly templates to do the work of your existing code.

A question may arise as to why the code presented fails with Codeblocks 16.01 w/GCC whereas it may have worked previously. As it stands the code looks pretty simple, so what could have possibly gone wrong?

The best thing I can recommend is learning to use the debugger and set breakpoints in Codeblocks. It is very simple (but beyond the scope of this answer). You can learn more about debugging in the Codeblocks documentation.

If you used the debugger with Codeblocks 16.01 and a stock C++ console project, you may have discovered that the program is giving you an Arithmetic Exception on the IDIV instruction in the assembly template. This is what appears in my console output:

Program received signal SIGFPE, Arithmetic exception.


These lines of code do as you would expect:

asm ( "addl %%ebx, %%eax;" : "=a" (add) : "a" (arg1) , "b" (arg2) );
asm ( "subl %%ebx, %%eax;" : "=a" (sub) : "a" (arg1) , "b" (arg2) );
asm ( "imull %%ebx, %%eax;" : "=a" (mul) : "a" (arg1) , "b" (arg2) );

This is where we have issues:

asm ( "movl $0x0, %%edx;"
      "movl %2, %%eax;"
      "movl %3, %%ebx;"
      "idivl %%ebx;" : "=a" (quo), "=d" (rem) : "g" (arg1), "g" (arg2) );

One thing Codeblocks can do for you is show you the assembly code it generated. Pull down the Debug menu and select Debugging Windows > Disassembly. I highly recommend the Watches and CPU Registers windows as well.

If you review the generated code with CodeBlocks 16.01 w/GCC you might discover it produced this:

/* Automatically produced by the assembly template for input constraints */
mov    -0x20(%ebp),%eax      /* EAX = value of arg1 */
mov    -0x24(%ebp),%edx      /* EDX = value of arg2 */

/* Our assembly template instructions */
mov    $0x0,%edx             /* EDX = 0 - we just clobbered the previous EDX! */
mov    %eax,%eax             /* EAX remains the same */
mov    %edx,%ebx             /* EBX = EDX = 0. */
idiv   %ebx                  /* EBX is 0 so this is division by zero!! */

/* Automatically produced by the assembly template for output constraints */
mov    %eax,-0x18(%ebp)      /* Value at quo = EAX */
mov    %edx,-0x1c(%ebp)      /* Value at rem = EDX */

I have commented the code and it should be obvious why this code won't work. We effectively ended up placing zero in EBX and then attempted to use that as a divisor with IDIV and that produced an arithmetic exception (division by zero in this case).
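To make the failure concrete, the register traffic in that listing can be traced by hand. The sketch below is only a model: it treats the three registers as plain variables (the Regs struct and trace_faulty_block are invented for illustration) and replays the moves up to, but not including, the idiv.

```cpp
// Toy model of the generated code above; nothing here touches real registers.
struct Regs {
    int eax;
    int ebx;
    int edx;
};

Regs trace_faulty_block(int arg1, int arg2) {
    Regs r = {0, 0, 0};
    r.eax = arg1;   // mov -0x20(%ebp),%eax : gcc loads arg1 for operand %2
    r.edx = arg2;   // mov -0x24(%ebp),%edx : gcc loads arg2 for operand %3
    r.edx = 0;      // movl $0x0, %%edx     : arg2 is lost here
    // movl %2, %%eax : %2 is already eax, so nothing changes
    r.ebx = r.edx;  // movl %3, %%ebx       : %3 is edx, which now holds 0
    return r;       // idivl %%ebx would now divide by r.ebx
}
```

Whatever arg2 was, the model ends with ebx equal to 0, which is the divisor idiv then sees.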

This happened because GCC will (by default) assume that all the input operands are used (consumed) BEFORE the output operands are written to. We never told GCC that it couldn't use the same registers for the input operands as for the output operands. GCC considers this situation an Early Clobber. It provides a mechanism to mark an output constraint as early clobber using the & (ampersand) modifier:

`&'

Means (in a particular alternative) that this operand is an earlyclobber operand, which is modified before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is used as an input operand or as part of any memory address.

To deal with the early clobbers, we can place & on both of the output constraints like this:

"idivl %%ebx;" : "=&a" (quo), "=&d" (rem) : "g" (arg1), "g" (arg2) );

In this case arg1 and arg2 will not be passed in through any of the operands marked with &. This means this code will avoid using EAX and EDX for the input operands arg1 and arg2.

The other issue is that EBX is modified by your code but you don't tell GCC. You could simply add EBX to the clobber list in the assembly template like this:

"idivl %%ebx;" : "=&a" (quo), "=&d" (rem) : "g" (arg1), "g" (arg2) : "ebx");

So this code should work, but is not efficient:

asm ( "movl $0x0, %%edx;"
      "movl %2, %%eax;"
      "movl %3, %%ebx;"
      "idivl %%ebx;" : "=&a" (quo), "=&d" (rem) : "g" (arg1), "g" (arg2) : "ebx");

The generated code will now look something like:

/* Automatically produced by the assembler template for input constraints */
mov    -0x30(%ebp),%ecx      /* ECX = value of arg1 */
mov    -0x34(%ebp),%esi      /* ESI = value of arg2 */

/* Our assembly template instructions */
mov    $0x0,%edx             /* EDX = 0 */
mov    %ecx,%eax             /* EAX = ECX = arg1 */
mov    %esi,%ebx             /* EBX = ESI = arg2 */
idiv   %ebx

/* Automatically produced by the assembler template for output constraints */
mov    %eax,-0x28(%ebp)      /* Value at quo = EAX */
mov    %edx,-0x2c(%ebp)      /* Value at rem = EDX */

This time the input operands for arg1 and arg2 didn't share the same registers that would conflict with the MOV instructions inside our inline assembly template.
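As a quick check, the corrected statement can be wrapped in a function and exercised with a few values. This is only a sketch: divide_fixed is a made-up name, GCC or Clang on x86 is assumed (other targets fall back to plain C++), and since EDX is still zeroed rather than sign-extended it only gives meaningful results for a non-negative arg1.

```cpp
// Hypothetical test wrapper around the corrected (but inefficient) statement.
// Assumption: arg1 >= 0 and arg2 != 0.
void divide_fixed(int arg1, int arg2, int &quo, int &rem) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    asm ( "movl $0x0, %%edx;"
          "movl %2, %%eax;"
          "movl %3, %%ebx;"
          "idivl %%ebx;"
          : "=&a" (quo), "=&d" (rem)   // & keeps the inputs out of EAX and EDX
          : "g" (arg1), "g" (arg2)
          : "ebx");                    // EBX is overwritten by the template
#else
    // Fallback so the sketch still compiles on non-x86 targets.
    quo = arg1 / arg2;
    rem = arg1 % arg2;
#endif
}
```

With arg1 = 17 and arg2 = 5 this should leave 3 in quo and 2 in rem.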


Why do other (including older) versions of GCC work?

If GCC had generated instructions using registers other than EAX, EDX, and EBX for the arg1 and arg2 operands then it would have worked. But the fact that it may have worked was just luck. To see what happened with the older Codeblocks and the GCC that came with it, I'd recommend reviewing the code generated in that environment the same way I have discussed above.

Early clobbering, and register clobbering in general, is one reason extended assembler templates can be tricky, and one reason they should be used sparingly, especially if you don't have a solid understanding of them.

You can create code that appears to work, but is coded incorrectly. A different version of GCC or even different optimization levels may alter the behaviour of the code. Sometimes these bugs can be so subtle that as a program grows the bug manifests itself in other ways that may be hard to trace.

Another rule of thumb is that not all code you find on the internet is bug free, and the subtle complexities of extended inline assembly are often overlooked in tutorials. I discovered the code you used seems to be based on this Code Project. Unfortunately the author didn't have a thorough understanding of the intricacies involved. The code may have worked at the time, but not necessarily now.