I'm facing a weird situation where a batch file I wrote reports an incorrect exit status. Here is a minimal sample that reproduces the problem:
bug.cmd
echo before
if "" == "" (
echo first if
exit /b 1
if "" == "" (
echo second if
)
)
echo after
If I run this script (using Python but the problem actually occurs when launched in other ways too), here is what I get:
python -c "from subprocess import Popen as po; print 'exit status: %d' % po(['bug.cmd']).wait()"
echo before
before
if "" == "" (
echo first if
exit /b 1
if "" == "" (echo second if )
)
first if
exit status: 0
Note how the exit status is reported as 0 even though exit /b 1 should make it 1.
Now the weird thing is that if I remove the inner if clause (which should not matter, because nothing after exit /b 1 should be executed anyway) and try to launch it:
ok.cmd
echo before
if "" == "" (
echo first if
exit /b 1
)
echo after
I launch it again:
python -c "from subprocess import Popen as po; print 'exit status: %d' % po(['ok.cmd']).wait()"
echo before
before
(environment) F:\pf\mm_3.0.1\RendezVous\Services\Matchmaking>if "" == "" (
echo first if
exit /b 1
)
first if
exit status: 1
Now the exit status is correctly reported as 1.
I'm at a loss to understand what is causing this. Is it illegal to nest if statements?
How can I signal my script's exit status correctly and reliably in batch?
Note: calling exit 1 (without the /b) is not an option, as it kills the whole interpreter and prevents local script usage.
Wow! that is freaky!
I am able to reproduce the apparent bug from the command line console by running the following (note I use /Q to turn ECHO OFF so the output is simpler):
I get the same behavior if I rename the script to "bug.bat".
I also get the expected return code of 1 if I remove the 2nd IF.
I agree, this seems to be a bug. Logically, I see no reason for the two similar scripts to yield different results.
I don't have a full explanation, but I believe I understand an important component to the behavior: The batch ERRORLEVEL and the exit code do not refer to the same thing! Below is the documentation for the EXIT command. The important bit is the description of the exitCode parameter.
I think the average person (including myself) does not typically distinguish between the two. But CMD.EXE seems to be very finicky as to when the batch ERRORLEVEL is returned as the exit code.
It is easy to show that the batch script is returning the correct ERRORLEVEL, yet the ERRORLEVEL is not being returned as the CMD exit code. I display the ERRORLEVEL twice to demonstrate that the act of displaying it is not clearing the ERRORLEVEL.
As others have pointed out, using CALL does cause the ERRORLEVEL to be returned as the exit code:
But that doesn't work if another command is executed after the CALL:
Note that the above behavior is strictly a function of CMD.EXE, having nothing to do with the script, as evidenced by:
You could explicitly EXIT with the ERRORLEVEL at the end of the command chain:
Here is the same thing without delayed expansion:
Perhaps the simplest/safest workaround is to change your batch script to EXIT 1 instead of EXIT /B 1. But that may not be practical, or desirable, depending on how others may use the script.
EDIT
I've reconsidered, and now think it is most likely an unfortunate design "feature" rather than a bug. The IF statements are a bit of a red herring. If a command is parsed after EXIT /B, within the same command block, then the problem manifests, even though the subsequent command never executes.
test.bat
Here are some test runs showing that the behavior is the same:
It doesn't matter what the 2nd command is. The following script shows the same behavior:
The rule is that if the subsequent command would execute if the EXIT /B were something that didn't exit, then the problem manifests itself.
For example, this has the problem:
But the following works fine without any problem.
And so does this:
@dbenham's answers are good. I am not trying to suggest otherwise. But, I have found it reliable to use a variable for the return code and a common exit point. Yes, it takes a few extra lines, but also allows additional cleanup that, if necessary, would have to be added to every exit point.
As @dbenham notes, "[i]f a command is parsed after EXIT /B, within the same command block, then the problem manifests, even though the subsequent command never executes". In this particular case the body of the IF statement is basically evaluated as:
where the & operator is the function cmd!eComSep (i.e. command separator). The EXIT /B 1 command (function cmd!eExit) is evaluated by setting the global variable cmd!LastRetCode to 1 and then basically executing GOTO :EOF. When it returns, the second eComSep sees that cmd!GotoFlag is set and so skips evaluating the right-hand side. In this case, it also ignores the return code of the left-hand side and instead returns SUCCESS (0). This gets passed up the stack to become the process exit code.
Below I've included the debug sessions for running bug.cmd and ok.cmd.
bug.cmd:
ok.cmd:
In the ok.cmd case, cmd!eComSep only appears once in the stack trace. The exit /b 1 command is evaluated as the right-hand side operand, so the code that looks at GotoFlag never runs. Instead the return code of 1 gets passed up the stack to become the process exit code.
I'm going to try to join the answers from dbenham (who checked the cases from batch code) and eryksum (who went directly to the code). Maybe in doing so I can understand it.
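The eComSep behaviour described in the previous answer can be mimicked with a small Python model. This is only a sketch of the description, not cmd.exe's actual code: the class and function names are made up for illustration, the inner IF is reduced to a single echo, the & chain in bug.cmd is assumed to nest to the right, and eExit is assumed to return the code it was given (inferred from the statement that the return code of 1 gets passed up the stack in the ok.cmd case).

```python
SUCCESS = 0  # assumed numeric value of cmd's SUCCESS return code


class CmdModel:
    """Toy stand-in for the interpreter state named in the answer."""

    def __init__(self):
        self.last_ret_code = 0   # models cmd!LastRetCode (ERRORLEVEL)
        self.goto_flag = False   # models cmd!GotoFlag
        self.output = []         # lines that ECHO would have printed

    def echo(self, text):
        """Build a command node that records text and succeeds."""
        def run():
            self.output.append(text)
            return SUCCESS
        return run

    def exit_b(self, code):
        """Build a node modelling EXIT /B: set LastRetCode, then GOTO :EOF."""
        def run():
            self.last_ret_code = code
            self.goto_flag = True
            return code  # assumption: eExit hands the code back to its caller
        return run

    def com_sep(self, lhs, rhs):
        """Build a node modelling eComSep (the & operator)."""
        def run():
            lhs()
            if self.goto_flag:
                # The left side jumped to :EOF: skip the right side and
                # report SUCCESS, discarding the left side's return code.
                return SUCCESS
            return rhs()
        return run


def run_bug():
    """Body of the outer IF in bug.cmd: echo & (exit /b 1 & inner IF)."""
    m = CmdModel()
    body = m.com_sep(m.echo("first if"),
                     m.com_sep(m.exit_b(1), m.echo("second if")))
    return body(), m.last_ret_code, m.output


def run_ok():
    """Body of the IF in ok.cmd: echo & exit /b 1."""
    m = CmdModel()
    body = m.com_sep(m.echo("first if"), m.exit_b(1))
    return body(), m.last_ret_code, m.output
```

In this model run_bug() returns (0, 1, ['first if']) and run_ok() returns (1, 1, ['first if']): the ERRORLEVEL is 1 in both cases, but only ok.cmd passes 1 up the stack, which matches the exit statuses reported in the question.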
Let's start with a bug.cmd
From the eryksum answer and tests we know this code will set the errorlevel variable to 1, but the general result of the command is not a failure, as the inner functions inside cmd will process the concatenation operator as a function call that will return (meaning a C function returning a value) the result of the right command. This can be tested as
Yes, errorlevel is 1, but conditional execution will run the code after the && as the previous command (eComSep) returned SUCCESS.
Now, executed in a separate cmd instance
Here the same process that makes the conditional execution "fail" in the previous case propagates the errorlevel 0 out of the new cmd instance.
But why does the call case work?
It works because cmd is coded something like (rough assembler to C)
That is, the call command is handled in function eCall, which calls CallWork to delegate the context generation and execution to BatProc. BatProc returns the resulting value from execution of the code. We know from the previous tests that this value is 0 (but errorlevel / LastRetCode is 1). This value is tested inside CallWork (the ternary ? operator): if the BatProc return value is not 0, return that value, else return LastRetCode, which in this case is 1. Then this value is used inside eCall as the return value AND stored inside LastRetCode (the = in the return command is an assignment), so it is returned in errorlevel.
If I didn't miss something, the rest of the cases are just variations on the same behaviour.
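The eCall / CallWork / BatProc chain described above can be written out as a short Python sketch. It only restates the prose: the function names mirror the ones in the answer, but the signatures and the CmdState class are invented for illustration, and bat_proc_bug is a hypothetical stand-in for running bug.cmd (it leaves LastRetCode at 1 while returning 0, as the earlier tests indicate).

```python
class CmdState:
    """Holds the one piece of global state this sketch needs."""

    def __init__(self):
        self.last_ret_code = 0  # models LastRetCode, i.e. errorlevel


def bat_proc_bug(state):
    """Hypothetical stand-in for BatProc running bug.cmd."""
    state.last_ret_code = 1  # EXIT /B 1 stored the errorlevel...
    return 0                 # ...but the block itself reported SUCCESS


def call_work(state, bat_proc):
    """Models CallWork: prefer BatProc's result, else fall back to LastRetCode."""
    ret = bat_proc(state)
    return ret if ret != 0 else state.last_ret_code


def e_call(state, bat_proc):
    """Models eCall: the result is both stored in LastRetCode and returned."""
    state.last_ret_code = call_work(state, bat_proc)
    return state.last_ret_code
```

With these definitions, e_call(CmdState(), bat_proc_bug) returns 1 even though bat_proc_bug itself returned 0, which is the fallback that makes the CALL case report the expected value.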
The following is working ok invoking the bat with CALL:
bug.bat:
test.bat:
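Separately from the batch files above, the Python launcher from the question could apply the same CALL idea. The sketch below assumes that the CALL form mentioned in the answers is cmd /c call followed by the script name, and that it is run on Windows with cmd on the PATH. The helper names build_call_argv and run_batch are made up for this example, and script paths containing spaces or special characters may need extra quoting that is not handled here.

```python
import subprocess


def build_call_argv(script):
    """Return an argv that runs `script` through CALL in a fresh cmd instance."""
    # Assumption: "cmd /c call <script>" is the CALL form the answers refer to.
    return ["cmd", "/c", "call", script]


def run_batch(script):
    """Run a batch script (Windows only) and return the exit status of cmd."""
    return subprocess.call(build_call_argv(script))
```

On Windows, print('exit status: %d' % run_batch('bug.cmd')) would replace the one-liner from the question. According to the answers above, going through CALL should make the ERRORLEVEL set by EXIT /B come back as the exit status, provided no other command runs after the CALL in the same command line.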