x86 assembly has instruction suffixes such as l (long), w (word), and b (byte).
So I assumed jmpl would be a long jmp.
But it behaved quite strangely when I assembled it.
See the examples below.
Test1: assembly
main:
jmp main
Test1: compile result
eb fe jmp 0x0804839b <main>
Test2 : assembly
main:
jmpl main # added l suffix
Test2 : Compile result
ff 25 9b 83 04 08 jmp *0x0804839b
Compared to Test1, the Test2 result is unexpected.
I think it should assemble to the same thing as Test1.
Question:
Is jmpl a different instruction in the 8086 design?
(According to here, jmpl in SPARC means jump and link. Is it something like that?)
...Or is this just a bug in the GNU assembler?
An l operand-size suffix implies an indirect jmp, unlike calll main, which is still a relative near call. This inconsistency is pure insanity in AT&T syntax design.
(And since you're using it with an operand like main, it becomes a memory-indirect jump, doing a data load from main and using that as the new EIP value.)
You never need to use the jmpl mnemonic; you can and should indicate indirect jumps using * on the operand. Like jmp *%eax to set EIP = EAX, or jmp *4(%edi, %ecx, 4) to index a jump table, or jmp *func_pointer. Using jmpl is optional in all of these.
(You could use jmpw *%ax to truncate EIP to a 16-bit value. That assembles to 66 ff e0 jmpw *%ax.)
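As a rough illustration of the explicit * forms, here is a minimal sketch of a jump-table dispatch and a register-indirect jump. The names dispatch, via_reg, table, case0 and case1 are hypothetical and not from the question; it assumes GAS, 32-bit non-PIE code, and a caller that passes an index of 0 or 1 in %ecx.

```asm
# Sketch only: dispatch, via_reg, table, case0 and case1 are made-up names.
# Assumes 32-bit mode (e.g. gcc -m32 -c) and an index of 0 or 1 in %ecx.
    .text
dispatch:
    jmp   *table(,%ecx,4)   # load a 4-byte address from table + ecx*4, jump to it
via_reg:
    movl  $case0, %eax      # put an absolute code address in a register
    jmp   *%eax             # register-indirect jump: EIP = EAX
case0:
    movl  $0, %eax
    ret
case1:
    movl  $1, %eax
    ret

    .section .rodata
table:
    .long case0             # absolute addresses of the jump targets
    .long case1
```

None of these need an l suffix: the * on the operand is what marks the jump as indirect.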
Compare "What is callq instruction?" and "What is the difference between retq and ret?": there, the operand-size suffix just behaves the way you expected it would, same as plain call or plain ret. But jmp is different.
Semi-related: a far jmp or call (to a new CS:[ER]IP) in AT&T syntax is ljmp / lcall. These are very different.
It's also insane that GAS accepts jmpl main as equivalent to jmpl *main. It only warns instead of erroring.
$ gcc -no-pie -fno-pie -m32 jmp.s
jmp.s: Assembler messages:
jmp.s:3: Warning: indirect jmp without `*'
And then disassembling it to see what we got, with objdump -drwC a.out:
08049156 <main>: # corresponding source line (added by hand)
8049156: ff 25 56 91 04 08 jmp *0x8049156 # jmpl main
804915c: ff 25 56 91 04 08 jmp *0x8049156 # jmp *main
8049162: ff 25 56 91 04 08 jmp *0x8049156 # jmpl *main
08049168 <foo>:
8049168: e8 fb ff ff ff call 8049168 <foo> # calll foo
804916d: ff 15 68 91 04 08 call *0x8049168 # calll *foo
8049173: ff 15 68 91 04 08 call *0x8049168 # call *foo
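For reference, a source file along these lines would produce that disassembly. This is a reconstruction from the hand-added comments, not necessarily the exact jmp.s used; the .globl directive and the layout are assumptions, chosen so that jmpl main lands on line 3 as in the warning.

```asm
.globl main
main:
    jmpl  main      # line 3: GAS warns about the missing *, emits ff 25 <abs32 of main>
    jmp   *main     # explicit memory-indirect jump, same encoding
    jmpl  *main     # same again; the l suffix adds nothing
foo:
    calll foo       # direct relative near call: e8 <rel32>, rel32 = foo - (foo + 5) = -5
    calll *foo      # memory-indirect call: ff 15 <abs32 of foo>
    call  *foo      # same encoding as the line above
```

The rel32 of -5 is the fb ff ff ff in the listing (little-endian 0xfffffffb), since the displacement is counted from the end of the 5-byte call.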
We get the same thing if we replace l with q in the source and build without -m32 (using the default -m64), including the same warning about a missing *. But the disassembly has an explicit jmpq and callq on every instruction. (Except for a relative direct jmp I added, which uses the jmp mnemonic in the disassembly.)
It's like objdump thinks 32-bit is the default operand-size for jmp/call in both 32 and 64-bit mode, so it wants to always use a q
suffix in 64-bit, but leaves it implicit in 32-bit mode. Anyway, that's just disassembly choice between implicit / explicit size suffixes, no weirdness for a programmer writing source code.
Other AT&T-syntax assemblers:
Clang's built-in assembler does reject jmpl main, requiring jmpl *main.
$ clang -m32 jmp.s
jmp.s:3:8: error: invalid operand for instruction
jmpl main
^~~~
calll main is the same as call main. call *main and calll *main are both accepted for indirect calls.
YASM's GAS-syntax mode assembles jmpl main to a near relative jmp, like jmp main! So it disagrees with gcc/clang about jmpl implying indirect. (Very few people use YASM in GAS mode, and these days its maintenance hasn't kept up with NASM for new instructions like AVX512. I like YASM's good defaults for long NOPs, but otherwise I'd recommend NASM.)
You have fallen victim to the awfulness that is AT&T syntax.
x86 assembly design has instruction suffix, such as l(long), w(word), b(byte).
No, it doesn't. The abomination that is AT&T syntax has this.
In the sane Intel syntax there are no such suffixes.
Is jmpl something different.
Yes, this is an indirect jump to an absolute address: a -near- jump to a -long- address. ljmp in GNU syntax, by contrast, is a -far- jump.
The default for a jump is a near jump, to a relative address.
Note that the Intel syntax for this jump is:
jmp dword [ds:0x0804839b] //note the [] specifying the indirectness.
//or, this is the same
jmp [0x0804839b]
//or
jmp [main]
//or
jmp DWORD PTR ds:0x804839b //the PTR makes it indirect.
I prefer the [], to highlight the indirectness.
It does not jump to 0x0804839b, but reads a dword from the specified address and then jumps to the address specified in this dword. In the Intel syntax the indirectness is explicit.
Of course you intended to jump to the absolute address 0x0804839b (aka main:) directly.
Hm, that cannot be done as a near jump: x86 has no near jmp that takes an absolute address as an immediate, and most assemblers do not allow absolute far jumps!
See also: How to code a far absolute JMP/CALL instruction in MASM?
A short/near relative jump is (almost) always better, because it will still be valid when your code moves; a jump to an absolute address can become invalid.
Also shorter instructions are usually better, because they occupy less space in the instruction cache. The assembler (in Intel mode) will automatically select the correct jmp encoding for you.
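A small sketch of that automatic selection, assuming GAS in 32-bit or 64-bit mode (as far as I know it makes the same choice in AT&T and Intel syntax). The labels nearby and far_away are made up for the illustration.

```asm
# Sketch only: nearby and far_away are made-up labels; assumes GAS, 32- or 64-bit mode.
    .text
nearby:
    jmp   far_away    # target more than 127 bytes ahead: needs the 5-byte e9 rel32 form
    jmp   nearby      # target a few bytes back: fits the 2-byte eb rel8 form
    .skip 200         # 200 bytes of padding to push far_away out of rel8 range
far_away:
    ret
```

Both lines are written as a plain jmp; only the distance to the target decides which encoding comes out.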
SPARC
This is a totally different processor than the x86. From a different manufacturer, using a different paradigm. Obviously the SPARC documentation bears no relation to the x86 docs.
The correct documentation for jmp is here:
https://www.felixcloutier.com/x86/jmp
Note that Intel does not specify different mnemonics for the short and near relative forms of the jmp. This is because Intel wants the assembler to always use the short (rel8) jump, unless the target is too far away, in which case the near (rel32) jump is used. Neither of these is what jmpl means in AT&T syntax, and neither is a far jump.
The beauty of this is that the assembler automatically uses the proper jump for you.
Forcing gnu back to sanity
You can use
.intel_syntax noprefix <<-- as the first line in your assembly
mov eax,[eax+100+ebx*2]
....
This makes GNU use Intel syntax. It puts things back the way Intel designed them and away from the PDP-11-style syntax used by GNU.
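As a minimal sketch of how the two jumps from the question look in this mode, assuming GAS and 32-bit non-PIE code (e.g. gcc -m32 -no-pie); the trailing .att_syntax line is only there in case AT&T-syntax code follows in the same file:

```asm
# Sketch only, assuming GAS in 32-bit mode.
    .intel_syntax noprefix
    .text
    .globl main
main:
    jmp   main                  # direct relative jump to main
    jmp   DWORD PTR [main]      # memory-indirect: load 4 bytes at main, jump there
    .att_syntax prefix          # switch back to AT&T syntax
```

Here the direct and indirect forms cannot be confused, because the indirect one has to spell out the memory operand.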