OK, so I am trying to assemble the following code using nasm -f elf final.asm:
xor eax,eax
push eax
push dword(0x75792273)
push dword(0x70742027)
push dword(0x77777875)
push dword(0x20237678)
push dword(0x76727827)
push dword(0x27797175)
push dword(0x75711225)
push dword(0x72747676)
push dword(0x74231476)
push dword(0x70707470)
push dword(0x23247077)
push dword(0x78707822)
push dword(0x24711174)
push dword(0x22707373)
push dword(0x78717974)
push dword(0x75771777)
push dword(0x70777125)
push dword(0x73731472)
push dword(0x71277377)
push dword(0x79251822)
push dword(0x79707478)
push dword(0x78742779)
push dword(0x72727871)
push dword(0x71251475)
push dword(0x27247772)
push dword(0x79757479)
push dword(0x70227071)
push dword(0x77737420)
push dword(0x70251970)
push dword(0x74747127)
push dword(0x23277677)
push dword(0x79712024)
push esp
pop esi
mov edi,esi
mov edx,edi
cld
mov ecx,0x80
mov ebx,0x41
xor eax,eax
push eax
lods byte[esi]
xor eax,ebx
stos byte[es:edi]
loop 0xb7
push esp
pop esi
int 0x3
This results in the following errors:
final.asm:44: error: parser: instruction expected
final.asm:46: error: parser: instruction expected
I found an explanation of these errors in the question "NASM: parser: instruction expected rep movs".
Basically, it says that NASM does not recognize the lods and stos instructions when they are written this way, which means I need to convert them into something NASM does recognize and that gives the same result.
My question is: what can I change these two lines to so that NASM can assemble the code and I can ultimately debug it?
What lodsb does is:
mov al,[esi]
inc esi ; (or dec, according to direction flag)
You could also use lodsw to load words (to ax, increasing esi by 2), or lodsd to load dwords (to eax, increasing esi by 4).
And stosb does:
mov [es:edi],al
inc edi
Same here: stosw and stosd will store 2 or 4 bytes (adjusting edi accordingly).
The first loads from the memory pointed to by the SOURCE (ESI) register; the latter writes to the memory pointed to by the DESTINATION (ES:EDI) register.
You don't need to (and cannot) specify which registers will be used. The source will always be ESI, and the destination always EDI.
Edit on segment registers:
The lods instruction can be used together with a segment override prefix (e.g. ss lodsb). The stos instruction is fixed to the es segment (a detail missing from the original answer) and can't be overridden.
The movsb/movsw/movsd instructions (size*(mov [es:edi],[ds:esi] inc esi inc edi)) can also be overridden on the source side, i.e. es movsb will fetch bytes from es:esi instead of ds:esi, but the destination segment register is fixed to es.
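To make the segment rules above concrete, here is a small NASM sketch. It is meant to be assembled (for example with nasm -f bin) and inspected with ndisasm -b 32 rather than executed, and the instruction selection is only an illustration, not part of the original answer.

```nasm
bits 32
        lodsb           ; AC      AL = [ds:esi], default source segment
        ss lodsb        ; 36 AC   AL = [ss:esi], source segment overridden
        movsb           ; A4      [es:edi] = [ds:esi]
        es movsb        ; 26 A4   [es:edi] = [es:esi], only the source changes
        stosb           ; AA      [es:edi] = AL, destination is always ES
```

The prefix byte only ever changes the segment used for the ESI side; there is no encoding that redirects the EDI side away from ES.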
Use lodsb / lodsw / lodsd / lodsq to indicate operand-size with the mnemonic itself, not with operands.
Remove the byte [esi] part; NASM won't accept explicit operands for string instructions.
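As a rough sketch of what that change looks like in practice, here is the question's decode loop rewritten with operand-less string instructions and wrapped in a minimal 32-bit Linux program. The names buf, len and decode, the sample data, and the write/exit system calls are assumptions added for illustration; in particular, the decode label stands in for the hard-coded loop 0xb7 target, on the assumption that the loop was meant to branch back to the load.

```nasm
; decode.asm (hypothetical file name)
; assemble: nasm -f elf decode.asm
; link:     ld -m elf_i386 -o decode decode.o
bits 32

section .data
buf:    db 0x09, 0x24, 0x2d, 0x2d, 0x2e  ; "Hello" XORed with 0x41 (sample data)
len     equ $ - buf

section .text
global _start
_start:
        mov     esi, buf        ; source pointer for lodsb
        mov     edi, esi        ; decode in place, as in the question
        cld                     ; DF = 0, so ESI/EDI count upward
        mov     ecx, len        ; number of bytes to process
        mov     ebx, 0x41       ; XOR key (only BL is used)
decode:
        lodsb                   ; AL = [ds:esi], then ESI += 1
        xor     al, bl          ; decode one byte
        stosb                   ; [es:edi] = AL, then EDI += 1
        loop    decode          ; ECX -= 1, repeat while ECX != 0

        mov     eax, 4          ; sys_write
        mov     ebx, 1          ; fd 1 = stdout
        mov     ecx, buf
        mov     edx, len
        int     0x80            ; writes the decoded bytes

        mov     eax, 1          ; sys_exit
        xor     ebx, ebx        ; status 0
        int     0x80
```

On a Linux system that can run 32-bit binaries, this should write the decoded bytes to stdout. For the original shellcode, only the lodsb, stosb and loop lines need to change; the rest of the listing can stay as it is.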
Intel's LODS documentation suggests that you can use operands as documentation and to imply an operand-size (and segment override), like you're trying to do, as an alternative to an operand-size suffix.
This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the source operand symbol must specify the correct type (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct location. The location is always specified by the DS:(E)SI registers, which must be loaded correctly before the load string instruction is executed.
Presumably the designers of NASM syntax decided that allowing lods byte [r15] to assemble was a bad idea, and disallowing the one-operand form entirely was easier than writing a bunch of code just to check that the given operand is what it's supposed to be.
Since NASM has a prefix syntax for segment/operand/address overrides, fs lodsb lets you write what would otherwise need an operand to attach a segment override to (like lodsb fs:[rsi] in MASM syntax).
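A short sketch of that prefix syntax in NASM, for 32-bit code. The choice of instructions is illustrative only, and the byte comments are given just for the single-prefix cases; the fragment is intended to be assembled and disassembled, not run.

```nasm
bits 32
        fs lodsb        ; 64 AC   segment override written as a prefix
        a16 lodsb       ; 67 AC   address-size override: reads from [ds:si]
        ss a16 lodsd    ; segment and address-size overrides combined
```

In each case the override is a separate token in front of an operand-less mnemonic, so no memory operand is needed just to carry it.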
Doing it this way makes the string instructions non-special as far as the assembler is concerned; they're just another entry in a table mapping mnemonics to opcodes. If Intel's own syntax included mnemonic prefixes for machine-code prefix bytes, they might have made the same design choice.
Fun fact: STOS's segment can't be overridden (from ES). Perhaps Intel wanted to share more transistors with the original 8086 implementation of MOVS, where a segment override only affects the [DS:SI] source, not the [ES:DI] destination.
Other assemblers:
GNU .intel_syntax supports segment prefix syntax, but not NASM's o16/o32/o64 or a16/a32/a64 operand and address-size specifiers.
# assembled with as --32 disassembled with ndisasm -b 32
.intel_syntax noprefix
mov al, byte ptr fs:[esi]
00000038 648A06 mov al,[fs:esi]
gs lodsb
0000003B 65AC gs lodsb
lods dword ptr ss:[ecx]
# Warning: `dword ptr ss:[ecx]' is not valid here (expected `[esi]')
0000003D 36AD ss lodsd
ss lodsd [si]
0000003F 3667AD ss a16 lodsd
lods eax, dword ptr ss:[esi]
00000042 36AD ss lodsd
#lods al # Error: operand type mismatch for `lods'
#fs es lodsd # Error: same type of prefix used twice
#a16 lodsb # Error: no such instruction: `a16 lodsb'
I don't see a way to write an address-size override without using an explicit operand for string instructions in GNU syntax (AT&T or Intel).
objdump -Mintel output of the same:
4: 64 8a 06 mov al,BYTE PTR fs:[esi]
7: 65 ac lods al,BYTE PTR gs:[esi]
9: 36 ad lods eax,DWORD PTR ss:[esi]
b: 36 67 ad lods eax,DWORD PTR ss:[si]
e: 36 ad lods eax,DWORD PTR ss:[esi]