I am going through some example assembly code for 16-bit real mode.
I've come across the lines:
mov bx, cs
mov ds, bx
mov si, OFFSET value1
pop es
mov di, OFFSET value2
What is this code doing, and what does the 'OFFSET' keyword do here?
As some of the other answers say, the offset keyword refers to the offset from the base of the segment in which the symbol is defined. Note, however, that segments may overlap, so the same memory location can have a different offset in each segment that covers it. For instance, suppose you have the following segment in real mode:
data SEGMENT USE16 ;# at segment 0200h, linear address 2000h
org 0100h
foo db 0
org 01100h
bar db 0
data ENDS
The assembler sees that foo is at offset 0100h from the base of the data segment, so wherever it sees offset foo it will put the value 0100h, regardless of the value of DS at the time.
For example, compare what happens when DS holds the base of the data segment with what happens when we change DS to something other than the base the assembler is assuming:
mov ax, 200h ; in some assemblers you can use @data for the seg base
mov ds, ax
mov bx, offset foo ; bx = 0100h
mov byte ptr [bx], 10 ; foo = 10
mov ax, 300h
mov ds, ax
mov bx, offset foo ; bx = 0100h
mov byte ptr [bx], 10 ; bar = 10, not foo, because DS doesn't match what we told the assembler
In the second example DS is 0300h, so the base of the segment pointed to by DS is 03000h. This means that ds:[offset foo] points to the address 03000h + 0100h, which is the same as 02000h + 01100h, which points to bar.
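The aliasing described above can be checked with a few lines of Python. This is only a sketch of the address arithmetic, not an emulator: the `linear` helper and the 1 MiB `memory` array are assumptions made for illustration, and the two offsets are simply the values the assembler would substitute for offset foo and offset bar.

```python
MEMORY_SIZE = 1 << 20            # 1 MiB, the real-mode address space
memory = bytearray(MEMORY_SIZE)  # hypothetical stand-in for RAM


def linear(segment, offset):
    """Real-mode address: segment value times 16, plus the offset."""
    return (segment << 4) + offset


DATA_SEG = 0x0200     # base the assembler assumes for the data segment
OFFSET_FOO = 0x0100   # value substituted for "offset foo"
OFFSET_BAR = 0x1100   # value substituted for "offset bar"

# First store: DS = 0200h, so [offset foo] really is foo.
memory[linear(0x0200, OFFSET_FOO)] = 10

# Second store: DS = 0300h, same offset, different segment base.
memory[linear(0x0300, OFFSET_FOO)] = 20

foo_value = memory[linear(DATA_SEG, OFFSET_FOO)]
bar_value = memory[linear(DATA_SEG, OFFSET_BAR)]
print(foo_value, bar_value)
```

The second store uses 20 instead of 10 only so the two writes can be told apart: foo keeps the first value and bar receives the second, because 0300h:0100h and 0200h:1100h are the same linear address.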
It just means the address of that symbol. It's a bit like the & operator in C, if you are familiar with that.
offset means that the si register will be equal to the offset of the variable value1 (not to its actual value). The offset is the variable's address measured from the beginning of the memory segment where the variable is stored. The offset is usually used relative to the ds segment (in your case the ds and cs registers are pointing to the same segment).
From MASM Programmer's Guide 6.1 (Microsoft Macro Assembler)
The OFFSET Operator
An address constant is a special type of immediate operand that consists of an offset or segment value. The OFFSET operator returns the offset of a memory location, as shown here:
mov bx, OFFSET var ; Load offset address
For information on differences between MASM 5.1 behavior and MASM 6.1 behavior related to OFFSET, see Appendix A.
Since data in different modules may belong to a single segment, the assembler cannot know for each module the true offsets within a segment. Thus, the offset for var, although an immediate value, is not determined until link time.
If you read carefully, the final value is only determined when you link your object code to create an EXE or DLL. Prior to linking, all you have is a provisional immediate value: the offset as the assembler sees it within this module, which the linker adjusts once it knows where the module's data lands in the segment.
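To make the link-time point concrete, here is a toy model of several modules contributing data to one shared segment. It is a sketch under stated assumptions, not a description of how any particular linker works: the module names, the symbols `var_a` and `buf_a`, and the `link_offsets` function are all hypothetical, and alignment between contributions is ignored.

```python
# Each hypothetical module lists (symbol, size in bytes) for the data
# it contributes to one shared segment, in the order it defines them.
modules = {
    "a.obj": [("var_a", 2), ("buf_a", 14)],
    "b.obj": [("var", 2)],
}


def link_offsets(modules):
    """Concatenate the contributions and return each symbol's final offset."""
    final = {}
    base = 0  # where the current module's data starts in the segment
    for symbols in modules.values():
        local = 0  # offset as seen while assembling this module alone
        for symbol, size in symbols:
            final[symbol] = base + local
            local += size
        base += local
    return final


offsets = link_offsets(modules)
for symbol, offset in offsets.items():
    print(f"{symbol}: {offset:04X}h")
```

While assembling b.obj on its own, var sits at local offset 0. Only after the contributions are combined does it get its final offset, which in this model is 0010h because a.obj's 16 bytes come first.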
In x86 16-bit mode the address space is not flat; instead, addresses are composed of a "segment" and an offset. The segment selects a 64K window of memory, and the offset is a position within that window.
See http://en.wikipedia.org/wiki/Memory_segmentation
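As a rough illustration of the segment-plus-offset scheme in real mode, the sketch below computes the 64K window a segment value selects and the address a segment:offset pair refers to. The function names `segment_window` and `physical` and the sample values are made up for this example.

```python
def segment_window(segment):
    """Return the first and last address covered by a real-mode segment."""
    base = segment << 4          # segment value times 16
    return base, base + 0xFFFF   # 64K window starting at the base


def physical(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into one address."""
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits")
    return (segment << 4) + offset


first, last = segment_window(0x1234)
address = physical(0x1234, 0x0010)
print(f"window {first:05X}h..{last:05X}h, address {address:05X}h")
```

Because consecutive segment values start only 16 bytes apart, their windows overlap almost entirely, which is why one memory location can be reached through many different segment:offset pairs.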
Offset is basically the distance from the start of the segment (also called the datum point).
In real mode the physical address is found by multiplying the segment value by 10h (16) and adding the offset. For example, if the segment address is 0000h and the offset (or logical address) is 0100h:
Physical address = 0000h * 10h + 0100h = 00100h
This means that our required location is at address 00100h.
Similarly, if the segment address is 1DDDh and the offset is 0100h:
Physical address = 1DDDh * 10h + 0100h = 1DED0h
This means that our destination is 1DED0h.