I am trying to trace a small program using the ptrace API, but the tracer produces incorrect results every time it runs. This is the disassembly of the short program that I want to trace:
$ objdump -d -M intel inc_reg16
inc_reg16: file format elf32-i386
Disassembly of section .text:
08048060 <.text>:
8048060: b8 00 00 00 00 mov eax,0x0
8048065: 66 40 inc ax
8048067: 75 fc jne 0x8048065
8048069: 89 c3 mov ebx,eax
804806b: b8 01 00 00 00 mov eax,0x1
8048070: cd 80 int 0x80
And here is the code of the tracer itself:
// ezptrace.c
#include <sys/user.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
int main() {
    pid_t child;
    child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execv("inc_reg16", NULL);
    }
    else {
        int status;
        wait(&status);
        struct user_regs_struct regs;
        while (1) {
            ptrace(PTRACE_GETREGS, child, NULL, &regs);
            printf("eip: %x\n", (unsigned int) regs.eip);
            ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
            waitpid(child, &status, 0);
            if(WIFEXITED(status)) break;
        }
        printf("end\n");
    }
    return 0;
}
The tracer's job is to single-step the inc_reg16 program and log the address of each instruction it encounters. When I count how many times the instruction 'inc ax' was encountered, the number turns out to be different on each run:
$ gcc ezptrace.c -Wall -o ezptrace
$ ./ezptrace > inc_reg16.log
$ grep '8048065' inc_reg16.log | wc -l
65498
A second run:
$ ./ezptrace > inc_reg16.log
$ grep '8048065' inc_reg16.log | wc -l
65494
The problem is that both results should be 65536, since the instruction 'inc ax' is executed exactly 65536 times. So the question is: is there a mistake in my code, or is this a bug in ptrace? Your help is greatly appreciated.
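For reference, the expected count follows from 16-bit wraparound: ax starts at 0, and jne keeps looping until inc ax wraps it back to 0. A minimal C sketch of that arithmetic (the function name count_inc_ax is only for illustration; it models the loop and does not trace anything):

```c
#include <stdint.h>

/* Models the loop at 0x8048065..0x8048067: `inc ax` followed by `jne`.
 * Hypothetical helper for illustration; it does not use ptrace. */
unsigned long count_inc_ax(void)
{
    uint16_t ax = 0;              /* mov eax,0x0 leaves ax == 0 */
    unsigned long count = 0;

    do {
        ax = (uint16_t)(ax + 1);  /* inc ax: wraps from 0xFFFF to 0 */
        count++;
    } while (ax != 0);            /* jne: loop while the result is non-zero */

    return count;
}
```

The loop body runs once for each of the 2^16 values ax can take before it returns to zero, which is where the 65536 comes from.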
I tried the same program under both VirtualBox and VMware. Only VMware gives the correct result, whereas VirtualBox shows the same problem you describe. I used VirtualBox 4.2.1.
eip is the address of the "current instruction" in user space. To obtain the actual instruction, you need to follow ptrace(...GETREGS, ...) with a ptrace(...PEEKDATA, ...) at that address. Also keep in mind that ptrace(...PEEKDATA, ...) always returns a whole machine word, while an actual opcode usually occupies only the low 16 or 32 bits of it.
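A minimal sketch of that step, assuming a 32-bit x86 Linux tracee that is currently stopped (for example after a PTRACE_SINGLESTEP). The helper names peek_word and word_to_bytes are hypothetical and not part of the ptrace API. errno is cleared before the call because the returned word itself may legitimately be -1.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Read one machine word from the stopped tracee at addr.
 * Returns 0 on success, -1 on failure (errno set by ptrace). */
int peek_word(pid_t child, unsigned long addr, long *word)
{
    errno = 0;  /* -1 is also a valid data word, so errno is the error signal */
    long w = ptrace(PTRACE_PEEKDATA, child, (void *)addr, NULL);
    if (w == -1 && errno != 0)
        return -1;
    *word = w;
    return 0;
}

/* Copy the n lowest bytes of word into out, lowest byte first.
 * On little-endian x86 these are the bytes at addr, addr+1, ... */
void word_to_bytes(long word, unsigned char *out, size_t n)
{
    unsigned long w = (unsigned long)word;
    for (size_t i = 0; i < n && i < sizeof w; i++)
        out[i] = (unsigned char)(w >> (8 * i));
}
```

In the tracer loop this could be called as peek_word(child, regs.eip, &w) right after the GETREGS call. For the instruction at 0x8048065 in the disassembly above, the two lowest bytes of the word would be 66 40 (inc ax).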