Writing a C function from given x86 assembly

I'm trying to reverse engineer this mystery function. It returns an integer and takes a pointer to a struct as its argument:

#include "mystery.h"
int mystery(struct my_struct *s){}

The header file is a simple struct declaration

struct my_struct {
    int a;
    int b; 
};

The assembly I'm trying to reverse engineer is

400596:    8b 07                    mov    (%rdi),%eax
400598:    8d 04 40                 lea    (%rax,%rax,2),%eax
40059b:    89 07                    mov    %eax,(%rdi)
40059d:    83 47 04 07              addl   $0x7,0x4(%rdi)
4005a1:    c3                       retq

So far I think the function is like:

int mystery(struct my_struct *s){
    int i = s->a;
    i = 3*i;
    int j = s->b;
    j += 7;
    return i;
}

But this isn't correct. I don't understand what mov %eax,(%rdi) does exactly, or what the function returns in the end, because it's supposed to return an integer.

Tags: c gcc assembly x86-64 reverse-engineering

1 Answer

我只想做你的唯一

#2 · 2019-06-24 10:29

Given that RDI holds the pointer to the beginning of the structure (the function's first parameter), the following line loads the value of s->a into the temporary register EAX.

mov    (%rdi),%eax

Reasonably that might be int x = s->a. This line:

lea    (%rax,%rax,2),%eax

is the same as multiplying the temporary value by 3, since RAX + RAX*2 = 3*RAX (thus s->a * 3). So the first two lines of assembly could be represented as:

int x = s->a * 3;
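
To make the arithmetic concrete, here is a minimal sketch of what the lea addressing mode computes. The names lea_times_three and check_lea are hypothetical and not part of the original exercise, and the sketch assumes a * 3 fits in an int.

```c
#include <assert.h>

/* Hypothetical helper: mirrors lea (%rax,%rax,2),%eax,
   which computes base + index * scale with base == index
   and scale == 2. */
int lea_times_three(int a)
{
    int base = a;
    int index = a;
    return base + index * 2;   /* same value as a * 3 */
}

/* Hypothetical self-check for a few small inputs. */
void check_lea(void)
{
    assert(lea_times_three(5) == 5 * 3);
    assert(lea_times_three(-4) == -4 * 3);
}
```

Compilers often prefer lea over imul for small constant multipliers like this, which is why the multiplication does not appear as a multiply instruction.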

The line mov %eax,(%rdi) would be taking the temporary value x and storing it back to s->a so that could be represented as:

s->a = x;

The line addl $0x7,0x4(%rdi) is adding 7 to the value at 4(RDI). 4(RDI) is the address of s->b. This line could be represented as s->b += 7;.
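
One way to confirm that offset 4 really is s->b is to ask the compiler for the member offsets. This is a small sketch under the assumption of a typical x86-64 target where int is 4 bytes; the helper names offset_of_a and offset_of_b are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct my_struct {
    int a;
    int b;
};

/* Hypothetical helpers: byte offsets of the members from the
   start of the struct, i.e. from the address held in RDI. */
size_t offset_of_a(void) { return offsetof(struct my_struct, a); }
size_t offset_of_b(void) { return offsetof(struct my_struct, b); }

/* Hypothetical self-check: a sits at the start of the struct and
   b follows directly after it (offset 4 when int is 4 bytes). */
void check_offsets(void)
{
    assert(offset_of_a() == 0);
    assert(offset_of_b() == sizeof(int));
}
```

With a at offset 0 and b at offset 4, (%rdi) addresses s->a and 0x4(%rdi) addresses s->b.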

So what is being returned as a value? Since nothing else is done with EAX after the code analyzed above, EAX still holds the value it had earlier when we did x = s->a * 3;. This means that the function is returning the temporary value x.

The code then would look like this:

int mystery(struct my_struct *s)
{
    int x = s->a * 3;
    s->a = x;
    s->b += 7;
    return x;    
}
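
As a quick sanity check, the reconstructed function can be exercised with known inputs. The sketch below repeats the struct and function so it compiles on its own; the check_mystery helper and the sample values 2 and 10 are assumptions chosen for illustration.

```c
#include <assert.h>

struct my_struct {
    int a;
    int b;
};

int mystery(struct my_struct *s)
{
    int x = s->a * 3;
    s->a = x;
    s->b += 7;
    return x;
}

/* Hypothetical check: calls mystery on sample values and verifies
   both the return value and the side effects on the struct. */
int check_mystery(void)
{
    struct my_struct s = { 2, 10 };
    int ret = mystery(&s);

    assert(ret == 6);    /* 2 * 3 is returned in EAX */
    assert(s.a == 6);    /* stored back by mov %eax,(%rdi) */
    assert(s.b == 17);   /* 10 + 7 from addl $0x7,0x4(%rdi) */
    return ret;
}
```

Calling check_mystery() from a main function should pass all three assertions, which matches the analysis: the function modifies both members through the pointer and returns the new value of s->a.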

If you compile this code with GCC 4.9.x on godbolt at the -O1 optimization level, you get this generated assembly:

mystery:
        movl    (%rdi), %eax
        leal    (%rax,%rax,2), %eax
        movl    %eax, (%rdi)
        addl    $7, 4(%rdi)
        ret

Different compilers with different optimization levels will produce different assembly that all does the same thing. GCC 4.9.x just so happens to produce the exact assembly code we originally reverse engineered.

Note: I guessed the compiler version and optimization level because of a recent SO question with a different mystery function, where I had found that GCC 4.9.x at optimization level -O1 generated the exact code I was looking for. It seems whoever generated the assembly files for these mystery exercises was using those settings and a similar compiler.
