Can a running C program access its own symbol table?

Posted 2019-04-20 01:16

Question:

I have a Linux C program that handles requests sent to a TCP socket (bound to a particular port). I want to be able to query the internal state of the C program via a request to that port, but I don't want to hard-code which global variables can be queried. Thus I want the query to contain the string name of a global, and the C code to look that string up in the symbol table to find its address and then send its value back over the TCP socket. Of course, the symbol table must not have been stripped. So can the C program even locate its own symbol table, and is there a library interface for looking up symbols given their name? This is an ELF executable C program built with gcc.

Answer 1:

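This is actually fairly easy: use dlopen / dlsym to look symbols up by name. For this to work, the symbols have to be present in the dynamic symbol table. There are multiple symbol tables! The dynamic one (.dynsym) is what the dynamic linker uses at run time, and it is separate from the full symbol table (.symtab) that strip removes.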

#include <dlfcn.h>
#include <stdio.h>

__attribute__((visibility("default")))
const char A[] = "Value of A";

__attribute__((visibility("hidden")))
const char B[] = "Value of B";

const char C[] = "Value of C";

int main(int argc, char *argv[])
{
    void *hdl;
    const char *ptr;
    int i;

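    /* NULL asks for a handle to the main program; a binding mode is required. */
    hdl = dlopen(NULL, RTLD_NOW);
    if (hdl == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    for (i = 1; i < argc; ++i) {
        ptr = dlsym(hdl, argv[i]);  /* NULL if the symbol is not exported */
        /* Passing NULL to %s is undefined behavior, so substitute a string. */
        printf("%s = %s\n", argv[i], ptr ? ptr : "(null)");
    }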
    return 0;
}

To add all global symbols to the dynamic symbol table, link with -Wl,--export-dynamic. If you want to keep most symbols out of the dynamic symbol table (recommended), compile with -fvisibility=hidden and then explicitly export the symbols you want with __attribute__((visibility("default"))) or one of the other methods.

~ $ gcc dlopentest.c -Wall -Wextra -ldl
~ $ ./a.out A B C
A = (null)
B = (null)
C = (null)
~ $ gcc dlopentest.c -Wall -Wextra -ldl -Wl,--export-dynamic
~ $ ./a.out A B C
A = Value of A
B = (null)
C = Value of C
~ $ gcc dlopentest.c -Wall -Wextra -ldl -Wl,--export-dynamic -fvisibility=hidden
~ $ ./a.out A B C
A = Value of A
B = (null)
C = (null)
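The test program above only prints strings, while the question asks for the values of arbitrary globals. A minimal sketch of that lookup, assuming the queried global is an int that has been exported as described above, might look like this. The helper names lookup_symbol and read_int_global are hypothetical and not part of any library.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Look up `name` in the running program and the libraries it has loaded.
 * Returns the symbol's address, or NULL if the lookup failed.
 * Hypothetical helper, not part of any library.
 */
void *lookup_symbol(const char *name)
{
    static void *self;          /* handle for the main program, opened once */
    void *addr;

    if (self == NULL) {
        self = dlopen(NULL, RTLD_NOW);
        if (self == NULL)
            return NULL;
    }
    dlerror();                  /* clear any stale error state */
    addr = dlsym(self, name);
    if (dlerror() != NULL)      /* non-NULL means the lookup itself failed */
        return NULL;
    return addr;
}

/*
 * Copy the value of an exported int global into *out.
 * Returns 0 on success, -1 if the name is unknown.
 * The caller must already know that the symbol really is an int:
 * the dynamic symbol table carries no C type information.
 */
int read_int_global(const char *name, int *out)
{
    const int *p = lookup_symbol(name);

    if (p == NULL)
        return -1;
    *out = *p;
    return 0;
}
```

As with the test program, the executable's own globals are only found when it is linked with -Wl,--export-dynamic. On glibc older than 2.34, -ldl is also needed.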

Safety

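Notice that there is a lot of room for bad behavior. dlsym resolves any exported symbol, including library functions, and the test program then treats whatever address comes back as a string: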

$ ./a.out printf
printf = ▯▯▯▯ (garbage)

If you want this to be safe, you should create a whitelist of permissible symbols.
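One way to sketch such a whitelist is a fixed table of names that is checked before the query string ever reaches dlsym. The table contents and the function name symbol_is_allowed below are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical whitelist: the only names a remote query may resolve. */
static const char *const allowed_symbols[] = { "A", "C" };

/* Returns 1 if `name` is on the whitelist, 0 otherwise. */
int symbol_is_allowed(const char *name)
{
    size_t i;
    size_t count = sizeof allowed_symbols / sizeof allowed_symbols[0];

    if (name == NULL)
        return 0;
    for (i = 0; i < count; ++i) {
        if (strcmp(allowed_symbols[i], name) == 0)
            return 1;
    }
    return 0;
}
```

The request handler would call dlsym only when symbol_is_allowed returns 1, so a query for printf is rejected before any lookup happens.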



Answer 2:

file: reflect.c

#include <stdio.h>
#include <string.h>   /* strcmp */
#include "reflect.h"

struct sym_table_t gbl_sym_table[1] __attribute__((weak)) = {{NULL, NULL}};

void * reflect_query_symbol(const char *name)
{
    struct sym_table_t *p = &gbl_sym_table[0];

    for(; p->name; p++) {
        if(strcmp(p->name, name) == 0) {
            return p->addr;
        }
    }
    return NULL;
}

file: reflect.h

#include <stdio.h>

struct sym_table_t {
    const char *name;   /* points at a string literal in the generated table */
    void *addr;
};

void * reflect_query_symbol(const char *name);

file: main.c

Just #include "reflect.h" and call reflect_query_symbol.

Example:

#include <stdio.h>
#include "reflect.h"

void foo(void)
{
    printf("bar test\n");
}

int uninited_data;

int inited_data = 3;

int main(int argc, char *argv[])
{
    int i;
    void *addr;

    for(i=1; i<argc; i++) {
        addr = reflect_query_symbol(argv[i]);
        if(addr) {
            printf("%s is at: %p\n", argv[i], addr);
        } else {
            printf("%s NOT found\n", argv[i]);
        }
    }

    return 0;
}

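file: Makefile

# Recipe lines must begin with a tab character.
# -no-pie (GCC 6 or later) keeps the addresses printed by nm equal to the
# run-time addresses; a position-independent executable is loaded at a
# different base address.
# The table is generated twice because linking it in shifts other symbols;
# the second pass reads addresses from a layout that already includes it.
objs = main.o reflect.o

main: $(objs)
        gcc -no-pie -o $@ $^
        nm $@ | awk 'BEGIN{ print "#include <stdio.h>"; print "#include \"reflect.h\""; print "struct sym_table_t gbl_sym_table[]={" } { if(NF==3){print "{\"" $$3 "\", (void*)0x" $$1 "},"}} END{print "{NULL,NULL} };"}' > .reflect.real.c
        gcc -c .reflect.real.c -o .reflect.real.o
        gcc -no-pie -o $@ $^ .reflect.real.o
        nm $@ | awk 'BEGIN{ print "#include <stdio.h>"; print "#include \"reflect.h\""; print "struct sym_table_t gbl_sym_table[]={" } { if(NF==3){print "{\"" $$3 "\", (void*)0x" $$1 "},"}} END{print "{NULL,NULL} };"}' > .reflect.real.c
        gcc -c .reflect.real.c -o .reflect.real.o
        gcc -no-pie -o $@ $^ .reflect.real.o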
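To see the shape of the table and the lookup without running the Makefile, here is a self-contained sketch in which the table is written by hand using the address-of operator instead of addresses taken from nm. The names sym_entry, demo_table and demo_query_symbol are hypothetical stand-ins for sym_table_t, gbl_sym_table and reflect_query_symbol.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as sym_table_t: a name and the address it refers to. */
struct sym_entry {
    const char *name;
    void *addr;
};

int inited_data = 3;
int uninited_data;

/* Hand-written stand-in for the generated table. */
static struct sym_entry demo_table[] = {
    { "inited_data",   &inited_data },
    { "uninited_data", &uninited_data },
    { NULL, NULL }      /* sentinel that ends the scan */
};

/* Linear search by name; returns NULL when the name is not in the table. */
void *demo_query_symbol(const char *name)
{
    const struct sym_entry *p;

    for (p = demo_table; p->name != NULL; p++) {
        if (strcmp(p->name, name) == 0)
            return p->addr;
    }
    return NULL;
}
```

The generated table works the same way, except that every symbol in the executable gets an entry and the addresses come from nm rather than from the compiler.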


Answer 3:

The general term for this sort of feature is "reflection", and it is not part of C.

If this is for debugging purposes, and you want to be able to inspect the entire state of a C program remotely, examine any variable, start and stop its execution, and so on, you might consider GDB remote debugging:

GDB offers a 'remote' mode often used when debugging embedded systems. Remote operation is when GDB runs on one machine and the program being debugged runs on another. GDB can communicate to the remote 'stub' which understands GDB protocol via Serial or TCP/IP. A stub program can be created by linking to the appropriate stub files provided with GDB, which implement the target side of the communication protocol. Alternatively, gdbserver can be used to remotely debug the program without needing to change it in any way.