Why does the linker modify a --defsym “absolute ad

Posted 2019-06-19 19:02

Question:

Goal: a shared library to use a function from an executable (which does not export symbols).

Means: gcc -Wl,--defsym,function=0x432238

The man page states that:

"--defsym symbol=expression" Create a global symbol in the output
file, containing the absolute address given by expression.

To my dismay, dlopen() adds 0x7ffff676f000, the shared library's base address (this is 64-bit code), to the exported "absolute symbol address":

        executable        shared library
        ---------- linker --------------
symbol: 0x432238   =====> 0x7ffff6ba1238

objdump shows the correct symbol address (0x432238) in the library, but once loaded with dlopen(), the symbol has address 0x7ffff6ba1238.

If, once the library is loaded, I manually patch the symbol to the correct address, everything works fine (otherwise the library segfaults).

  • Why is the "absolute address" modified?
  • How to avoid it?

Update:

I contest the technical relevance of the reply below and, even more so, that of its 'update':

Having --defsym to define a relocated symbol in a PIC library/executable is pointless (it does not serve ANY purpose other than polluting the binary without any usable feature).

Therefore, the only relevant use of --defsym in a PIC shared library or PIC executable should be to define a (non-relocated) "absolute address".

Incidentally, that's the official purpose of --defsym if you bother to read the man page:

"Create a global symbol in the output file, containing the absolute address given by expression."

At best, this is a Linux linker defect which would be trivial to fix. And for those who can't wait for the people-in-denial to realize (and fix) their mistake, the solution is to patch the relocation table after the binary image has been loaded by the defective linker.

Then --defsym becomes useful in PIC libraries/executables, which seems to me to be welcome progress.

Answer 1:

You appear to have fundamentally misunderstood what --defsym does.

--defsym=symbol=expression
   Create a global symbol in the *output* file, ...

That is, you are creating the new symbol in the library that you are building. As such, the symbol is (naturally) relocated with the library.

I am guessing you want something like this instead:

// code in library
int fn()
{
    // exe_fn not exported from the executable, but we know where it is.
    int (*exe_fn)(void) = (int (*)(void)) 0x432238;
    return (*exe_fn)();
}

If you don't want to hard-code 0x432238 in the library and would rather pass the value on the command line at build time, compile with -DEXE_FN=0x432238 and use EXE_FN in place of the literal.
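As a rough sketch of the -DEXE_FN variant: the code below assumes the executable is not position-independent, so the function really does live at the fixed address 0x432238. The names exe_fn_t and exe_fn_pointer are made up for this illustration, and the #ifndef fallback exists only so the snippet compiles without the -D flag.

```c
#include <assert.h>
#include <stdint.h>

/* EXE_FN is expected from the build command line, e.g.
 *   gcc -shared -fPIC -DEXE_FN=0x432238 lib.c -o lib.so
 * The fallback below only keeps this sketch compilable on its own. */
#ifndef EXE_FN
#define EXE_FN 0x432238
#endif

/* Hypothetical signature of the function inside the executable. */
typedef int (*exe_fn_t)(void);

/* Turn the build-time integer constant into a callable pointer.
 * No symbol is involved, so nothing is relocated by the loader. */
static exe_fn_t exe_fn_pointer(void)
{
    return (exe_fn_t)(uintptr_t)EXE_FN;
}

/* Library function that forwards to the executable's function. */
int fn(void)
{
    exe_fn_t target = exe_fn_pointer();
    assert(target != 0);
    return target();
}
```

Because the address is a plain integer constant in the code rather than a symbol, the dynamic loader has nothing to adjust. Calling fn() is only meaningful inside a process where the executable's function actually sits at that address.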

Update:

Goal: a shared library to use a function from an executable

That goal cannot be achieved by the method you selected. You'll have to use other means.

Why is the "absolute address" modified?

It isn't. When you ask the linker to define the symbol "function" at absolute address 0x432238, it does exactly that. You can see it in the output of objdump, nm and readelf -s.

But because the symbol is defined in the shared library, all references to that symbol are relocated, i.e. adjusted by the shared library load address (that is done by the dynamic loader). It makes no sense whatsoever for the dynamic loader to do otherwise.
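The adjustment described here matches the numbers in the question. The helper below is a deliberately simplified model of that arithmetic, not the dynamic loader's actual code, and the name relocated_address is invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model: the address seen at run time when the symbol's
 * value is treated as an offset from the library's load address. */
static uint64_t relocated_address(uint64_t load_base, uint64_t symbol_value)
{
    /* Guard against wrap-around in the 64-bit sum. */
    assert(symbol_value <= UINT64_MAX - load_base);
    return load_base + symbol_value;
}
```

With the question's values, relocated_address(0x7ffff676f000, 0x432238) gives 0x7ffff6ba1238, the address observed after dlopen().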

How to avoid it?

You can't. Use other means to achieve your goal.
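One possible "other means", offered only as a sketch and not something the answer itself prescribes, is to have the executable hand the function pointer to the library at run time. All names below (lib_set_exe_fn, lib_call_exe_fn, fake_exe_function) are hypothetical; fake_exe_function merely stands in for the real function inside the executable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature of the function inside the executable. */
typedef int (*exe_fn_t)(void);

/* Pointer registered by the executable; NULL until it is set. */
static exe_fn_t registered_exe_fn = NULL;

/* Exported by the library: the executable calls this once after
 * loading the library, passing the address of its own function. */
void lib_set_exe_fn(exe_fn_t f)
{
    assert(f != NULL);
    registered_exe_fn = f;
}

/* Library code that needs the executable's function.
 * Returns -1 if nothing has been registered yet. */
int lib_call_exe_fn(void)
{
    if (registered_exe_fn == NULL)
        return -1;
    return registered_exe_fn();
}

/* Stand-in for the executable's function, for illustration only. */
static int fake_exe_function(void)
{
    return 42;
}
```

In a real program the executable would look up lib_set_exe_fn with dlsym() after dlopen() and pass a pointer to its own function. The pointer is taken inside the executable, so no symbol export or address patching is needed.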