Experimenting with the ptrace()
system call, I am trying to trace another thread of the same process. According to the man page, both the tracer and the tracee are specific threads (not processes), so I don't see a reason why it should not work. So far, I have tried the following:
- use
PTRACE_TRACEME
from the clone()
d child: the call succeeds, but does not do what I want, probably because the parent of the to-be-traced thread is not the thread that called clone()
- use
PTRACE_ATTACH
or PTRACE_SEIZE
from the parent thread: this always fails with EPERM
, even if the process runs as root and with prctl(PR_SET_DUMPABLE, 1)
In all cases, waitpid(-1, &status, __WALL)
fails with ECHILD
(same when passing the child pid explicitly).
What should I do to make it work?
If it is not possible at all, is it by desing or a bug in the kernel (I am using version 3.8.0). In the former case, could you point me to the right bit of the documentation?
As @mic_e pointed out, this is a known fact about the kernel - not quite a bug, but not quite correct either. See the kernel mailing list thread about it. To provide an excerpt from Linus Torvalds:
That "new" (last November) check isn't likely going away. It solved
so many problems (both security and stability), and considering that
(a) in a year, only two people have ever even noticed
(b) there's a work-around as per above that isn't horribly invasive
I have to say that in order to actually go back to the old behaviour,
we'd have to have somebody who cares deeply, go back and check every
single special case, deadlock, and race.
The solution is to actually start the process that is being traced in a subprocess - you'll need to make the ptracing process be the parent of the other.
Here's an outline of doing this based on another answer that I wrote:
// this number is arbitrary - find a better one.
#define STACK_SIZE (1024 * 1024)
int main_thread(void *ptr) {
// do work for main thread
}
int main(int argc, char *argv[]) {
void *vstack = malloc(STACK_SIZE);
pid_t v;
if (clone(main_thread, vstack + STACK_SIZE, CLONE_PARENT_SETTID | CLONE_FILES | CLONE_FS | CLONE_IO, NULL, &v) == -1) { // you'll want to check these flags
perror("failed to spawn child task");
return 3;
}
long ptv = ptrace(PTRACE_SEIZE, v, NULL, NULL);
if (ptv == -1) {
perror("failed monitor sieze");
return 1;
}
// do actual ptrace work
}