Question:
Does anyone have any good explanations, tutorials, books, or guides on the use of PTRACE_SYSEMU?
Answer 1:
What I found interesting:
- Example Implementation for ptrace
- Playing with ptrace, Part I - LinuxJournal.com
- Playing with ptrace, Part II - LinuxJournal.com
And a programming library that makes using ptrace easier:
- PinkTrace - ptrace() wrapper library.
PinkTrace comes with examples, and the sydbox sources are an example of a complex PinkTrace use case. In general, I've found the author to be a good person to contact about using and testing PinkTrace.
Answer 2:
There is a small test in the Linux kernel sources that uses PTRACE_SYSEMU:
http://code.metager.de/source/xref/linux/stable/tools/testing/selftests/x86/ptrace_syscall.c or http://lxr.free-electrons.com/source/tools/testing/selftests/x86/ptrace_syscall.c
    struct user_regs_struct regs;

    printf("[RUN]\tSYSEMU\n");
    if (ptrace(PTRACE_SYSEMU, chld, 0, 0) != 0)
        err(1, "PTRACE_SYSCALL");
    wait_trap(chld);

    if (ptrace(PTRACE_GETREGS, chld, 0, &regs) != 0)
        err(1, "PTRACE_GETREGS");

    if (regs.user_syscall_nr != SYS_gettid ||
        regs.user_arg0 != 10 || regs.user_arg1 != 11 ||
        regs.user_arg2 != 12 || regs.user_arg3 != 13 ||
        regs.user_arg4 != 14 || regs.user_arg5 != 15) {
        printf("[FAIL]\tInitial args are wrong (nr=%lu, args=%lu %lu %lu %lu %lu %lu)\n",
               (unsigned long)regs.user_syscall_nr,
               (unsigned long)regs.user_arg0, (unsigned long)regs.user_arg1,
               (unsigned long)regs.user_arg2, (unsigned long)regs.user_arg3,
               (unsigned long)regs.user_arg4, (unsigned long)regs.user_arg5);
        nerrs++;
    } else {
        printf("[OK]\tInitial nr and args are correct\n");
    }

    printf("[RUN]\tRestart the syscall (ip = 0x%lx)\n",
           (unsigned long)regs.user_ip);

    /*
     * This does exactly what it appears to do if syscall is int80 or
     * SYSCALL64. For SYSCALL32 or SYSENTER, though, this is highly
     * magical. It needs to work so that ptrace and syscall restart
     * work as expected.
     */
    regs.user_ax = regs.user_syscall_nr;
    regs.user_ip -= 2;
    if (ptrace(PTRACE_SETREGS, chld, 0, &regs) != 0)
        err(1, "PTRACE_SETREGS");

    if (ptrace(PTRACE_SYSEMU, chld, 0, 0) != 0)
        err(1, "PTRACE_SYSCALL");
    wait_trap(chld);

    if (ptrace(PTRACE_GETREGS, chld, 0, &regs) != 0)
        err(1, "PTRACE_GETREGS");
So it looks like just another ptrace call: it lets the traced program run until it makes its next system call, then stops the child and signals the ptracer. The tracer can then read the registers, optionally change some of them, and restart the syscall.
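The stop-and-inspect cycle can be sketched in a few lines. This is only an illustration, assuming x86-64 Linux with glibc headers (so the syscall number is read from orig_rax); the helper name first_sysemu_stop is made up here and does not come from the kernel selftest, and the error handling is deliberately thin.

```c
/* Sketch only: assumes x86-64 Linux with glibc headers. */
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the kernel selftest): fork a tracee, resume
 * it with PTRACE_SYSEMU and return the number of the first syscall it
 * attempts, or -1 if any ptrace/wait step fails.
 */
long first_sysemu_stop(void)
{
    int status;
    struct user_regs_struct regs;
    long nr = -1;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Tracee: ask to be traced, then stop so the parent can take over. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(1);
        /* kill() rather than raise(), so libc makes no extra syscalls
         * between the stop and the gettid below. */
        kill(getpid(), SIGSTOP);
        syscall(SYS_gettid);   /* intercepted at entry, not executed */
        _exit(0);
    }

    /* First stop: the tracee's own SIGSTOP. */
    if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
        goto out;
    assert(WSTOPSIG(status) == SIGSTOP);

    /* Run until the next syscall entry; the kernel will skip the syscall. */
    if (ptrace(PTRACE_SYSEMU, child, NULL, NULL) != 0)
        goto out;
    if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
        goto out;

    /* On x86-64 the attempted syscall number is in orig_rax. */
    if (ptrace(PTRACE_GETREGS, child, NULL, &regs) == 0)
        nr = (long)regs.orig_rax;

out:
    /* The tracee is no longer needed; kill and reap it. */
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return nr;
}
```

Called from a main(), the helper should return SYS_gettid on a host that supports PTRACE_SYSEMU, and -1 where ptrace is unavailable (for example inside a restrictive sandbox).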
It is implemented in http://lxr.free-electrons.com/source/kernel/ptrace.c?v=4.10#L1039 like the other stepping ptrace calls:
    #ifdef PTRACE_SINGLESTEP
        case PTRACE_SINGLESTEP:
    #endif
    #ifdef PTRACE_SINGLEBLOCK
        case PTRACE_SINGLEBLOCK:
    #endif
    #ifdef PTRACE_SYSEMU
        case PTRACE_SYSEMU:
        case PTRACE_SYSEMU_SINGLESTEP:
    #endif
        case PTRACE_SYSCALL:
        case PTRACE_CONT:
            return ptrace_resume(child, request, data);
The man page also has some info: http://man7.org/linux/man-pages/man2/ptrace.2.html
PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP (since Linux 2.6.14)
    For PTRACE_SYSEMU, continue and stop on entry to the next system call, which will not be executed. See the documentation on syscall-stops below. For PTRACE_SYSEMU_SINGLESTEP, do the same but also singlestep if not a system call. This call is used by programs like User Mode Linux that want to emulate all the tracee's system calls. The data argument is treated as for PTRACE_CONT. The addr argument is ignored. These requests are currently supported only on x86.
So it is not portable: it is used only by User Mode Linux (um) on the x86 platform, as a variant of the classic PTRACE_SYSCALL. The um test for sysemu, with some comments, is here: http://lxr.free-electrons.com/source/arch/um/os-Linux/start_up.c?v=4.10#L155
    __uml_setup("nosysemu", nosysemu_cmd_param,
    "nosysemu\n"
    " Turns off syscall emulation patch for ptrace (SYSEMU) on.\n"
    " SYSEMU is a performance-patch introduced by Laurent Vivier. It changes\n"
    " behaviour of ptrace() and helps reducing host context switch rate.\n"
    " To make it working, you need a kernel patch for your host, too.\n"
    " See http://perso.wanadoo.fr/laurent.vivier/UML/ for further \n"
    " information.\n\n");

    static void __init check_sysemu(void)
The link in that help text used to redirect to the site http://sysemu.sourceforge.net/, dating from 2004, which explains:
Why?
UML uses ptrace() and PTRACE_SYSCALL to catch system calls. But, by this way, you can't remove the real system call, only monitor it. UML, to avoid the real syscall and emulate it, replaces the real syscall by a call to getpid(). This method generates two context-switches instead of one.

A Solution
A solution is to change the behaviour of ptrace() to not call the real syscall and thus we don't have to replace it by a call to getpid().

How?
By adding a new command to ptrace(), PTRACE_SYSEMU, that acts like PTRACE_SYSCALL without executing syscall. To add this command we need to patch the host kernel. To use this new command in UML kernel, we need to patch the UML kernel too.
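To make the "without executing syscall" part concrete, here is a rough sketch of a tracer that emulates one syscall itself. It assumes x86-64 Linux with glibc, where the value left in rax at the PTRACE_SYSEMU stop becomes the result the tracee sees. The names emulate_gettid and FAKE_TID are invented for this illustration; this is not how UML itself is structured.

```c
/* Sketch only: assumes x86-64 Linux with glibc headers. */
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

#define FAKE_TID 4242   /* hypothetical value the tracer hands back */

/*
 * Hypothetical helper: returns 1 if the tracee saw FAKE_TID as the result
 * of gettid (the tracer's emulation took effect), 0 if it saw anything
 * else, and -1 if a ptrace/wait step failed.
 */
int emulate_gettid(void)
{
    int status;
    struct user_regs_struct regs;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(2);
        kill(getpid(), SIGSTOP);           /* hand control to the tracer */
        long tid = syscall(SYS_gettid);    /* answered by the tracer */
        _exit(tid == FAKE_TID ? 0 : 1);    /* report what we observed */
    }

    if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
        goto fail;

    /* Stop at the next syscall entry; the kernel skips the real syscall. */
    if (ptrace(PTRACE_SYSEMU, child, NULL, NULL) != 0)
        goto fail;
    if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
        goto fail;

    if (ptrace(PTRACE_GETREGS, child, NULL, &regs) != 0)
        goto fail;
    assert((long)regs.orig_rax == SYS_gettid);

    /* Emulate: write the "return value" straight into rax. */
    regs.rax = FAKE_TID;
    if (ptrace(PTRACE_SETREGS, child, NULL, &regs) != 0)
        goto fail;

    /* Let the tracee run freely; its exit syscall executes normally. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) != 0)
        goto fail;
    if (waitpid(child, &status, 0) == child && WIFEXITED(status))
        return WEXITSTATUS(status) == 0;

fail:
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return -1;
}
```

With plain PTRACE_SYSCALL the tracer would get a second stop at syscall exit and the real gettid would still run; here there is a single stop per syscall, which is the saving the sysemu page describes.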