Is the data in siginfo trustworthy?

Posted 2019-03-27 07:33

Question:

I've found that on Linux, by making my own call to the rt_sigqueueinfo syscall, I can put whatever I like in the si_uid and si_pid fields; the call succeeds and happily delivers the incorrect values. Naturally, the uid restrictions on sending signals provide some protection against this kind of spoofing, but I'm worried it may be dangerous to rely on this information. Is there any good documentation on the topic I could read? Why does Linux allow the obviously incorrect behavior of letting the caller specify the siginfo parameters rather than generating them in kernelspace? It seems nonsensical, especially since extra system calls (and thus a performance cost) may be required in order to get the uid/gid in userspace.

Edit: Based on my reading of POSIX (emphasis added by me):

If si_code is SI_USER or SI_QUEUE, [XSI] or any value less than or equal to 0, then the signal was generated by a process and si_pid and si_uid shall be set to the process ID and the real user ID of the sender, respectively.

I believe this behavior by Linux is non-conformant and a serious bug.

Answer 1:

The section of the POSIX page you quote also lists what each si_code value means. Here is the entry for SI_QUEUE:

SI_QUEUE
    The signal was sent by the sigqueue() function.

That section goes on to say:

If the signal was not generated by one of the functions or events listed above, si_code shall be set either to one of the signal-specific values described in XBD <signal.h>, or to an implementation-defined value that is not equal to any of the values defined above.

Nothing is violated if only the sigqueue() function uses SI_QUEUE. Your scenario involves code other than the sigqueue() function using SI_QUEUE. The question is whether POSIX envisions an operating system enforcing that only a specified library function (as opposed to some function which is not a POSIX-defined library function) be permitted to make a system call with certain characteristics. I believe the answer is "no".
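For contrast with the raw syscall, here is a minimal sketch of the path POSIX actually describes: the library sigqueue() function. The helper name queue_to_self() and the choice of SIGUSR1 are illustrative assumptions, not something from the question. The process queues a signal to itself and reads the resulting siginfo_t back synchronously with sigwaitinfo(), so the fields can be inspected without installing a handler. It assumes a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: queue SIGUSR1 to ourselves with the library
   sigqueue() and collect the siginfo_t that a receiver would see.
   Returns 0 on success, -1 on error. Assumes a single-threaded process. */
int queue_to_self(int payload, siginfo_t *out)
{
    sigset_t set, old;
    union sigval value;
    int rc = -1;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* Block the signal so it stays pending until sigwaitinfo() takes it. */
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    memset(out, 0, sizeof *out);
    value.sival_int = payload;

    if (sigqueue(getpid(), SIGUSR1, value) == 0 &&
        sigwaitinfo(&set, out) == SIGUSR1)
        rc = 0;

    /* Restore the caller's signal mask before returning. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return rc;
}
```

After a successful call, si_code should be SI_QUEUE and si_pid and si_uid should match getpid() and getuid(), which is the combination the quoted POSIX text requires for this function.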

EDIT as of 2011-03-26, 14:00 PST:

This edit is in response to R..'s comment from eight hours ago, since the page wouldn't let me leave an adequately voluminous comment:

I think you're basically right. But either a system is POSIX compliant or it is not. If a non-library function does a syscall which results in a non-compliant combination of uid, pid, and 'si_code', then the second statement I quoted makes it clear that the call itself is not compliant. One can interpret this in two ways. One way is: "If a user breaks this rule, then he makes the system non-compliant." But you're right, I think that's silly. What good is a system when any nonprivileged user can make it noncompliant? The fix, as I see it, is for the system somehow to know that it's not the library 'sigqueue()' making the system call; the kernel itself should then set 'si_code' to something other than 'SI_QUEUE', and leave the uid and pid as you set them. In my opinion, you should raise this with the kernel folks. They may have difficulty, however; I don't know of any secure way for them to detect whether a syscall is made by a particular library function, seeing as how the library functions, almost by definition, are merely convenience wrappers around the syscalls. And that may be the position they take, which I know will be a disappointment.

(voluminous) EDIT as of 2011-03-26, 18:00 PST:

Again because of limitations on comment length.

This is in response to R..'s comment of about an hour ago.

I'm a little new to the syscall subject, so please bear with me.

By "the kernel sigqueue syscall", do you mean the '__NR_rt_sigqueueinfo' call? That's the only one that I found when I did this:

grep -Ri 'NR.*queue' /usr/include

If that's the case, I think I'm not understanding your original point. The kernel will let (non-root) me use SI_QUEUE with a faked pid and uid without error. If I have the sending side coded thus:

#include <sys/syscall.h>
#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int    argc,
         char **argv
        )
{
  long john_silver;

  /* zero-initialize so no stack garbage is passed to the kernel */
  siginfo_t my_siginfo={0};

  if(argc!=2)
  {
    fprintf(stderr,"missing pid argument\n");

    exit(1);
  }

  john_silver=strtol(argv[1],NULL,0);

  if(kill(john_silver,SIGUSR1))
  {
    fprintf(stderr,"kill() fail\n");

    exit(1);
  }

  sleep(1);

  my_siginfo.si_signo=SIGUSR1;
  my_siginfo.si_code=SI_QUEUE;
  my_siginfo.si_pid=getpid();
  my_siginfo.si_uid=getuid();
  my_siginfo.si_value.sival_int=41;

  if(syscall(__NR_rt_sigqueueinfo,john_silver,SIGUSR1,&my_siginfo))
  {
    perror("syscall()");

    exit(1);
  }

  sleep(1);

  my_siginfo.si_signo=SIGUSR2;
  my_siginfo.si_code=SI_QUEUE;
  my_siginfo.si_pid=getpid()+1;
  my_siginfo.si_uid=getuid()+1;
  my_siginfo.si_value.sival_int=42;

  if(syscall(__NR_rt_sigqueueinfo,john_silver,SIGUSR2,&my_siginfo))
  {
    perror("syscall()");

    exit(1);
  }

  return 0;

} /* main() */

and the receiving side coded thus:

#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* written by the handler, read by main(): must be volatile sig_atomic_t */
volatile sig_atomic_t signaled_flag=0;

siginfo_t received_information;

void
my_handler(int        signal_number,
           siginfo_t *signal_information,
           void      *we_ignore_this
          )
{
  memmove(&received_information,
          signal_information,
          sizeof(received_information)
         );

  signaled_flag=1;

} /* my_handler() */

/*--------------------------------------------------------------------------*/

int
main(void)
{
  pid_t            myself;

  struct sigaction the_action;

  myself=getpid();

  printf("signal receiver is process %d\n",myself);

  the_action.sa_sigaction=my_handler;
  sigemptyset(&the_action.sa_mask);
  the_action.sa_flags=SA_SIGINFO;

  if(sigaction(SIGUSR1,&the_action,NULL))
  {
    fprintf(stderr,"sigaction(SIGUSR1) fail\n");

    exit(1);
  }

  if(sigaction(SIGUSR2,&the_action,NULL))
  {
    fprintf(stderr,"sigaction(SIGUSR2) fail\n");

    exit(1);
  }

  for(;;)
  {
    while(!signaled_flag)
    {
      sleep(1);
    }

    printf("si_signo: %d\n",received_information.si_signo);
    printf("si_pid  : %d\n",received_information.si_pid  );
    printf("si_uid  : %d\n",received_information.si_uid  );

    if(received_information.si_signo==SIGUSR2)
    {
      break;
    }

    signaled_flag=0;
  }

  return 0;

} /* main() */

I can then run (non-root) the receiving side thus:

wally:~/tmp/20110326$ receive
signal receiver is process 9023
si_signo: 10
si_pid  : 9055
si_uid  : 4000
si_signo: 10
si_pid  : 9055
si_uid  : 4000
si_signo: 12
si_pid  : 9056
si_uid  : 4001
wally:~/tmp/20110326$ 

And see this (non-root) on the send end:

wally:~/tmp/20110326$ send 9023
wally:~/tmp/20110326$ 

As you can see, the third event has spoofed pid and uid. Isn't that what you originally objected to? There's no EINVAL or EPERM in sight. I guess I'm confused.



Answer 2:

I agree that si_uid and si_pid should be trustworthy, and if they are not it is a bug. However, this is only required if the signal is SIGCHLD generated by a state change of a child process, or if si_code is SI_USER or SI_QUEUE, or if the system supports the XSI option and si_code <= 0. Linux/glibc also pass si_uid and si_pid values in other cases; these are often not trustworthy but that is not a POSIX conformance issue.
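The conditions above can be sketched as a small predicate. The function name posix_requires_sender_ids() and the xsi flag are hypothetical, introduced only for illustration. Note what it does and does not tell you: it reports whether POSIX requires si_pid and si_uid to be set for a given signal number and si_code, not whether the values can be trusted on a particular kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>

/* Hypothetical helper: does POSIX require si_pid/si_uid to be set for
   this signal number and si_code? Pass xsi=true if the system supports
   the XSI option. */
bool posix_requires_sender_ids(int signo, int code, bool xsi)
{
    /* SIGCHLD generated by a state change of a child process. */
    if (signo == SIGCHLD) {
        switch (code) {
        case CLD_EXITED:
        case CLD_KILLED:
        case CLD_DUMPED:
        case CLD_TRAPPED:
        case CLD_STOPPED:
        case CLD_CONTINUED:
            return true;
        }
    }

    /* Signals sent by kill() or sigqueue(). */
    if (code == SI_USER || code == SI_QUEUE)
        return true;

    /* XSI: any si_code less than or equal to zero. */
    if (xsi && code <= 0)
        return true;

    return false;
}
```

A handler could call this with info->si_signo and info->si_code before reading si_pid or si_uid. A fault-generated code such as SEGV_MAPERR yields false, since for those signals the union carries an address rather than sender IDs.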

Of course, for kill() the signal may not be queued in which case the siginfo_t does not provide any additional information.

The reason that rt_sigqueueinfo allows more than just SI_QUEUE is probably to allow implementing POSIX asynchronous I/O, message queues and per-process timers with minimal kernel support. Implementing these in userland requires the ability to send a signal with SI_ASYNCIO, SI_MESGQ and SI_TIMER respectively. I do not know how glibc allocates the resources to queue the signal beforehand; to me it looks like it does not and simply hopes rt_sigqueueinfo does not fail. POSIX clearly forbids discarding a timer expiration (async I/O completion, message arrival on a message queue) notification because too many signals are queued at the time of the expiration; the implementation should have rejected the creation or registration if there were insufficient resources. The objects have been defined carefully such that each I/O request, message queue or timer can have at most one signal in flight at a time.