I have a thread that sits in a blocking recv() loop, and I want to terminate it (assume this can't be changed to select() or any other asynchronous approach).
I also have a signal handler that catches SIGINT, and theoretically it should make recv() return with an error and errno set to EINTR.
But it doesn't, which I assume has something to do with the fact that the application is multi-threaded. There is also another thread, which is meanwhile waiting on a pthread_join() call.
What's happening here?
EDIT:
OK, now I explicitly deliver the signal to all threads blocked in recv() via pthread_kill() from the main thread (which runs the same global SIGINT signal handler in each of them, though multiple invocations are benign). But the recv() call is still not unblocked.
EDIT:
I've written a code sample that reproduces the problem.
- Main thread connects a socket to a misbehaving remote host that won't let the connection go.
- All signals are blocked.
- The read thread is started.
- Main unblocks SIGINT and installs a handler for it.
- Read thread unblocks SIGUSR1 and installs a handler for it.
- Main thread's signal handler sends a SIGUSR1 to the read thread.
Interestingly, if I replace recv() with sleep() it is interrupted just fine.
PS: Alternatively, you can just open a UDP socket instead of using a server.
client
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>  /* memset, strerror */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <errno.h>
static void
err(const char *msg)
{
    perror(msg);
    abort();
}

static void
blockall()
{
    sigset_t ss;
    sigfillset(&ss);
    if (pthread_sigmask(SIG_BLOCK, &ss, NULL))
        err("pthread_sigmask");
}

static void
unblock(int signum)
{
    sigset_t ss;
    sigemptyset(&ss);
    sigaddset(&ss, signum);
    if (pthread_sigmask(SIG_UNBLOCK, &ss, NULL))
        err("pthread_sigmask");
}

void
sigusr1(int signum)
{
    (void)signum;
    printf("%lu: SIGUSR1\n", pthread_self());
}

void*
read_thread(void *arg)
{
    int sock, r;
    char buf[100];
    unblock(SIGUSR1);
    signal(SIGUSR1, &sigusr1);
    sock = *(int*)arg;
    printf("Thread (self=%lu, sock=%d)\n", pthread_self(), sock);
    r = 1;
    while (r > 0)
    {
        r = recv(sock, buf, sizeof buf, 0);
        printf("recv=%d\n", r);
    }
    if (r < 0)
        perror("recv");
    return NULL;
}

int sock;
pthread_t t;

void
sigint(int signum)
{
    int r;
    (void)signum;
    printf("%lu: SIGINT\n", pthread_self());
    printf("Killing %lu\n", t);
    r = pthread_kill(t, SIGUSR1);
    if (r)
    {
        printf("%s\n", strerror(r));
        abort();
    }
}

int
main()
{
    pthread_attr_t attr;
    struct sockaddr_in addr;
    printf("main thread: %lu\n", pthread_self());
    memset(&addr, 0, sizeof addr);
    sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sock < 0)
        err("socket");
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8888);
    if (inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr) <= 0)
        err("inet_pton");
    if (connect(sock, (struct sockaddr *)&addr, sizeof addr))
        err("connect");
    blockall();
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    if (pthread_create(&t, &attr, &read_thread, &sock))
        err("pthread_create");
    pthread_attr_destroy(&attr);
    unblock(SIGINT);
    signal(SIGINT, &sigint);
    if (sleep(1000))
        perror("sleep");
    if (pthread_join(t, NULL))
        err("pthread_join");
    if (close(sock))
        err("close");
    return 0;
}
server
import socket
import time
s = socket.socket(socket.AF_INET)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('127.0.0.1',8888))
s.listen(1)
c = []
while True:
    (conn, addr) = s.accept()
    c.append(conn)
As alluded to in the post by R.., it is indeed possible to change the signal behavior. I often create my own "signal" function that makes use of sigaction. Here's what I use.
The attribute in question above is the OR-ing of the sa_flags field. This is from the man page for sigaction: SA_RESTART provides the BSD-like behavior of allowing system calls to be restartable across signals. SA_NODEFER means allow the signal to be received from within its own signal handler.
When the signal() calls are replaced with _signal(), the thread is interrupted: when SIGUSR1 was sent, recv() returned -1 and the output showed "Interrupted system call". When SIGINT was sent, the program stopped altogether with the same output, but abort() was called at the end.
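A wrapper along these lines might look like the sketch below. It is an illustration rather than the exact function referred to above: the names install_interrupting_handler and handler_restarts_syscalls are made up for this example, and it assumes the only goal is to leave SA_RESTART unset so that a blocked recv() fails with EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical helper (the name is illustrative): installs `fn` for
 * `signum` through sigaction() and deliberately leaves SA_RESTART unset,
 * so an interrupted blocking call returns -1 with errno == EINTR instead
 * of being restarted. Returns 0 on success, -1 on failure. */
int
install_interrupting_handler(int signum, void (*fn)(int))
{
    struct sigaction act;

    assert(fn != NULL);
    memset(&act, 0, sizeof act);
    act.sa_handler = fn;
    sigemptyset(&act.sa_mask);  /* block no extra signals while fn runs */
    act.sa_flags = 0;           /* no SA_RESTART */
    return sigaction(signum, &act, NULL);
}

/* Returns 1 if SA_RESTART is set in the current disposition of `signum`,
 * 0 if it is not, and -1 if the disposition cannot be queried. */
int
handler_restarts_syscalls(int signum)
{
    struct sigaction cur;

    if (sigaction(signum, NULL, &cur) != 0)
        return -1;
    return (cur.sa_flags & SA_RESTART) ? 1 : 0;
}

/* Handler that does nothing; delivery alone is what interrupts recv(). */
void
noop_handler(int signum)
{
    (void)signum;
}
```

In the client from the question, signal(SIGUSR1, &sigusr1) would then become install_interrupting_handler(SIGUSR1, sigusr1), with the return value checked.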
I did not write the server portion of the code; I just changed the socket type to "DGRAM, UDP" to allow the client to start.
You can set a timeout on recv() on Linux; see "Linux: is there a read or recv from socket with timeout?"
When you get a signal, call done on the class doing the receive.
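One common way to get such a timeout is the SO_RCVTIMEO socket option. The sketch below assumes that approach; set_recv_timeout and demo_recv_timeout are names invented for this example, and the demo uses a local socket pair instead of the TCP connection from the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: makes recv() on `sock` give up after `timeout_ms`
 * milliseconds. Returns 0 on success, -1 on failure. */
int
set_recv_timeout(int sock, long timeout_ms)
{
    struct timeval tv;

    assert(timeout_ms > 0);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Demo: recv() on a socket pair that nobody writes to. Returns 1 if the
 * call timed out with EAGAIN/EWOULDBLOCK, 0 if it ended any other way,
 * and -1 if the setup failed. */
int
demo_recv_timeout(void)
{
    int fds[2];
    char buf[16];
    ssize_t n;
    int timed_out;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;
    if (set_recv_timeout(fds[0], 100) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n = recv(fds[0], buf, sizeof buf, 0);   /* returns after ~100 ms */
    timed_out = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
    close(fds[0]);
    close(fds[1]);
    return timed_out;
}
```

With a timeout in place, the read loop can check a "stop requested" flag each time recv() comes back with EAGAIN or EWOULDBLOCK, and the signal handler only has to set that flag.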
Is the signal handler invoked in the same thread that waits in recv()? You may need to explicitly mask SIGINT in all other threads via pthread_sigmask().
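As a rough sketch of that masking, each thread that should not receive SIGINT could call something like the helper below, while the thread blocked in recv() leaves the signal unblocked. The function names are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Hypothetical helper: blocks `signum` in the calling thread only.
 * Threads created afterwards by this thread inherit the mask.
 * Returns 0 on success or an error number, like pthread_sigmask(). */
int
block_in_this_thread(int signum)
{
    sigset_t ss;

    assert(signum > 0);
    sigemptyset(&ss);
    sigaddset(&ss, signum);
    return pthread_sigmask(SIG_BLOCK, &ss, NULL);
}

/* Returns 1 if `signum` is blocked in the calling thread, 0 if it is
 * not, and -1 if the mask cannot be read. */
int
is_blocked_in_this_thread(int signum)
{
    sigset_t cur;

    /* With a NULL new set, the mask is only queried, not changed. */
    if (pthread_sigmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, signum);
}
```

A process-directed SIGINT (for example Ctrl-C from the terminal) is then delivered only to a thread that still has it unblocked.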
In a multi-threaded application, normal signals can be delivered to any thread arbitrarily. Use pthread_kill to send the signal to the specific thread of interest.
Normally signals do not interrupt system calls with EINTR. Historically there were two possible signal delivery behaviors: the BSD behavior (syscalls are automatically restarted when interrupted by a signal) and the Unix System V behavior (syscalls return -1 with errno set to EINTR when interrupted by a signal). Linux (the kernel) adopted the latter, but the GNU C library developers (correctly) deemed the BSD behavior to be much more sane, and so on modern Linux systems, calling signal (which is a library function) results in the BSD behavior.
POSIX allows either behavior, so it's advisable to always use sigaction where you can choose to set the SA_RESTART flag or omit it depending on the behavior you want. See the documentation for sigaction here: http://www.opengroup.org/onlinepubs/9699919799/functions/sigaction.html
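Putting both points together, here is a minimal sketch under some assumptions: it uses a local socket pair instead of the TCP connection from the question, and demo_interrupt_recv, reader and on_sigusr1 are names chosen for this example. The handler is installed with sigaction and no SA_RESTART, and pthread_kill targets the reading thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static atomic_int reader_done;  /* set by the reader once recv() returned */
static int reader_errno;        /* errno observed right after recv()      */

static void
on_sigusr1(int signum)
{
    (void)signum;   /* nothing to do: delivery alone interrupts recv() */
}

static void *
reader(void *arg)
{
    char buf[16];
    ssize_t n;

    assert(arg != NULL);
    n = recv(*(int *)arg, buf, sizeof buf, 0);  /* blocks: no data is sent */
    reader_errno = (n < 0) ? errno : 0;
    atomic_store(&reader_done, 1);
    return NULL;
}

/* Returns 1 if the blocked recv() failed with EINTR, 0 if it ended any
 * other way, and -1 if the setup failed. */
int
demo_interrupt_recv(void)
{
    struct sigaction act;
    struct timespec pause_10ms = { 0, 10 * 1000 * 1000 };
    pthread_t t;
    int fds[2];

    memset(&act, 0, sizeof act);
    act.sa_handler = on_sigusr1;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;               /* SA_RESTART deliberately omitted */
    if (sigaction(SIGUSR1, &act, NULL) != 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;
    atomic_store(&reader_done, 0);
    if (pthread_create(&t, NULL, reader, &fds[0]) != 0)
        return -1;

    /* Re-send until the reader reports that recv() returned; a signal
     * delivered before the thread enters recv() interrupts nothing. */
    while (!atomic_load(&reader_done)) {
        pthread_kill(t, SIGUSR1);
        nanosleep(&pause_10ms, NULL);
    }
    pthread_join(t, NULL);
    close(fds[0]);
    close(fds[1]);
    return reader_errno == EINTR;
}
```

Setting act.sa_flags to SA_RESTART instead would give the behavior described in the question: the handler runs, but recv() is restarted and the thread stays blocked.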