RPC can't decode arguments for TCP transport

Posted 2019-02-24 07:09

Question:

I'm working on a multithreaded RPC server based on the example from this page: http://bderzhavets.blogspot.ca/2005/11/multithreaded-rpc-server-in-white-box.html

Unfortunately it didn't quite work out of the box, and after chasing the errors for some time, I've found that the server is failing to decode the arguments (based on the return code from squareproc_2). The execution on the server side seems to stop after the call to squareproc_2_svc in the function serv_request. See case: SQUAREPROC in the code below from square_svc.c

void *serv_request(void *data)
{
    struct thr_data *ptr_data = (struct thr_data *)data;
    {
        square_in argument;
        square_out result;
        bool_t retval;
        xdrproc_t _xdr_argument, _xdr_result;
        bool_t (*local)(char *, void *, struct svc_req *);
        struct svc_req *rqstp = ptr_data->rqstp;
        register SVCXPRT *transp = ptr_data->transp;
        switch (rqstp->rq_proc) {
            case NULLPROC:
                printf("NULLPROC called\n");
                (void) svc_sendreply (transp, (xdrproc_t) xdr_void, (char *)NULL);
                return;
            case SQUAREPROC:
                _xdr_argument = (xdrproc_t) xdr_square_in;
                _xdr_result = (xdrproc_t) xdr_square_out;
                printf("_xdr_result = %ld\n",_xdr_result);
                local = (bool_t (*) (char *, void *,  struct svc_req *))squareproc_2_svc;
                break;
            default:
                printf("default case executed");
                svcerr_noproc (transp);
                return;
        }
        memset ((void *)&argument, 0, sizeof (argument));
        if (!svc_getargs (transp, (xdrproc_t) _xdr_argument, (caddr_t) &argument)) {
            printf("svc_getargs failed");
            svcerr_decode (transp);
            return;
        }
        retval = (bool_t) (*local)((char *)&argument, (void *)&result, rqstp);
        printf("serv_request result: %d\n",retval);
        if (retval > 0 && !svc_sendreply(transp, (xdrproc_t) _xdr_result, (char *)&result))
        {
            printf("something happened...\n");
            svcerr_systemerr (transp);
        }
        if (!svc_freeargs (transp, (xdrproc_t) _xdr_argument, (caddr_t) &argument)) {
            fprintf (stderr, "%s", "unable to free arguments");
            exit (1);
        }
        if (!square_prog_2_freeresult (transp, _xdr_result, (caddr_t) &result))
            fprintf (stderr, "%s", "unable to free results");
        return;
    }
}

Here is the implementation of squareproc_2_svc from the file square_server.c:

bool_t squareproc_2_svc(square_in *inp,square_out *outp,struct svc_req *rqstp)
{
    printf("Thread id = '%ld' started, arg = %ld\n",pthread_self(),inp->arg1);
    sleep(5);
    outp->res1=inp->arg1*inp->arg1;
    printf("Thread id = '%ld' is done %ld \n",pthread_self(),outp->res1);
    return(TRUE);
}

Client side output:

yak@AcerPC:~/RPC/multithread_example$ ./ClientSQUARE localhost 2
squareproc_2 called
xdr_square_in result: 1
function call failed; code: 11

Server side output:

yak@AcerPC:~/RPC/multithread_example$ sudo ./ServerSQUARE 
creating threads
SQUAREPROC called
xdr_square_in result: 0

As you can see, xdr_square_in returns FALSE on the server side. Here is square.x:

struct square_in {
    long arg1;
};

struct square_out {
    long res1;
};

program SQUARE_PROG {
    version SQUARE_VERS {
        square_out SQUAREPROC(square_in) = 1;
    } = 2 ;
} = 0x31230000;

and square_xdr.c

/*
 * Please do not edit this file.
 * It was generated using rpcgen.
 */

#include "square.h"

bool_t
xdr_square_in (XDR *xdrs, square_in *objp)
{
    register int32_t *buf;
    int retval;
    if (!xdr_long (xdrs, &objp->arg1)) retval = FALSE;
    else retval = TRUE;
    printf("xdr_square_in result: %d\n",retval);
    return retval;
}

bool_t
xdr_square_out (XDR *xdrs, square_out *objp)
{
    register int32_t *buf;
    int retval;
    if (!xdr_long (xdrs, &objp->res1)) retval = FALSE;
    else retval = TRUE;
    printf("xdr_square_out result: %d\n",retval);
    return retval;
}

I'm working in Ubuntu 14.04 LTS, generating stubs and xdr code with rpcgen -a -M, and compiling with gcc.

The error only seems to occur when using TCP as the transport method. I can get results using UDP as the transport, but some calls fail when requests from multiple clients arrive simultaneously. I would like to be able to support up to 15 clients. When I tried using UDP and 10 clients, 2 of the 10 calls failed with a different return code from squareproc_2.

Answer 1:

You've got a few issues.

In the code from the linked blog page, square_prog_2 calls pthread_attr_setdetachstate before it does the pthread_create, but it needs to call pthread_attr_init before that. Also, attr appears to be static/global; put it in the function's stack frame instead.
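A minimal sketch of the ordering described above, assuming the worker should run detached. The names spawn_detached and post_when_done are hypothetical and do not come from the original example; only the pthread and semaphore calls are standard POSIX.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical helper: start fn(arg) on a detached thread.
 * Returns 0 on success, or a pthread error number. */
int spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;   /* lives in this stack frame, not static/global */
    pthread_t tid;
    int rc;

    assert(fn != NULL);

    rc = pthread_attr_init(&attr);   /* must precede any pthread_attr_set* call */
    if (rc != 0)
        return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);     /* attr is no longer needed after this point */
    return rc;
}

/* Hypothetical worker, used only to demonstrate the helper:
 * it signals the semaphore passed in arg and exits. */
void *post_when_done(void *arg)
{
    sem_post((sem_t *)arg);
    return NULL;
}
```

A caller would initialise a sem_t with sem_init, pass its address to spawn_detached(post_when_done, &sem), and then sem_wait on it, since a detached thread cannot be joined.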

square_prog_2 gets two args: rqstp and transp. These get saved into a malloc'ed struct data_str [so each thread has its own copy]. But I wonder what the rqstp and transp values are (e.g. print them with printf("%p")). They need to be different for each request, or the threads will collide with each other when trying to use them [thus needing pthread_mutex_lock]. The malloc doesn't clone the objects that rqstp/transp point to, so if the pointers are the same, that's the issue: you may have two threads trying to riff on the same buffers simultaneously.
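To illustrate the shallow-copy concern, here is a small sketch. It assumes a per-thread struct shaped like the question's thr_data, with void * standing in for struct svc_req * and SVCXPRT * so that it compiles without the RPC headers. save_request and requests_collide are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the per-thread struct. The real one holds
 * struct svc_req * and SVCXPRT *; void * is used here as an assumption
 * so the sketch does not depend on the RPC headers. */
struct thr_data {
    void *rqstp;
    void *transp;
};

/* Allocates a fresh struct per thread, but copies only the pointers,
 * not the objects they point to. */
struct thr_data *save_request(void *rqstp, void *transp)
{
    struct thr_data *d = malloc(sizeof *d);
    if (d != NULL) {
        d->rqstp = rqstp;
        d->transp = transp;
    }
    return d;
}

/* Returns 1 if two saved requests alias the same underlying objects. */
int requests_collide(const struct thr_data *a, const struct thr_data *b)
{
    assert(a != NULL && b != NULL);
    return a->rqstp == b->rqstp || a->transp == b->transp;
}
```

If a check like this reports a collision for two requests handed to different threads, each thread has its own struct but both still operate on the same request and transport objects.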

There is a return code of 11. Barring some special code, that looks suspiciously like SIGSEGV on a thread. This would be completely accounted for by the rqstp/transp overlap.

You may need to rearchitect this as I suspect XDR is not thread safe--nor should it need to be. Also, I don't think svc_* is thread safe/aware.

Start single threaded. As a test, have square_prog_2 call serv_request directly (e.g. do not do pthread_*). I bet that works in all modes.

If so, hold onto your hat--the example code using threads is broken--full of race conditions and will segfault, etc. If you're not hung up on using threads (no need for such a light duty task as x * x), you can just enjoy as is.

Otherwise, the solution is a bit more sophisticated. The main thread must do all the access to the socket and all XDR parsing/encoding. It can't use svc_run--you have to roll your own. The child can only do the actual work (e.g. x * x) and may not touch the socket/req/transp, etc.

Main thread:

while (1) {
    if (svc_getreq_poll()) {
        // parse XDR
        // create data/return struct for child thread
        // create thread
        // add struct to list of "in-flight" requests
    }

    forall struct in inflight {
        if (reqdone) {
            // take result from struct
            // encode into XDR
            // do send_reply
            // remove struct from list
        }
    }
}

For the child struct it would look like:

struct child_struct {
    int num;
    int num_squared;
};

And the child's thread function becomes a one-liner: ptr->num_squared = ptr->num * ptr->num;
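The division of labour above can be sketched without any RPC calls. In this simplified version, an array of ints stands in for the decoded requests, another array stands in for the replies, and pthread_join stands in for the reqdone check. serve_batch, child_main, MAX_INFLIGHT, and the tid field are assumptions added for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_INFLIGHT 16   /* assumed cap, enough for the 15 clients mentioned */

struct child_struct {
    int num;
    int num_squared;
    pthread_t tid;        /* added for the sketch so the main thread can join */
};

/* The child does only the computation; it never touches a socket. */
static void *child_main(void *arg)
{
    struct child_struct *ptr = arg;
    ptr->num_squared = ptr->num * ptr->num;
    return NULL;
}

/* Main-thread side: hand plain data to workers, then collect the results.
 * Returns the number of replies produced, or -1 on error. */
int serve_batch(const int *inputs, int *outputs, size_t n)
{
    struct child_struct inflight[MAX_INFLIGHT];
    size_t started, i;
    int rc = 0;

    assert(n == 0 || (inputs != NULL && outputs != NULL));
    if (n > MAX_INFLIGHT)
        return -1;

    for (started = 0; started < n; started++) {
        inflight[started].num = inputs[started];   /* "parsed" argument */
        inflight[started].num_squared = 0;
        if (pthread_create(&inflight[started].tid, NULL,
                           child_main, &inflight[started]) != 0) {
            rc = -1;
            break;
        }
    }

    /* Only the main thread produces replies. Every started worker is
     * joined, even after a failure, because it points into inflight[]. */
    for (i = 0; i < started; i++) {
        pthread_join(inflight[i].tid, NULL);
        outputs[i] = inflight[i].num_squared;
    }
    return rc == 0 ? (int)started : -1;
}
```

A real server would interleave accepting new requests with collecting finished ones rather than joining in order, but the rule is the same: workers see only child_struct, never the transport.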

UPDATE: Multithread RPC servers appear to not be supported under Linux or FreeBSD

Here's a document with a cleaner example to start from: https://www.redhat.com/archives/redhat-list/2004-June/msg00439.html

From that: Remember -A option of rpcgen is not supported under Linux. Library calls providing by SunOS RPC to build Multithreaded RPC Server are unavailable under Linux as well

The Linux rpcgen man page (http://linux.die.net/man/1/rpcgen) makes no mention of -M. IMO, this means the rpcgen program has the option and does generate the stubs, but the underlying support is not there, so they left it out of the doc.

Here's the FreeBSD man page [and the reason why there's no support]: http://www.freebsd.org/cgi/man.cgi?query=rpcgen&sektion=1&manpath=FreeBSD+5.0-RELEASE See the doc for -M within this:

-M -- Generate multithread-safe stubs for passing arguments and results between rpcgen generated code and user written code. This option is useful for users who want to use threads in their code. However, the rpc_svc_calls(3) functions are not yet MT-safe, which means that rpcgen generated server-side code will not be MT-safe.

An alternate way:

Why bother with RPC/XDR at all? The overhead is huge for the large arrays you intend to use. Most of the standard uses are for things like yellow pages, that don't have much data.

Most systems are little endian these days. Just blast the native buffer to a socket that you open directly. On the server, have a daemon do a listen, then fork a child and have the child do the accept, read in the data, do the calculations, and send back the reply. At worst, the child will need to do an endian swap but that's easily done in a tight loop using bswap_32.
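The endian swap mentioned above can be sketched as follows, assuming a payload of 32-bit words and glibc's <byteswap.h>, which provides bswap_32. The helper name swap_words is hypothetical.

```c
#include <assert.h>
#include <byteswap.h>   /* glibc-specific header providing bswap_32 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of each 32-bit word in place. */
void swap_words(uint32_t *buf, size_t count)
{
    size_t i;

    assert(buf != NULL || count == 0);
    for (i = 0; i < count; i++)
        buf[i] = bswap_32(buf[i]);
}
```

The receiver would call this only when the sender's byte order differs from its own, which it could learn from a field in the message header.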

Put a simple little control struct at the beginning of each message, in either direction, as a prefix to the data payload:

struct msgcontrol {
    int what_i_am;
    int operation_to_perform;
    int payload_length;
    int payload[0];
};
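One possible way to build such a message is sketched below. It assumes payload_length counts ints rather than bytes, and it uses the C99 flexible array member spelling (payload[]) in place of the zero-length array. msg_build and msg_wire_size are hypothetical helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct msgcontrol {
    int what_i_am;
    int operation_to_perform;
    int payload_length;   /* assumed here to be a count of ints, not bytes */
    int payload[];        /* C99 flexible array member */
};

/* Hypothetical helper: allocate header and payload as one contiguous
 * block so the whole message can go out in a single write. The caller
 * frees the result. */
struct msgcontrol *msg_build(int who, int op, const int *data, int count)
{
    struct msgcontrol *m;

    assert(count >= 0);
    m = malloc(sizeof *m + (size_t)count * sizeof m->payload[0]);
    if (m == NULL)
        return NULL;

    m->what_i_am = who;
    m->operation_to_perform = op;
    m->payload_length = count;
    if (count > 0)
        memcpy(m->payload, data, (size_t)count * sizeof m->payload[0]);
    return m;
}

/* Total number of bytes to write to the socket for this message. */
size_t msg_wire_size(const struct msgcontrol *m)
{
    return sizeof *m + (size_t)m->payload_length * sizeof m->payload[0];
}
```

The receiver would first read sizeof(struct msgcontrol) bytes, inspect payload_length, and then read exactly that much payload.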

A special note: I've done this commercially before (e.g. with MPI and with roll-my-own code), and you may have to issue setsockopt calls to increase the size of the kernel socket buffers to something large enough to sustain a barrage of data.
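A minimal sketch of those setsockopt calls, assuming an already-created socket descriptor. set_socket_buffers and get_rcvbuf are hypothetical helper names; the right buffer size depends on the workload.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: request send and receive buffers of `bytes` each.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int set_socket_buffers(int fd, int bytes)
{
    assert(bytes > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) != 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof bytes) != 0)
        return -1;
    return 0;
}

/* Reads back the receive buffer size the kernel actually granted,
 * or -1 on failure. */
int get_rcvbuf(int fd)
{
    int val = 0;
    socklen_t len = sizeof val;

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, &len) != 0)
        return -1;
    return val;
}
```

The kernel may adjust or cap the requested size (on Linux the limits are net.core.rmem_max and net.core.wmem_max), so it is worth reading the value back with getsockopt rather than assuming the request was honoured.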

Actually, now that I think of it, if you don't want to roll your own, MPI may be of interest. However, having used it, I'm not a true fan. It had unexpected problems and we had to remove it in favor of controlling our sockets directly.