Unix Domain : connect() : No such file or directory

Posted 2019-02-16 20:49

Question:

As stated in the title, my connect() call on a Unix domain socket, with the corresponding address, fails with ENOENT: no such file or directory.

The two sockets are properly initialized, and the socket files are created and bound accordingly. The server and client sockets run in different processes; the client process is fork()-ed and execl()-ed. This is also how I parse the address for the client and server socket, which I use for setting up the client socket. The server process uses pthreads.

Here is my connect() attempt:

    struct sockaddr_un address;
    address.sun_family = AF_UNIX;
    memcpy(address.sun_path, filepath.c_str(), filepath.length());
    address.sun_path[filepath.length()] = '\0';

    if(-1 == connect(this->unix_domain_descriptor_.descriptor(),
                     (struct sockaddr*)&address,
                     size))
    {
        global::ExitDebug(-1, "connect() failed", __FILE__, __LINE__);
        return -1;
    }

I tried out different values for size, such as:

    //  This is from the unix(7) man page. It works neither with nor without "+1".
    socklen_t size =  offsetof(struct sockaddr_un, sun_path);
              size += strlen(address.sun_path) + 1;

    //  This is from one of my books about Linux programming.
    socklen_t size = sizeof(address);

    //  This is from sample code I found on the internet.
    socklen_t size = sizeof(address.sun_family) + strlen(address.sun_path);

    //  Update 1:
    socklen_t size = SUN_LEN(&address);

    //  This is what I tried after looking at the declaration
    //  of struct sockaddr_un.
    socklen_t size = strlen(address.sun_path);

Surprisingly, all initializations except the last one result in an EINVAL: invalid argument error for connect() and I get the ENOENT: no such file or directory only with the last one. I even tried out entire examples from the internet, but without success. And obviously, swapping socklen_t with size_t or int doesn't change anything.

I already checked for this:

  • address.sun_path contains the correct socket file path, starting from the root directory
  • address.sun_path is 61 characters long
  • address.sun_family is set to AF_UNIX/AF_LOCAL
  • address.sun_family has a size of 2 bytes
  • no errors occur when creating and binding both sockets
  • the server socket is in the listening state
  • sizeof(address) returns 110, as it should
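
Since ENOENT comes from the path lookup, the first item can be double-checked against the exact bytes stored in address.sun_path. A minimal sketch, assuming a hypothetical helper named is_socket_file (not part of the original code) built on stat():

```cpp
#include <string>
#include <sys/stat.h>   // stat, S_ISSOCK

// Hypothetical diagnostic helper (not from the original post): returns true
// only if `path` resolves to an existing socket file.
bool is_socket_file(const std::string& path)
{
    struct stat info;
    if (stat(path.c_str(), &info) != 0) {
        return false;   // lookup failed (for example ENOENT); errno holds the reason
    }
    return S_ISSOCK(info.st_mode) != 0;
}
```

Calling is_socket_file(address.sun_path) right before connect() shows whether the path, as actually stored in the structure, resolves to a socket.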

Now I am wondering why the man page example does not work, and whether there have been changes that were not reflected on linux.die.net or www.kernel.org. My OS is Debian Squeeze, if that's relevant.

Any ideas what I am doing wrong? And how to solve it? If you need more code or have questions, then don't hesitate to ask me (though I don't need to state this probably, but this is my first post here >.<).

By the way, sorry for my bad English.

Update 2

Solved. I will post the solution in a separate answer below for clarity.

Answer 1:

After figuring out that I was handling the sockets properly, I altered my code for connect() a little and now it works. I just added this line after the declaration of my variable:

    memset(&address, 0, sizeof(struct sockaddr_un));

Does anyone know why I need to set the whole variable to 0 to make it work? Should I ask this in a new topic or can I ask this here?
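
The thread does not establish why zeroing made the difference here. As a general habit it is still reasonable: a local struct starts with indeterminate contents, and some platforms (the BSDs, for instance) carry an extra sun_len field. A minimal sketch of filling the address that way, assuming a hypothetical helper named make_unix_address that is not part of the original code:

```cpp
#include <cstddef>      // offsetof
#include <cstring>      // std::memset, std::memcpy
#include <string>
#include <sys/socket.h> // socklen_t, AF_UNIX
#include <sys/un.h>     // sockaddr_un

// Hypothetical helper (not from the original post): fills `out` for a
// pathname socket and reports the length to hand to connect() or bind().
// Returns false when the path is empty or does not fit in sun_path.
bool make_unix_address(const std::string& path, sockaddr_un& out, socklen_t& len)
{
    // Zero the whole structure first, as in the fix above, so no field or
    // trailing byte of sun_path holds leftover stack contents.
    std::memset(&out, 0, sizeof(out));

    // Leave room for the terminating '\0'.
    if (path.empty() || path.size() >= sizeof(out.sun_path)) {
        return false;
    }

    out.sun_family = AF_UNIX;
    std::memcpy(out.sun_path, path.data(), path.size());

    // Length as computed in the unix(7) example: header + path + '\0'.
    len = static_cast<socklen_t>(offsetof(sockaddr_un, sun_path) + path.size() + 1);
    return true;
}
```

The resulting `out` and `len` would then be passed as the second and third arguments of connect(), with `out` cast to `struct sockaddr*` as in the question.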



Answer 2:

Quoting from the glibc manual:

You should compute the LENGTH parameter for a socket address in the local namespace as the sum of the size of the sun_family component and the string length (not the allocation size!) of the file name string. This can be done using the macro SUN_LEN:

  • Macro: int SUN_LEN (struct sockaddr_un * PTR)
    The macro computes the length of socket address in the local namespace.

The example that follows in the manual uses the computation you say fails for you:

    size = (offsetof (struct sockaddr_un, sun_path)
           + strlen (name.sun_path) + 1);

But you should try that macro. If something changed, or the example is wrong, there is still a good chance that the macro works as intended. If it does, you can look at its innards. At first glance, it seems to me that the macro lacks the + 1 part used in all the examples, which matches the manual's warning to use “not the allocation size!” Since your post says this didn't work without + 1 either, the chances are slim, though.
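
To see how the candidate lengths from the question relate to each other, they can be computed side by side. This is only a sketch: CandidateLengths and candidate_lengths are hypothetical names, and the SUN_LEN fallback is an assumption for headers that do not expose the macro.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <cstring>      // std::strlen
#include <sys/socket.h>
#include <sys/un.h>     // sockaddr_un, SUN_LEN where the platform provides it

// Fallback in case the headers do not expose SUN_LEN; it mirrors the
// "offset of sun_path plus string length" computation described above.
#ifndef SUN_LEN
#define SUN_LEN(ptr) (offsetof(sockaddr_un, sun_path) + std::strlen((ptr)->sun_path))
#endif

// Illustrative container (hypothetical name) for the lengths tried in the question.
struct CandidateLengths {
    std::size_t man_page;      // offsetof + strlen + 1, as in unix(7)
    std::size_t sun_len;       // SUN_LEN: terminating '\0' not counted
    std::size_t whole_struct;  // sizeof(address)
    std::size_t path_only;     // strlen(sun_path): omits the sun_family header
};

// Expects address.sun_path to be NUL-terminated.
CandidateLengths candidate_lengths(const sockaddr_un& address)
{
    const std::size_t path_len = std::strlen(address.sun_path);
    CandidateLengths lengths;
    lengths.man_page = offsetof(sockaddr_un, sun_path) + path_len + 1;
    lengths.sun_len = SUN_LEN(&address);
    lengths.whole_struct = sizeof(address);
    lengths.path_only = path_len;
    return lengths;
}
```

With the values reported in the question (a 61-character path, a 2-byte sun_family, sizeof(address) of 110), and assuming sun_path starts directly after sun_family, these come to 64, 63, 110 and 61. The last value does not cover the header bytes, so the kernel may see a path that is cut short by that amount; this would be consistent with the ENOENT in the question, although the thread does not confirm it.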

Out of curiosity, what's the length of the path? Have you checked whether the field provided in the structure is large enough to hold it? What is sizeof(address.sun_path) in your implementation? I wonder whether you might be copying to unreserved memory, and part of the path gets overwritten at the next function call.
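
A quick way to answer those questions is to compare the path length with the capacity of sun_path before copying. A minimal sketch, where kSunPathCapacity and path_fits are hypothetical names introduced for illustration:

```cpp
#include <cstddef>   // std::size_t
#include <string>
#include <sys/un.h>  // sockaddr_un

// Capacity of sun_path on this platform, including the terminating '\0'.
constexpr std::size_t kSunPathCapacity = sizeof(sockaddr_un::sun_path);

// True if `path` plus its terminating '\0' fits into sun_path.
bool path_fits(const std::string& path)
{
    return path.size() < kSunPathCapacity;
}
```

With the numbers from the question (sizeof(address) of 110 and a 2-byte sun_family), sun_path would hold 108 bytes if the structure contains nothing else, so a 61-character path plus its terminator fits and an overflow seems unlikely to be the cause.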