Unix domain socket

Summary

In client-server computing, a Unix domain socket is a Berkeley socket that allows data to be exchanged between two processes executing on the same Unix or Unix-like host computer.[1] This is similar to an Internet domain socket that allows data to be exchanged between two processes executing on different host computers.

Regardless of the range of communication (same host or different hosts),[2] Unix computer programs that perform socket communication are similar. The only difference is how a name is converted to the address parameter needed to bind the socket. For a Unix domain socket, the name is a /path/filename. For an Internet domain socket, the name is an IP address:Port number. In either case, the name is called an address.[3]

Two processes may communicate with each other if each obtains a socket. The server process binds its socket to an address, opens a listen channel, and then continuously loops. Inside the loop, the server process is put to sleep while waiting to accept a client connection.[4] Upon accepting a client connection, the server executes a read system call, which blocks until data arrive. The client connects to the server's socket via the server's address. The client process then writes a message for the server process to read. The application's algorithm may entail multiple read/write interactions. Upon completion of the algorithm, the client executes exit()[5] and the server executes close().[6]

For a Unix domain socket, the socket's address is a /path/filename identifier. When the server binds the socket, /path/filename is created on the filesystem as a special socket file. The file acts much like a lock file: it marks the address as in use and gives clients a name to connect to. No I/O occurs on this file when the client and server send messages to each other.[7]

History

Sockets first appeared in Berkeley Software Distribution 4.2 (1983).[8] They became a POSIX standard in 2000.[8] The application programming interface has been ported to virtually every Unix implementation and most other operating systems.[8]

Socket instantiation

Both the server and the client must instantiate a socket object by executing the socket() system call. Its usage is:[9]

int socket( int domain, int type, int protocol );

The domain parameter should be one of the following common ranges of communication:[10]

  1. Within the same host by using the constant AF_UNIX[a]
  2. Between two hosts via the IPv4 protocol by using the constant AF_INET
  3. Between two hosts via the IPv6 protocol by using the constant AF_INET6
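
SOCK_SEQPACKET is sometimes listed with these constants, but it is a value for the type parameter rather than the domain parameter. It requests a sequenced-packet socket, which may be used within the same host (AF_UNIX) or between two hosts via the Stream Control Transmission Protocol.[11]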

The Unix domain socket label is used when the domain parameter's value is AF_UNIX. The Internet domain socket label is used when the domain parameter's value is either AF_INET or AF_INET6.[12]

The type parameter should be one of two common socket types: stream or datagram.[10] A third socket type is available for experimental design: raw.

  1. SOCK_STREAM will create a stream socket. A stream socket provides a reliable, bidirectional, and connection-oriented communication channel between two processes. For an Internet domain socket, data are carried using the Transmission Control Protocol (TCP).[10]
  2. SOCK_DGRAM will create a datagram socket.[b] A datagram socket is connectionless and, in the Internet domain, does not guarantee reliability; in exchange, it carries less per-message overhead. For an Internet domain socket, data are carried using the User Datagram Protocol (UDP).[14]
  3. SOCK_RAW will create an Internet Protocol (IP) datagram socket. A raw socket skips the TCP/UDP transport layer and sends the packets directly to the network layer.[15]

For a Unix domain socket, data are passed between the two processes entirely within the kernel, so no network protocol such as TCP or UDP is involved; the stream and datagram types describe only the semantics of the channel. On most Unix implementations, Unix domain datagram sockets are reliable and do not reorder datagrams.[11] For an Internet domain socket, data are passed between two connected processes via the transport layer and the Internet Protocol (IP) of the network layer: either TCP/IP or UDP/IP.[16]

The protocol parameter should be set to zero for stream and datagram sockets.[2] For raw sockets, the protocol parameter should be set to IPPROTO_RAW.[9]
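The practical difference between the two common types can be observed on a single host without a server. The following is a minimal sketch, not a definitive test: first_read_size() is a hypothetical helper, and socketpair() is used to obtain two already-connected Unix domain sockets. It writes two 2-byte messages and reports how many bytes the first read() returns.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends "ab" and then "cd" over a connected       */
/* AF_UNIX pair of the given type (SOCK_STREAM or SOCK_DGRAM).          */
/* Returns the byte count of the first read(), or -1 on error.          */
ssize_t first_read_size( int type )
{
    int pair[ 2 ];
    char buffer[ 16 ];
    ssize_t count = -1;

    /* The protocol argument is zero, as for socket(). */
    if ( socketpair( AF_UNIX, type, 0, pair ) == -1 ) return -1;

    if ( write( pair[ 0 ], "ab", 2 ) == 2
         && write( pair[ 0 ], "cd", 2 ) == 2 )
    {
        count = read( pair[ 1 ], buffer, sizeof( buffer ) );
    }

    close( pair[ 0 ] );
    close( pair[ 1 ] );
    return count;
}
```

With SOCK_DGRAM the first read is expected to return 2, because each datagram keeps its boundary. With SOCK_STREAM the two writes form one byte stream, so the first read typically returns all 4 bytes, although a stream read is always allowed to return fewer.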

socket() return value

int socket_fd = socket( domain, type, protocol );

Like the regular-file open() system call, the socket() system call returns a file descriptor.[2][c] The return value's suffix _fd stands for file descriptor.

Server bind to /path/filename

After instantiating a new socket, the server binds the socket to an address. For a Unix domain socket, the address is a /path/filename.

Because the socket address may be either a /path/filename or an IP_address:Port_number, the socket application programming interface requires the address to first be set into a structure. For a Unix domain socket, the structure is:[17]

struct sockaddr_un {
    sa_family_t sun_family; /* AF_UNIX */
    char sun_path[ 92 ];    /* Size varies by system; 108 on Linux. */
};

The _un suffix stands for unix. For an Internet domain socket, the suffix will be either _in or _in6. The sun_ prefix stands for socket unix.[17]
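Since the size of sun_path differs between systems, a program can check the length of a path before copying it into the structure. The helper below is a hedged sketch; fill_unix_address() is a hypothetical name, and reporting the problem through ENAMETOOLONG is a design choice made for this example.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: prepares address for bind() or connect().       */
/* Returns 0 on success. Returns -1 and sets errno to ENAMETOOLONG if   */
/* path, with its terminating null, does not fit in sun_path.           */
int fill_unix_address( struct sockaddr_un *address, const char *path )
{
    if ( strlen( path ) >= sizeof( address->sun_path ) )
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    memset( address, 0, sizeof( *address ) );
    address->sun_family = AF_UNIX;
    strcpy( address->sun_path, path );   /* Safe: length checked above. */
    return 0;
}
```

Using sizeof( address->sun_path ) instead of a hard-coded number lets the same source compile correctly on systems with different array sizes.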

Computer program to create and bind a stream Unix domain socket:[7]

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>

/* Should be 91 characters or less. Some Unix-like systems allow slightly more. */
/* Use /tmp directory for demonstration only. */
const char *socket_address = "/tmp/mysocket.sock";

int main( void )
{
    int server_socket_fd;
    struct sockaddr_un sockaddr_un = {0};
    int return_value;

    server_socket_fd = socket( AF_UNIX, SOCK_STREAM, 0 );
    if ( server_socket_fd == -1 ) assert( 0 );

    /* Remove the socket file left by a prior run, if there is one. */
    remove( socket_address );

    /* Construct the bind address structure. */
    /* The structure was zeroed, so copying at most size - 1 bytes */
    /* leaves sun_path null-terminated.                            */
    sockaddr_un.sun_family = AF_UNIX;
    strncpy( sockaddr_un.sun_path,
             socket_address,
             sizeof( sockaddr_un.sun_path ) - 1 );

    return_value =
        bind(
            server_socket_fd,
            (struct sockaddr *) &sockaddr_un,
            sizeof( struct sockaddr_un ) );

    /* If socket_address exists on the filesystem, then bind will fail. */
    if ( return_value == -1 ) assert( 0 );

    /* Listen and accept code omitted. */

    return EXIT_SUCCESS;
}

The second parameter of bind() is declared as a pointer to struct sockaddr. However, the argument passed to the function is the address of a struct sockaddr_un. struct sockaddr is a generic structure that programs do not fill in directly; it appears in the formal parameter declaration so that a single bind() can accept the address structure of any domain. Because each range of communication has its own address structure, the actual argument is cast to this generic type.[18]
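The cast works because every domain-specific address structure carries an address-family field that can be read through the generic type. The sketch below illustrates this; address_family_name() is a hypothetical helper and the returned strings are chosen only for the example.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: inspects an address through the generic         */
/* struct sockaddr pointer and names the domain it belongs to.          */
const char *address_family_name( const struct sockaddr *address )
{
    switch ( address->sa_family )
    {
        case AF_UNIX:  return "AF_UNIX";
        case AF_INET:  return "AF_INET";
        case AF_INET6: return "AF_INET6";
        default:       return "other";
    }
}
```

A caller passes (struct sockaddr *) &sockaddr_un, exactly as in the bind() call, and the function recovers the domain from sa_family without knowing the concrete structure in advance.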

Server listen for a connection

After binding to an address, the server opens a listen channel on that address by executing listen(), which marks the socket as one that accepts incoming connections. Its usage is:[19]

int listen( int server_socket_fd, int backlog );

Snippet to listen:

if ( listen( server_socket_fd, 4096 ) == -1 ) assert( 0 );

For a Unix domain socket, listen() most likely will succeed and return 0. For an Internet domain socket, if the port is in use, listen() returns -1.[19]

The backlog parameter sets the queue size for pending connections.[20] The server may be busy when a client executes a connect() request. Connection requests up to this limit will succeed. If the backlog value passed in exceeds the default maximum, then the maximum value is used.[19]
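The effect of the pending-connection queue can be seen by connecting to a listening socket that never calls accept(). This is a sketch under stated assumptions: connect_before_accept() is a hypothetical helper, and both ends live in one process purely for demonstration.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: listens on path with the given backlog, then    */
/* connects one client without ever calling accept().                   */
/* Returns the result of connect(), or -1 if the setup fails.           */
int connect_before_accept( const char *path, int backlog )
{
    struct sockaddr_un address;
    int listen_fd;
    int client_fd;
    int result = -1;

    if ( strlen( path ) >= sizeof( address.sun_path ) ) return -1;

    memset( &address, 0, sizeof( address ) );
    address.sun_family = AF_UNIX;
    strcpy( address.sun_path, path );
    unlink( path );   /* Remove a leftover from a prior run, if any. */

    listen_fd = socket( AF_UNIX, SOCK_STREAM, 0 );
    client_fd = socket( AF_UNIX, SOCK_STREAM, 0 );

    if ( listen_fd != -1 && client_fd != -1
         && bind( listen_fd, (struct sockaddr *) &address,
                  sizeof( address ) ) == 0
         && listen( listen_fd, backlog ) == 0 )
    {
        /* The request waits in the queue until accept() is called. */
        result = connect( client_fd, (struct sockaddr *) &address,
                          sizeof( address ) );
    }

    if ( listen_fd != -1 ) close( listen_fd );
    if ( client_fd != -1 ) close( client_fd );
    unlink( path );
    return result;
}
```

With a small backlog such as 4 and a single client, connect() is expected to return 0 even though the server side has not yet accepted anything.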

Server accept a connection

After opening a listen channel, the server enters an infinite loop. Inside the loop is a call to accept(), which puts the server process to sleep until a connection arrives.[4] The accept() system call returns a new file descriptor when a client process executes connect().[21]

Snippet to accept a connection:

int accept_socket_fd;

while ( 1 )
{
    accept_socket_fd = accept( server_socket_fd, NULL, NULL );
    if ( accept_socket_fd == -1 ) assert( 0 );

    if ( accept_socket_fd > 0 )
    {
        /* Client is connected. The dialog with the client goes here. */
    }
}

Server I/O on a socket

When accept() returns a positive integer, the server engages in an algorithmic dialog with the client.

Stream socket input/output may use the regular-file system calls read() and write().[6] However, more control is available if a stream socket uses the socket-specific system calls send() and recv(). Datagram socket input/output, by contrast, should use the socket-specific system calls sendto() and recvfrom().[22]

For a basic stream socket, the server receives data with read( accept_socket_fd ) and sends data with write( accept_socket_fd ).

Snippet to illustrate I/O on a basic stream socket:

int accept_socket_fd;

while ( 1 )
{
    accept_socket_fd = accept( server_socket_fd, NULL, NULL );
    if ( accept_socket_fd == -1 ) assert( 0 );

    if ( accept_socket_fd > 0 )
    {
        server_algorithmic_dialog( accept_socket_fd );
    }
}

#include <strings.h>   /* Declares strcasecmp(). */

#define BUFFER_SIZE 1024

/* A prototype for this function must be visible before the loop above. */
void server_algorithmic_dialog(
    int accept_socket_fd )
{
    char input_buffer[ BUFFER_SIZE ];
    char output_buffer[ BUFFER_SIZE ];
    ssize_t read_count;

    /* Leave room for a terminating null in case the client omits it. */
    read_count = read( accept_socket_fd, input_buffer, BUFFER_SIZE - 1 );
    if ( read_count < 1 ) return;
    input_buffer[ read_count ] = '\0';

    if ( strcasecmp( input_buffer, "hola" ) == 0 )
        strcpy( output_buffer, "Hola Mundo" );
    else
    if ( strcasecmp( input_buffer, "ciao" ) == 0 )
        strcpy( output_buffer, "Ciao Mondo" );
    else
        strcpy( output_buffer, "Hello World" );

    write( accept_socket_fd, output_buffer, strlen( output_buffer ) + 1 );
}
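As one example of the extra control that send() and recv() provide, the flags argument of recv() can request a look at queued data without consuming it. The sketch below is illustrative only: peek_then_receive() is a hypothetical helper, and socketpair() stands in for an accepted connection.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends "hola" (with its terminating null) across */
/* a connected pair, peeks at it, and then receives it for real.        */
/* Returns the received byte count if both calls saw the same amount    */
/* of data, otherwise -1.                                               */
ssize_t peek_then_receive( char *buffer, size_t size )
{
    int pair[ 2 ];
    ssize_t peeked = -1;
    ssize_t received = -1;

    if ( socketpair( AF_UNIX, SOCK_STREAM, 0, pair ) == -1 ) return -1;

    if ( send( pair[ 0 ], "hola", 5, 0 ) == 5 )
    {
        /* MSG_PEEK copies the data but leaves it in the queue. */
        peeked = recv( pair[ 1 ], buffer, size, MSG_PEEK );

        /* A flags value of 0 behaves like read() and consumes the data. */
        received = recv( pair[ 1 ], buffer, size, 0 );
    }

    close( pair[ 0 ] );
    close( pair[ 1 ] );

    return ( peeked > 0 && peeked == received ) ? received : -1;
}
```

Given a buffer of at least 5 bytes, the helper is expected to return 5 and leave "hola" in the buffer, showing that the peek did not remove the message from the socket.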

Server close a connection

The algorithmic dialog ends when either the algorithm concludes or read( accept_socket_fd ) returns a value less than 1 (0 when the client has closed the connection, -1 on error).[6] To close the connection, the server executes the close() system call.[6]

Snippet to close a connection:

int accept_socket_fd;

while ( 1 )
{
    accept_socket_fd = accept( server_socket_fd, NULL, NULL );
    if ( accept_socket_fd == -1 ) assert( 0 );

    if ( accept_socket_fd > 0 )
    {
        server_algorithmic_dialog( accept_socket_fd );
        close( accept_socket_fd );
    }
}

Snippet to illustrate the end of a dialog:

#define BUFFER_SIZE 1024

void server_algorithmic_dialog(
    int accept_socket_fd )
{
    char buffer[ BUFFER_SIZE ];
    ssize_t read_count;   /* read() returns ssize_t, not int. */

    /* Omit algorithmic dialog */

    read_count = read( accept_socket_fd, buffer, BUFFER_SIZE );
    if ( read_count < 1 ) return;

    /* Omit algorithmic dialog */
}
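The end-of-dialog condition can be reproduced in isolation. In this minimal sketch, read_after_peer_close() is a hypothetical helper and socketpair() replaces a real client and server; one end is closed and the other end then attempts a read.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: closes one end of a connected pair and returns  */
/* what read() reports on the other end, or -1 if the setup fails.      */
ssize_t read_after_peer_close( void )
{
    int pair[ 2 ];
    char buffer[ 16 ];
    ssize_t read_count;

    if ( socketpair( AF_UNIX, SOCK_STREAM, 0, pair ) == -1 ) return -1;

    close( pair[ 0 ] );   /* Plays the role of the departing client. */

    /* Nothing was sent, so there is only the end-of-file to report. */
    read_count = read( pair[ 1 ], buffer, sizeof( buffer ) );

    close( pair[ 1 ] );
    return read_count;
}
```

The helper is expected to return 0, the value that signals the server to stop its dialog and close its own descriptor.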

Client instantiate and connect to /path/filename

Computer program for the client to instantiate and connect a socket:[5]

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>

/* Must match the server's socket_address. */
const char *socket_address = "/tmp/mysocket.sock";

int main( void )
{
    int client_socket_fd;
    struct sockaddr_un sockaddr_un = {0};
    int return_value;

    client_socket_fd = socket( AF_UNIX, SOCK_STREAM, 0 );
    if ( client_socket_fd == -1 ) assert( 0 );

    /* Construct the client address structure. */
    /* The structure was zeroed, so copying at most size - 1 bytes */
    /* leaves sun_path null-terminated.                            */
    sockaddr_un.sun_family = AF_UNIX;
    strncpy( sockaddr_un.sun_path,
             socket_address,
             sizeof( sockaddr_un.sun_path ) - 1 );

    return_value =
       connect(
            client_socket_fd,
            (struct sockaddr *) &sockaddr_un,
            sizeof( struct sockaddr_un ) );

    /* If socket_address doesn't exist on the filesystem,   */
    /* or if the server's connection-request queue is full, */
    /* then connect() will fail.                            */
    if ( return_value == -1 ) assert( 0 );

    /* close( client_socket_fd ); <-- optional */

    exit( EXIT_SUCCESS );
}
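A client can distinguish the reasons for a failed connect() by inspecting errno instead of aborting. The following sketch assumes a hypothetical helper, connect_error(), that returns the error code so the caller can decide what to do.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: tries to connect a stream socket to path.       */
/* Returns 0 on success, otherwise the errno value that was set.        */
int connect_error( const char *path )
{
    struct sockaddr_un address;
    int error = 0;
    int client_socket_fd;

    if ( strlen( path ) >= sizeof( address.sun_path ) ) return ENAMETOOLONG;

    client_socket_fd = socket( AF_UNIX, SOCK_STREAM, 0 );
    if ( client_socket_fd == -1 ) return errno;

    memset( &address, 0, sizeof( address ) );
    address.sun_family = AF_UNIX;
    strcpy( address.sun_path, path );   /* Length was checked above. */

    if ( connect( client_socket_fd, (struct sockaddr *) &address,
                  sizeof( address ) ) == -1 )
    {
        error = errno;   /* Save it before close() can change it. */
    }

    close( client_socket_fd );
    return error;
}
```

When the path does not exist, the helper is expected to return ENOENT. When a socket file exists but no process is listening on it, the usual result is ECONNREFUSED.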

Client I/O on a socket

If connect() returns zero, the client can engage in an algorithmic dialog with the server. The client may send stream data via write( client_socket_fd ) and may receive stream data via read( client_socket_fd ).

Snippet to illustrate client I/O on a stream socket:

{
    /* Omit construction code */
    return_value =
        connect(
            client_socket_fd,
            (struct sockaddr *) &sockaddr_un,
            sizeof( struct sockaddr_un ) );

    if ( return_value == -1 ) assert( 0 );

    if ( return_value == 0 )
    {
        client_algorithmic_dialog( client_socket_fd );
    }

    /* close( client_socket_fd ); <-- optional */

    /* When the client process terminates,     */
    /* if the server attempts to read(),       */
    /* then read_count will be either 0 or -1. */
    /* This is a message for the server        */
    /* to execute close().                     */
    exit( EXIT_SUCCESS );
}

#define BUFFER_SIZE 1024

void client_algorithmic_dialog(
    int client_socket_fd )
{
    char buffer[ BUFFER_SIZE ];
    ssize_t read_count;

    strcpy( buffer, "hola" );
    write( client_socket_fd, buffer, strlen( buffer ) + 1 );

    /* Leave room for a terminating null in case the server omits it. */
    read_count = read( client_socket_fd, buffer, BUFFER_SIZE - 1 );

    if ( read_count > 0 )
    {
        buffer[ read_count ] = '\0';
        puts( buffer );
    }
}
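The whole exchange can be exercised in a single process, because a small message written to a stream socket is buffered by the kernel until the other side reads it. The sketch below is an illustration under that assumption, not a recommended program structure: round_trip() is a hypothetical helper that plays both the client and the server role.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define BUFFER_SIZE 1024

/* Hypothetical helper: sends request from a client socket to a server  */
/* socket bound at path and stores the server's answer in reply.        */
/* Returns 0 on success and -1 on any failure.                          */
int round_trip( const char *path, const char *request,
                char *reply, size_t reply_size )
{
    struct sockaddr_un address;
    char buffer[ BUFFER_SIZE ];
    const char *answer;
    int listen_fd = -1, client_fd = -1, accept_fd = -1;
    int result = -1;
    ssize_t count;

    if ( strlen( path ) >= sizeof( address.sun_path ) ) return -1;
    if ( strlen( request ) >= BUFFER_SIZE || reply_size < 2 ) return -1;

    memset( &address, 0, sizeof( address ) );
    address.sun_family = AF_UNIX;
    strcpy( address.sun_path, path );
    unlink( path );   /* Remove a leftover from a prior run, if any. */

    listen_fd = socket( AF_UNIX, SOCK_STREAM, 0 );
    client_fd = socket( AF_UNIX, SOCK_STREAM, 0 );
    if ( listen_fd == -1 || client_fd == -1 ) goto cleanup;

    /* Server side: bind and listen. */
    if ( bind( listen_fd, (struct sockaddr *) &address,
               sizeof( address ) ) == -1 ) goto cleanup;
    if ( listen( listen_fd, 1 ) == -1 ) goto cleanup;

    /* Client side: connect and send the request. */
    if ( connect( client_fd, (struct sockaddr *) &address,
                  sizeof( address ) ) == -1 ) goto cleanup;
    if ( write( client_fd, request, strlen( request ) + 1 ) == -1 )
        goto cleanup;

    /* Server side: accept the queued connection, read, and answer. */
    accept_fd = accept( listen_fd, NULL, NULL );
    if ( accept_fd == -1 ) goto cleanup;
    count = read( accept_fd, buffer, sizeof( buffer ) - 1 );
    if ( count < 1 ) goto cleanup;
    buffer[ count ] = '\0';
    answer = strcmp( buffer, "hola" ) == 0 ? "Hola Mundo" : "Hello World";
    if ( write( accept_fd, answer, strlen( answer ) + 1 ) == -1 )
        goto cleanup;

    /* Client side: read the answer. */
    count = read( client_fd, reply, reply_size - 1 );
    if ( count < 1 ) goto cleanup;
    reply[ count ] = '\0';
    result = 0;

cleanup:
    if ( accept_fd != -1 ) close( accept_fd );
    if ( client_fd != -1 ) close( client_fd );
    if ( listen_fd != -1 ) close( listen_fd );
    unlink( path );
    return result;
}
```

Called with the request "hola" and a sufficiently large reply buffer, the helper is expected to return 0 and store "Hola Mundo". In a real application the two roles would run in separate processes, as in the programs above.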

See also

  • Pipeline (Unix) – Mechanism for inter-process communication using message passing
  • Netlink – Linux kernel interface for inter-process communication between processes

References

  1. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1149. ISBN 978-1-59327-220-3. Sockets are a method of IPC that allow data to be exchanged between applications, either on the same host (computer) or on different hosts connected by a network.
  2. ^ a b c Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1150. ISBN 978-1-59327-220-3.
  3. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1150. ISBN 978-1-59327-220-3. The server binds its socket to a well-known address (name) so that clients can locate it.
  4. ^ a b Stevens, Richard W.; Fenner, Bill; Rudoff, Andrew M. (2004). Unix Network Programming (3rd ed.). Pearson Education. p. 14. ISBN 81-297-0710-1. Normally, the server process is put to sleep in the call to accept, waiting for a client connection to arrive and be accepted.
  5. ^ a b Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1169. ISBN 978-1-59327-220-3.
  6. ^ a b c d Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1159. ISBN 978-1-59327-220-3.
  7. ^ a b Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1166. ISBN 978-1-59327-220-3.
  8. ^ a b c Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1149. ISBN 978-1-59327-220-3.
  9. ^ a b Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1153. ISBN 978-1-59327-220-3.
  10. ^ a b c Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1151. ISBN 978-1-59327-220-3.
  11. ^ a b "Linux Programmer's Manual (unix - sockets for local interprocess communication)". 30 April 2018. Retrieved 22 February 2019.
  12. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1197. ISBN 978-1-59327-220-3.
  13. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1183. ISBN 978-1-59327-220-3.
  14. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1152. ISBN 978-1-59327-220-3.
  15. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1184. ISBN 978-1-59327-220-3.
  16. ^ a b Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1181. ISBN 978-1-59327-220-3.
  17. ^ a b Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1165. ISBN 978-1-59327-220-3.
  18. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1154. ISBN 978-1-59327-220-3.
  19. ^ a b c "Linux manual page for listen()".
  20. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1157. ISBN 978-1-59327-220-3.
  21. ^ "Linux manual page for accept()".
  22. ^ Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 1160. ISBN 978-1-59327-220-3.

Notes

  1. ^ Alternatively, PF_UNIX or AF_LOCAL may be used.[11] The AF stands for "Address Family", and the PF stands for "Protocol Family".
  2. ^ A datagram socket should not be confused with a datagram packet used in the network layer.[13]
  3. ^ In Unix, everything is a file.