Socket programming in Perl: Part I

Using the Socket.pm module, one may create and implement powerful tools known as sockets. The Socket API allows communication between hosts over a network or between processes run locally. Sockets are defined differently based on their functionality and their implementation.

Sockets communicate primarily in one of two "types". Stream sockets are bi-directional connections that function like a pipeline: two endpoints are established, and data flows between them as a single, ordered sequence of bytes. Datagram sockets, by contrast, provide fast, connectionless delivery (including broadcast) with very little overhead. A datagram is often described as completely self-contained, meaning that it has no relationship to any other packet of data communicated at any other time. Although all IP traffic is ultimately carried in datagrams, it is important to note that a stream connection in the internet domain uses TCP to provide ordered, reliable, and verifiable communication on top of them, while a datagram socket normally uses UDP.
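The difference between the two types is easiest to see side by side. The following is a minimal sketch, assuming a Unix-like system where socketpair() supports both types in the Unix domain; the variable names are mine and purely illustrative.

```perl
use strict;
use warnings;
use Socket;

# Datagram pair: each send() arrives as one self-contained message.
socketpair(my $dg_a, my $dg_b, AF_UNIX, SOCK_DGRAM, PF_UNSPEC)
    or die "socketpair (datagram): $!";
defined(send($dg_a, "first",  0)) or die "send: $!";
defined(send($dg_a, "second", 0)) or die "send: $!";

my ($msg1, $msg2) = ('', '');
defined(recv($dg_b, $msg1, 1024, 0)) or die "recv: $!";
defined(recv($dg_b, $msg2, 1024, 0)) or die "recv: $!";
print "datagrams: [$msg1] [$msg2]\n";    # prints "datagrams: [first] [second]"

# Stream pair: the same two writes become one ordered run of bytes
# with no message boundaries.
socketpair(my $st_a, my $st_b, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair (stream): $!";
defined(syswrite($st_a, "first"))  or die "syswrite: $!";
defined(syswrite($st_a, "second")) or die "syswrite: $!";
close($st_a);                            # signals end-of-file to the reader

my $stream = '';
while (sysread($st_b, my $chunk, 1024)) {
    $stream .= $chunk;                   # collect until end-of-file
}
print "stream: [$stream]\n";             # prints "stream: [firstsecond]"
```

The datagram reader gets two separate messages back, while the stream reader only ever sees bytes in order; any framing on a stream is up to the programs at either end.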

In addition to its type, a socket is defined by its domain: either the internet domain or the Unix domain.

Because they require very little overhead, Unix domain sockets are typically used for local inter-process communication. Unix domain sockets need not deal with ACKs, encapsulation, or network flow control. They use the Unix file system as their address name space, so a socket is addressed by a path name rather than by a host and port. Access to them is governed by the permissions of the local file system, as one might imagine, but they require extra work to migrate to internet environments.
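As a rough illustration of a socket that lives in the file system, here is a sketch of a Unix domain stream socket served and used within a single process. The path demo.sock and the temporary directory are my own assumptions, and sockaddr_un() is the Socket module's counterpart to the sockaddr_in() function discussed later.

```perl
use strict;
use warnings;
use Socket;
use File::Temp qw(tempdir);

# Hypothetical socket path inside a throwaway directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.sock";

# Server side: the socket's "address" is simply a path name.
socket(my $server, PF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
bind($server, sockaddr_un($path))           or die "bind: $!";
listen($server, SOMAXCONN)                  or die "listen: $!";

# bind() created a socket file that the -S file test recognizes.
my $is_socket = -S $path ? 1 : 0;

# Client side: connect to the same path.
socket(my $client, PF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
connect($client, sockaddr_un($path))        or die "connect: $!";

# The connection waits in the listen queue until it is accepted.
accept(my $conn, $server) or die "accept: $!";

defined(syswrite($client, "hello over a Unix socket\n")) or die "syswrite: $!";
my $line = <$conn>;
print "server read: $line";

close($_) for $client, $conn, $server;
```

Because the address is an ordinary directory entry, the usual file permissions on that directory decide who may connect.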

Internet (IP) domain sockets are just what the name implies: they communicate on top of the IP protocol, typically over the Internet. An internet socket may be served locally, but no effort is made to bypass IP when it is. An internet domain socket is named by an address and a port: its own for a server socket, or that of the remote server for a client socket.

Lastly, sockets are defined by the protocol used for communication, which is determined by the connection type. Protocols like tcp and udp are identified by numbers that your operating system uses. To look up the number, pass a string that names the desired protocol to the getprotobyname() function:

    socket(SERVER, PF_INET, SOCK_DGRAM, getprotobyname('udp'));
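To see what getprotobyname() actually hands to socket(), one can print the numbers. This is only a sketch: the fallback to the Socket module's IPPROTO_TCP and IPPROTO_UDP constants is my own addition, for systems where the protocol database is unavailable and the lookup returns undef.

```perl
use strict;
use warnings;
use Socket qw(:DEFAULT IPPROTO_TCP IPPROTO_UDP);

# In scalar context getprotobyname() returns the protocol number.
my $tcp = getprotobyname('tcp');
my $udp = getprotobyname('udp');

# Assumption: fall back to the Socket constants if the lookup fails.
$tcp = IPPROTO_TCP unless defined $tcp;
$udp = IPPROTO_UDP unless defined $udp;

print "tcp => $tcp, udp => $udp\n";

# The number is what socket() receives as its fourth argument.
socket(my $sock, PF_INET, SOCK_DGRAM, $udp) or die "socket: $!";
close($sock);
```

On the systems I know of, this reports 6 for tcp and 17 for udp, the numbers assigned to those protocols.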

The socket() function used in the example above is just one of Perl's built-in functions for creating and working with sockets. A few of them warrant mention. We will begin with the built-in functions that are called by a simple stream server socket in an internet domain.

socket(), as you've seen, creates a socket. The arguments define the socket by filehandle, domain, type, and protocol, respectively. Once created, the filehandle may be used in either of two ways: as an argument to the other socket functions, or as an ordinary filehandle for reading and writing:

    connect(CLIENT, $port_addr)
        or die "$!";

    while (<CLIENT>) {
        dosomething($_);
    }

listen() prepares the socket for connections from other sockets, and accept() receives each connection. listen() is called with the socket filehandle and the maximum number of connection requests that may be queued while waiting to be accepted. A client that tries to connect() to this socket once the queue is full will typically be answered with the error "connection refused". Your system limits the maximum length of this queue, and the Socket module exposes that limit as the SOMAXCONN constant. I typically call listen() with SOMAXCONN as the queue length.

    listen(SERVER, SOMAXCONN)
        or die "$!";

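accept() is called with two arguments, a filehandle for the new connection and the listening socket, and returns the packed address (IP address and port number) of the accepted client. accept() is often used as a while conditional. Where one would write this to loop over input:

    while (<STDIN>) {
        DoSomething($_);
    }

one will often write this to loop over incoming connections:

    while (accept(CLIENT, SERVER)) {
        DoSomething(\*CLIENT);    # accept() does not set $_, so pass the new filehandle
    }

or, as we established the return value of accept():

    while ($remote_client = accept(CLIENT, SERVER)) {
        ## stores the client's packed address and begins iteration
    }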

I mentioned that the return value of the accept() function is the client's IP address and port number. An internet socket needs a name in order to bind, connect, or send to another socket. This name is packed by the function sockaddr_in() and is composed of both an IP address and a port number. The IP address is itself packed by the inet_aton() function: inet meaning the internet domain, and aton meaning ASCII to network. While one must call separate functions to pack and unpack an IP address (inet_aton() and inet_ntoa()), the sockaddr_in() function resolves contextually: it packs in scalar context and unpacks in list context. I don't think I need to preach to any Perl users about the magic of contextual functionality.

    use Socket;

    $packed_ip   = inet_aton("208.75.227.82");
    $socket_name = sockaddr_in($port_num, $packed_ip);

and

    ($port_num, $packed_ip) = sockaddr_in($socket_name);
    $unpacked_ip = inet_ntoa($packed_ip);

As an aside, a host name will resolve just as easily as an IP address when passed as the argument to the inet_aton() function; the function returns undef if the name cannot be resolved.
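The packing and unpacking described above can be tried as a round trip without touching the network. This is a small sketch; the port number 8080 is an arbitrary value chosen for illustration.

```perl
use strict;
use warnings;
use Socket;

# A dotted-quad string needs no name lookup; the result is 4 raw bytes.
my $packed_ip = inet_aton("208.75.227.82");

# Scalar context: sockaddr_in() packs a port and address into a name.
my $socket_name = sockaddr_in(8080, $packed_ip);

# List context: the same function unpacks the name again.
my ($port, $ip_bytes) = sockaddr_in($socket_name);
my $dotted = inet_ntoa($ip_bytes);

print length($packed_ip), " bytes, $dotted:$port\n";
# prints "4 bytes, 208.75.227.82:8080"
```

Note that the context comes from the assignment: a scalar on the left packs, a list on the left unpacks.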

Returning to the case of the simple server we created, we would be able to break down the client's name like this:

    while ($remote_client = accept(CLIENT, SERVER)) {
        ($port_num, $packed_ip) = sockaddr_in($remote_client);
        $unpacked_ip = inet_ntoa($packed_ip);
        ### any other expressions

    }
    close(SERVER);
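To check that accept() really returns the client's name, one can connect to a listening socket from the same script and compare notes. The sketch below assumes the loopback interface is available; binding to port 0 so that the system picks a free port, and the use of getsockname() to read it back, are my own choices for keeping the example self-contained.

```perl
use strict;
use warnings;
use Socket;

# 0 lets the system choose the default protocol if the lookup fails.
my $proto = getprotobyname('tcp') || 0;

# A listening socket on the loopback address; port 0 means "any free port".
socket(my $server, PF_INET, SOCK_STREAM, $proto)  or die "socket: $!";
bind($server, sockaddr_in(0, INADDR_LOOPBACK))     or die "bind: $!";
listen($server, SOMAXCONN)                         or die "listen: $!";
my ($server_port) = sockaddr_in(getsockname($server));

# A client in the same process; the connection waits in the listen queue.
socket(my $client, PF_INET, SOCK_STREAM, $proto)   or die "socket: $!";
connect($client, sockaddr_in($server_port, INADDR_LOOPBACK))
    or die "connect: $!";
my ($client_port) = sockaddr_in(getsockname($client));

# accept() returns the packed name of whoever connected.
my $remote_client = accept(my $conn, $server) or die "accept: $!";
my ($peer_port, $peer_packed_ip) = sockaddr_in($remote_client);
my $peer_ip = inet_ntoa($peer_packed_ip);

print "accepted connection from $peer_ip:$peer_port\n";

close($_) for $conn, $client, $server;
```

The port reported by accept() is the client's own local port, which is why it matches what getsockname() reports on the client side.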

One step we've overlooked thus far is bind(). bind() is best described as a function that gives a socket a name locally. bind() is called with the socket filehandle and the socket name, respectively. The address must belong to the machine the server runs on; to accept connections on every local interface, one may use the INADDR_ANY constant in place of a packed address. Remember, we first have to pack the IP address using inet_aton(), and then pack the port number and packed IP address into a name, like so:

    use Socket;

    socket(SERVER, PF_INET, SOCK_STREAM, getprotobyname('tcp'))
        or die "$!";

    $packed_ip   = inet_aton('cyberarmy.net');
    $port_num    = 80;
    $socket_name = sockaddr_in($port_num, $packed_ip);

    bind(SERVER, $socket_name)
        or die "$!";

    listen(SERVER, SOMAXCONN)
        or die "$!";

    while (accept(CLIENT, SERVER)) {
        DoSomething(\*CLIENT);    # accept() does not set $_, so pass the new filehandle
        close(CLIENT);
    }

As you can see, the typical stream server is created using socket(), given a local name with bind(), told to queue incoming connections with listen(), and then left to loop in an accept() block.
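The four steps can be run end to end on one machine. What follows is a sketch under a few assumptions of mine: it binds to the loopback address on a system-chosen port instead of a public one, it plays the client itself so that the script terminates, and the upper-casing "service" is a stand-in for real work.

```perl
use strict;
use warnings;
use Socket;

my $proto = getprotobyname('tcp') || 0;    # 0 = system default for the type

# Steps 1 to 3: create the socket, name it locally, queue connections.
socket(my $server, PF_INET, SOCK_STREAM, $proto) or die "socket: $!";
bind($server, sockaddr_in(0, INADDR_LOOPBACK))    or die "bind: $!";
listen($server, SOMAXCONN)                        or die "listen: $!";
my ($port) = sockaddr_in(getsockname($server));

# A stand-in client, so there is something waiting in the queue.
socket(my $client, PF_INET, SOCK_STREAM, $proto)  or die "socket: $!";
connect($client, sockaddr_in($port, INADDR_LOOPBACK)) or die "connect: $!";
defined(syswrite($client, "hello server\n"))      or die "syswrite: $!";

# Step 4: the accept() loop. A real server would loop forever;
# this demo stops after one client.
my $served = 0;
while (my $remote_client = accept(my $conn, $server)) {
    my $request = <$conn>;                 # read one line from the client
    syswrite($conn, uc $request);          # reply with the line upper-cased
    close($conn);
    last if ++$served >= 1;
}
close($server);

my $reply = <$client>;
print "client got: $reply";                # prints "client got: HELLO SERVER"
close($client);
```

A server meant for real use would more likely fork or otherwise hand off each accepted connection, so that one slow client does not hold up the rest of the queue.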

Creating datagram server sockets is so much simpler using the IO::Socket module that I won't even bother getting into it until part 2 of this series. One would have a hard time finding someone who still uses the raw Socket.pm module for datagram server socket creation and manipulation, so there's very little documentation anywhere. I will, however, give a thorough examination of the datagram server socket using the IO::Socket module in part 2.

A client socket is a little simpler to create. A client socket is created in the same way no matter the type. However, stream and datagram sockets are implemented in different ways.

A tcp/stream client socket is created using socket() and then connected to a tcp/stream server socket using connect(), which can be called with the socket filehandle and the packed address and port number of the remote server socket:

    use Socket;

    socket(CLIENT, PF_INET, SOCK_STREAM, getprotobyname('tcp'));

    $packed_remote_ip = inet_aton('cyberarmy.net');
    $port_num         = 23;
    $remote_name      = sockaddr_in($port_num, $packed_remote_ip);

    connect(CLIENT, $remote_name)
        or die "$!";

    print CLIENT "Tell Eliza I said, hi.\n";

    close(CLIENT);
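The client steps fold naturally into a small subroutine. This is a sketch only: open_stream_client() is a hypothetical helper of my own naming, and the local listener exists purely so the example has something to connect to without reaching a remote host.

```perl
use strict;
use warnings;
use Socket;

# Hypothetical helper: resolve, create, and connect in one place.
sub open_stream_client {
    my ($host, $port) = @_;
    my $packed_ip = inet_aton($host);
    defined $packed_ip or die "cannot resolve $host";
    socket(my $sock, PF_INET, SOCK_STREAM, getprotobyname('tcp') || 0)
        or die "socket: $!";
    connect($sock, sockaddr_in($port, $packed_ip))
        or die "connect: $!";
    return $sock;
}

# Stand-in for the remote server: a listener on a free loopback port.
socket(my $listener, PF_INET, SOCK_STREAM, getprotobyname('tcp') || 0)
    or die "socket: $!";
bind($listener, sockaddr_in(0, INADDR_LOOPBACK)) or die "bind: $!";
listen($listener, 1)                             or die "listen: $!";
my ($port) = sockaddr_in(getsockname($listener));

# The client proper: connect, write a line, hang up.
my $client = open_stream_client('127.0.0.1', $port);
print {$client} "Tell Eliza I said, hi.\n";
close($client);                       # close() flushes the buffered line

# Confirm the line arrived at the other end.
accept(my $conn, $listener) or die "accept: $!";
my $received = <$conn>;
print "listener received: $received";
close($_) for $conn, $listener;
```

Keeping the three calls together this way means every failure is reported with the step that caused it, rather than a bare "$!".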

A datagram socket, however, requires no connection to the server. A datagram client socket is typically just created using socket() and then used with send() and recv(). send() is called with the socket filehandle, the message buffer, any flags, and the packed name of the remote address and port. The last argument may be omitted if you connect() the datagram client socket first.

    socket(CLIENT, PF_INET, SOCK_DGRAM, getprotobyname('udp'));

    $remote_name = sockaddr_in($port_num, $packed_remote_ip);
    send(CLIENT, $msg, 0, $remote_name)
        or die "$!";

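Then one may read the reply with recv(), which fills the message buffer and returns the packed name of the sender:

    # recv() returns undef on error; on a connected socket it may return an empty string
    defined($remote_name = recv(CLIENT, $msg, $maxlen, 0))
        or die "$!";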
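Both ways of calling send() can be tried against a datagram socket on the same machine. This is a sketch under my own assumptions: the receiver is bound to a system-chosen loopback port purely for the demonstration, and delivery of UDP datagrams, while dependable on the loopback interface in my experience, is never guaranteed in general.

```perl
use strict;
use warnings;
use Socket;

my $proto = getprotobyname('udp') || 0;    # 0 = system default for the type

# A receiving socket on a free loopback port, standing in for a server.
socket(my $receiver, PF_INET, SOCK_DGRAM, $proto) or die "socket: $!";
bind($receiver, sockaddr_in(0, INADDR_LOOPBACK))  or die "bind: $!";
my ($port) = sockaddr_in(getsockname($receiver));
my $remote_name = sockaddr_in($port, INADDR_LOOPBACK);

# Unconnected client: every send() names its destination.
socket(my $client, PF_INET, SOCK_DGRAM, $proto)   or die "socket: $!";
defined(send($client, "ping one", 0, $remote_name)) or die "send: $!";

# recv() fills the buffer and returns the sender's packed name.
my $msg1 = '';
my $sender = recv($receiver, $msg1, 1024, 0);
defined($sender) or die "recv: $!";
my ($sender_port, $sender_packed_ip) = sockaddr_in($sender);
my $sender_ip = inet_ntoa($sender_packed_ip);

# Connected client: connect() fixes the destination, so send() omits it.
connect($client, $remote_name)             or die "connect: $!";
defined(send($client, "ping two", 0))      or die "send: $!";
my $msg2 = '';
defined(recv($receiver, $msg2, 1024, 0))   or die "recv: $!";

print "got '$msg1' and '$msg2' from $sender_ip:$sender_port\n";
close($_) for $client, $receiver;
```

The name returned by recv() is what a datagram server would pass back to send() in order to reply.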

You should now have a pretty good grasp of how things operate within Perl's Socket.pm module. We discussed the nature of socket programming, and we created and detailed the functionality of stream server sockets, stream client sockets, and UDP client sockets. We will discuss socket programming in Perl using the IO::Socket module when we meet again.