Network Socket Programming
CS 641 Lecture, Dr. Lawlor
I claim "message passing" programming is a very handy way to write
parallel software. Message passing's "parallel weirdness" only
happens at message sends and receives. Shared memory, by
contrast, encounters parallel weirdness during memory accesses, which
can happen anywhere.
Just about the only important network interface today is TCP/IP. Surprisingly, there's
basically only one major programming interface used for talking on a TCP/IP
network, and that's "Berkeley sockets", the original UNIX interface as
implemented by the good folks at UC Berkeley.
The Berkeley sockets interface is implemented in:
- All flavors of UNIX, including Linux, Mac OS X, Solaris, all BSD flavors, etc.
- Windows 95 and higher, as "winsock".
Brian Hall, or "Beej", maintains the definitive readable introduction to Berkeley sockets programming, Beej's Guide to Network Programming. He's got a zillion examples and a readable style. Go there.
Bare Berkeley sockets are pretty tricky and ugly, especially for
creating connections. The problem is Berkeley sockets support all
sorts of other protocols, addressing modes, and other features like "raw
sockets" (that have serious security implications!). But when I write
TCP code, I find it a lot easier to use my own little library of public
domain utility routines called socket.h and socket.cpp. It's way too nasty
to write portable Berkeley code for basic TCP, so I'll give examples
using my library.
My library uses a few funny datatypes:
- SOCKET: datatype for a "socket": one end of a network connection between two machines. This is actually just an int.
- skt_ip_t: datatype for an IP address. It's just 4 bytes.
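To make the "just 4 bytes" point concrete, here's a small sketch using the standard inet_pton call. The names ipv4_bytes and parse_ipv4 are made up for illustration; they are not part of my library, and skt_ip_t itself may be laid out differently.

```cpp
#include <arpa/inet.h>    // inet_pton
#include <netinet/in.h>   // in_addr
#include <sys/socket.h>   // AF_INET
#include <array>
#include <cassert>
#include <cstring>

// Hypothetical stand-in for skt_ip_t: the four raw bytes of an IPv4 address.
using ipv4_bytes = std::array<unsigned char, 4>;

// Parse dotted-decimal text such as "127.0.0.1" into four bytes.
// Returns false if the text is not a valid IPv4 address.
bool parse_ipv4(const char *text, ipv4_bytes &out) {
    assert(text != nullptr);
    static_assert(sizeof(in_addr) == 4, "an IPv4 address is exactly 4 bytes");
    in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1) return false;
    std::memcpy(out.data(), &addr, 4);  // network byte order: first octet first
    return true;
}
```

Calling parse_ipv4("127.0.0.1", ip) fills ip with the bytes 127, 0, 0, 1. A DNS name would be rejected here, since this sketch only handles dotted-decimal text.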
To connect to a server "serverName" at TCP port 80, and send some data to it, you'd call:
- skt_ip_t ip=skt_lookup_ip(serverName); to look up the
server's IP address. In general, you can pass a DNS name, but
NetRun only supports dotted-decimal IPs.
- SOCKET s=skt_connect(ip,80,2); to connect to that
server. "80" is the TCP port number. "2" is the timeout
in seconds.
- skt_sendN(s,"hello",5);
to send the 5-byte string "hello" to the other side. You can now
repeatedly send and receive data with the other side.
- skt_close(s); to close the socket afterwards.
Here's an example in NetRun:
#include "osl/socket.h" /* <- Dr. Lawlor's funky networking library */
#include "osl/socket.cpp"
int foo(void) {
    skt_ip_t ip=skt_lookup_ip("127.0.0.1");
    unsigned int port=80;
    SOCKET s=skt_connect(ip,port,2);
    skt_sendN(s,"hello",5);
    skt_close(s);
    return 0;
}
(executable NetRun link)
Easy, right? The same program is a great deal longer in pure
Berkeley sockets, since you've got to deal with error handling (and not
all errors are fatal!), a long and complicated address setup process,
etc. This same code works in Windows, too.
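For comparison, here's roughly what the same connect-and-send looks like in bare Berkeley sockets on a UNIX system. This is only a sketch: berkeley_send is a hypothetical name, it accepts dotted-decimal IPv4 only, it has no connect timeout (unlike the "2" passed to skt_connect), and it does none of the Windows setup a portable version needs.

```cpp
#include <arpa/inet.h>    // inet_pton, htons
#include <netinet/in.h>   // sockaddr_in
#include <sys/socket.h>   // socket, connect, send
#include <sys/types.h>    // ssize_t
#include <unistd.h>       // close
#include <cassert>
#include <cstddef>

// Hypothetical helper: connect to a dotted-decimal IPv4 address over TCP
// and send len bytes. Returns true only if every byte was accepted.
bool berkeley_send(const char *ip, unsigned short port,
                   const char *data, size_t len) {
    assert(ip != nullptr && data != nullptr);

    // Address setup: family, port in network byte order, 4-byte IP.
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) return false;

    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0) return false;
    if (connect(s, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0) {
        close(s);
        return false;
    }

    // send() may take fewer bytes than requested, so loop until done.
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(s, data + sent, len - sent, 0);
        if (n <= 0) { close(s); return false; }
        sent += static_cast<size_t>(n);
    }
    close(s);
    return true;
}
```

Something like berkeley_send("127.0.0.1", 80, "hello", 5) would then stand in for the four library calls above, returning false if nothing is listening on that port.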
On NetRun, you can also "Download this
file as a .tar archive" to get the socket.h and socket.cpp files.
Network Server
A network server waits for connections from clients. The calls you make are:
- unsigned int port=8888; /* listen on this TCP/IP port (or use 0 to have the OS pick a port) */
- SERVER_SOCKET srv=skt_server(&port); /* lay claim to that port number */
- SOCKET s=skt_accept(srv,0,0); /* wait until a client connects to our port */
- skt_sendN and skt_recvN data to and from the client.
- skt_close(s); /* stop talking to that client */
- skt_close(srv); /* give up our claim on server port */
Again, between accept and close you can send and receive data any way you like. Your sends make
data arrive at client receive calls, and your receives grab data from
the client's sends. It's easy to screw up a network server by
trying to receive data that isn't going to arrive!
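Here's a sketch of what "send N bytes" and "receive N bytes" plausibly mean underneath, written against the plain send and recv calls. I'm assuming skt_sendN and skt_recvN block until exactly N bytes have moved, which is what their names suggest; send_all and recv_exact are hypothetical names, not the library's code.

```cpp
#include <sys/socket.h>   // send, recv
#include <sys/types.h>    // ssize_t
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical helper: keep calling send() until all len bytes are accepted.
bool send_all(int fd, const char *buf, size_t len) {
    assert(buf != nullptr || len == 0);
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);
        if (n < 0 && errno == EINTR) continue;  // interrupted by a signal: retry
        if (n <= 0) return false;
        sent += static_cast<size_t>(n);
    }
    return true;
}

// Hypothetical helper: block until exactly len bytes have arrived.
// Returns false on error, or if the peer closes before sending enough.
bool recv_exact(int fd, char *buf, size_t len) {
    assert(buf != nullptr || len == 0);
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, buf + got, len - got, 0);
        if (n == 0) return false;               // peer closed: no more data is coming
        if (n < 0) {
            if (errno == EINTR) continue;       // interrupted by a signal: retry
            return false;
        }
        got += static_cast<size_t>(n);
    }
    return true;
}
```

Note that recv_exact has no way to know the other side will never send: if the peer stays connected but silent, the loop simply waits. That is exactly the hang described above.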
You usually repeat steps 3-5 (accept, send/receive, close) again and again to handle all the clients
that try to connect. Many servers are designed as an infinite
loop--they keep handling client requests until the machine is turned
off. One thread can even have accepted connections from several
different clients, and be sending and receiving data from them at the
same time.
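The accept, talk, close loop can be sketched in plain Berkeley sockets like this. It's an illustration under assumptions: serve_clients is a made-up name, listen_fd is assumed to already be a listening TCP socket, the 3-byte request and "gdaymate" reply are placeholders, and the loop is bounded by max_clients only so it can finish (a real server would typically loop forever).

```cpp
#include <netinet/in.h>   // sockaddr_in, ntohs
#include <sys/socket.h>   // accept, recv, send
#include <sys/types.h>    // ssize_t
#include <unistd.h>       // close
#include <cassert>
#include <cstdio>

// Hypothetical sketch: serve up to max_clients clients, one after another.
// listen_fd must already be a listening TCP socket.
// Returns how many clients got a full reply.
int serve_clients(int listen_fd, int max_clients) {
    assert(listen_fd >= 0);
    int served = 0;
    for (int i = 0; i < max_clients; i++) {
        // Step 3: wait for the next client and note who it is.
        sockaddr_in peer{};
        socklen_t peer_len = sizeof(peer);
        int c = accept(listen_fd, reinterpret_cast<sockaddr *>(&peer), &peer_len);
        if (c < 0) break;
        std::printf("client %d connected from port %u\n", i,
                    static_cast<unsigned>(ntohs(peer.sin_port)));

        // Step 4: read a 3-byte request, then send a 9-byte reply.
        char request[3];
        if (recv(c, request, sizeof(request), MSG_WAITALL) == 3 &&
            send(c, "gdaymate\n", 9, 0) == 9) {
            served++;
        }

        // Step 5: stop talking to this client, then loop back to accept.
        close(c);
    }
    return served;
}
```

This version finishes with one client before accepting the next. Talking to several accepted clients at once from one thread needs something more, such as select or poll, which this sketch leaves out.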
High-performance servers, like the Apache
web server, often will call fork() either before step 3 (called
"preforking", where several processes wait in accept) or before step 4
(one process accepts, then splits off a child process to handle each
client).
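Here's a minimal sketch of the second design, where one process accepts and then forks a child per client. The names serve_forking and handle_client are hypothetical, the per-client work is a placeholder, and this says nothing about how Apache itself is organized beyond the description above.

```cpp
#include <netinet/in.h>   // sockaddr_in, for callers that build the listening socket
#include <sys/socket.h>   // accept, recv, send
#include <sys/types.h>    // pid_t
#include <sys/wait.h>     // wait, WIFEXITED, WEXITSTATUS
#include <unistd.h>       // fork, close, _exit
#include <cassert>

// Placeholder per-client work: read 3 bytes, then send a fixed 9-byte reply.
static void handle_client(int c) {
    char request[3];
    if (recv(c, request, sizeof(request), MSG_WAITALL) == 3) {
        send(c, "gdaymate\n", 9, 0);
    }
}

// Hypothetical sketch: accept n_clients connections on listen_fd and
// fork one child process per client. Returns how many children exited cleanly.
int serve_forking(int listen_fd, int n_clients) {
    assert(listen_fd >= 0);
    int started = 0;
    for (int i = 0; i < n_clients; i++) {
        int c = accept(listen_fd, nullptr, nullptr);
        if (c < 0) break;
        pid_t pid = fork();
        if (pid == 0) {          // child: talk to this one client, then exit
            close(listen_fd);    // the child never accepts
            handle_client(c);
            close(c);
            _exit(0);            // leave immediately, skipping the parent's code
        }
        close(c);                // parent: the child holds its own copy of c
        if (pid > 0) started++;
    }
    // Reap every child so none is left behind as a zombie.
    int clean = 0;
    for (int i = 0; i < started; i++) {
        int status = 0;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            clean++;
        }
    }
    return clean;
}
```

The parent goes straight back to accept after each fork, so a slow client only ties up its own child. A preforking server would instead call fork first and let every child sit in accept on the same listening socket.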
Only root can open server ports numbered less than 1024 on most UNIX
systems. Two programs can't listen on the same server port--the
second program will get a socket error when it calls skt_server.
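Both the "port 0 means the OS picks" trick and the port-in-use error are easy to see with bare sockets. In this sketch open_listener is a hypothetical analogue of skt_server(&port), not its actual implementation; it listens on the loopback address only and deliberately does not set SO_REUSEADDR.

```cpp
#include <netinet/in.h>   // sockaddr_in, htons, ntohs, htonl
#include <sys/socket.h>   // socket, bind, listen, getsockname
#include <unistd.h>       // close
#include <cassert>
#include <cerrno>

// Hypothetical analogue of skt_server(&port): bind and listen on 127.0.0.1.
// Pass port 0 to let the OS choose; the real port is written back to port.
// Returns the listening socket, or -1 with errno set by the failing call.
int open_listener(unsigned short &port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    socklen_t len = sizeof(addr);

    if (bind(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0 ||
        listen(fd, 16) != 0 ||
        getsockname(fd, reinterpret_cast<sockaddr *>(&addr), &len) != 0) {
        int saved = errno;   // close() may overwrite errno, so keep a copy
        close(fd);
        errno = saved;
        return -1;
    }
    port = ntohs(addr.sin_port);  // report the port actually in use
    assert(port != 0);
    return fd;
}
```

If a second call asks for a port that the first listener still holds, bind fails and errno is EADDRINUSE. Asking for a port below 1024 without root normally fails with EACCES instead.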
Here's an example network server that serves exactly one client and then exits.
#include "osl/socket.h"
#include "osl/socket.cpp" /* include body for easy linking */
#include <iostream>
#include <string>
int foo(void)
{
    unsigned int port=8888;
    SERVER_SOCKET serv=skt_server(&port);
    std::cout<<"Waiting for connections on port "
        <<port<<"\n";
    skt_ip_t client_ip; unsigned int client_port;
    SOCKET s=skt_accept(serv,&client_ip,&client_port);
    std::cout<<"Connection from "
        <<skt_print_ip(client_ip)
        <<":"<<client_port<<"!\n";
    /* Receive some data from the client */
    std::string buf(3,'?');
    skt_recvN(s,(char *)&buf[0],3);
    std::cout<<"Client sent data '"<<buf<<"'\n";
    /* Send some data back to the client */
    skt_sendN(s,"gdaymate\n",9);
    skt_close(s);
    std::cout<<"Closed socket to client\n";
    skt_close(serv);
    return 0;
}
(executable NetRun link)
In NetRun, the server will just hang while waiting for connections by
default. If you visit the URL https://lawlor.cs.uaf.edu:8888/
while the program is running, you should see the gdaymate message!
Here's the corresponding client. Note that every receive in the server must be matched by a send in the client, and vice versa.
#include "osl/socket.h"
#include "osl/socket.cpp" /* include body for easy linking */
#include <iostream>
#include <string>
int foo(void)
{
    skt_ip_t ip=skt_lookup_ip("127.0.0.1");
    unsigned int port=8888;
    SOCKET s=skt_connect(ip,port,2);
    /* Send some data to the server */
    skt_sendN(s,"dUd",3);
    /* Receive some data from the server */
    std::string buf(8,'?');
    skt_recvN(s,(char *)&buf[0],8);
    std::cout<<"Server sent data '"<<buf<<"'\n";
    skt_close(s);
    std::cout<<"Closed socket to server\n";
    return 0;
}
You can also download this server and client program (directory, .zip, .tar.gz), and run them on your own machine.
It's easier to write network clients, and it's more common.
Network servers are more dangerous--anybody could connect to your
server, and send anything, so servers are usually trickier to get right.
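One common defense, sketched here with plain sockets, is to put a time limit on receives so a client that connects and then says nothing can't hang the server forever. This is an assumption-laden illustration: set_recv_timeout, recv_guarded, and RecvResult are hypothetical names, and the limit applies to each individual recv call, not to the whole message.

```cpp
#include <sys/socket.h>   // setsockopt, recv, SO_RCVTIMEO
#include <sys/time.h>     // timeval
#include <sys/types.h>    // ssize_t
#include <cassert>
#include <cerrno>
#include <cstddef>

// Make each recv() on fd give up after roughly millis milliseconds.
bool set_recv_timeout(int fd, int millis) {
    assert(millis > 0);
    timeval tv;
    tv.tv_sec = millis / 1000;
    tv.tv_usec = (millis % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
}

enum class RecvResult { ok, timed_out, closed, error };

// Hypothetical helper: read exactly len bytes, but report why it stopped
// if the client stalls, disconnects, or the socket fails.
RecvResult recv_guarded(int fd, char *buf, size_t len) {
    assert(buf != nullptr || len == 0);
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, buf + got, len - got, 0);
        if (n > 0) { got += static_cast<size_t>(n); continue; }
        if (n == 0) return RecvResult::closed;      // client hung up early
        if (errno == EINTR) continue;               // interrupted by a signal: retry
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return RecvResult::timed_out;           // client went silent
        }
        return RecvResult::error;
    }
    return RecvResult::ok;
}
```

A server would call set_recv_timeout on each accepted socket and drop any client whose read comes back as anything other than ok. The fixed len also means a client can never make the server read more than the buffer holds.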