TCP/IP Networking

CS 321 2007 Lecture, Dr. Lawlor

Background

A network is just a way of getting information from one machine to another.  This is a simple idea, which means that everybody in the world has tried to implement it from scratch--there are way too many networks out there, although thankfully the weirder ones are dying off.

You always start with a way to get bytes from one machine to the other.  For example, you can use the serial port, parallel port, or a network card to send and receive bytes.  Bytes actually physically sent between machines are said to be "on the wire", even if they're sent over a fiber optic cable or microwave radio link!

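Just sending bytes back and forth, however, is almost never enough.  You immediately find you need some way to say who each message is for, to tell which piece of a larger message you've got, and to detect errors.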
There are quite a few different ways to handle these issues.  The standard way to do this is to wrap all data in little "packets". A packet consists of a header, some data, and possibly a trailer.  The "header" indicates who the message is for, which piece of the message it is, and other housekeeping.  The trailer usually includes a checksum for error detection. 
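As a rough illustration of the header/data/trailer layout described above, here's a minimal sketch in C++.  The field names, the field sizes, and the simple additive checksum are all assumptions made up for this example; real protocols define their own layouts and generally use stronger checks, such as CRCs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

/* Hypothetical packet layout (not any real protocol's format):
   header:  1 byte destination, 1 byte sequence number, 1 byte data length
   data:    up to 255 bytes
   trailer: 1 byte checksum (sum of all earlier bytes, mod 256) */
typedef std::vector<std::uint8_t> bytes_t;

/* Sum the first n bytes of b, wrapping around at 256. */
std::uint8_t simple_checksum(const bytes_t &b, std::size_t n) {
    unsigned int sum = 0;
    for (std::size_t i = 0; i < n; i++) sum += b[i];
    return (std::uint8_t)(sum & 0xFF);
}

/* Wrap some data in a header and a trailer. */
bytes_t make_packet(std::uint8_t dest, std::uint8_t seq, const std::string &data) {
    assert(data.size() <= 255); /* the length has to fit in one byte */
    bytes_t p;
    p.push_back(dest);                        /* who the message is for */
    p.push_back(seq);                         /* which piece of the message this is */
    p.push_back((std::uint8_t)data.size());   /* housekeeping: data length */
    p.insert(p.end(), data.begin(), data.end());
    p.push_back(simple_checksum(p, p.size())); /* trailer */
    return p;
}

/* Returns true if the packet is well-formed and its checksum matches. */
bool packet_ok(const bytes_t &p) {
    if (p.size() < 4) return false;           /* 3 header bytes + 1 trailer byte */
    if (p.size() != 4u + p[2]) return false;  /* length field must agree */
    return simple_checksum(p, p.size() - 1) == p.back();
}
```

The receiver calls packet_ok before trusting anything in the packet.  Flipping any single bit of a packet built by make_packet makes packet_ok return false, but an additive checksum like this one misses some larger errors (two data bytes swapped, for instance).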

The International Organization for Standardization (ISO) defined a very complicated layered model for networking called the Open Systems Interconnection (OSI) model.  Almost nobody implements the thing, but the conceptual model is pretty popular.  The layers of the ISO OSI model are, from the bottom up: physical, data link, network, transport, session, presentation, and application.
People have built lots and lots of different networking interfaces.  Totally unique networking interfaces I've used include:
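Today, "the network" means TCP/IP, the standard protocol spoken on the internet.  TCP/IP is really at least three different protocols: IP, which moves individual packets from one machine to another; UDP, which sends unreliable individual messages on top of IP; and TCP, which provides reliable, ordered streams of bytes on top of IP.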
Both TCP and UDP let many different pieces of software on a single machine use the network at once.  This means an IP address alone isn't enough to specify who you're talking to--the IP address identifies the machine, and the "TCP port number" identifies the program running on that machine.  TCP port numbers are 16-bit unsigned integers, so there are 65,536 possible port numbers.  Zero is not a valid port number, and the low-numbered ports (below 1024) are often reserved for "well-known services", which usually require special privileges to open.
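To make the address-plus-port idea concrete, here's a small sketch.  The endpoint struct and the helper names are made up for illustration and aren't part of any real sockets API.

```cpp
#include <cstdint>
#include <string>

/* Hypothetical type for this example: a machine plus a program on it. */
struct endpoint {
    std::string ip;      /* identifies the machine, e.g. "127.0.0.1" */
    std::uint16_t port;  /* identifies the program; 16 bits, so 0..65535 */
};

/* Port 0 is not a valid port number.  The argument is a plain int so that
   out-of-range values can be rejected before they get truncated to 16 bits. */
bool port_is_valid(int port) {
    return port >= 1 && port <= 65535;
}

/* Ports below 1024 are often reserved for well-known services. */
bool port_is_well_known(int port) {
    return port_is_valid(port) && port < 1024;
}

/* Two endpoints match only if both the machine and the port match. */
bool same_endpoint(const endpoint &a, const endpoint &b) {
    return a.ip == b.ip && a.port == b.port;
}
```

For example, {"127.0.0.1", 80} and {"127.0.0.1", 25} name two different programs on the same machine.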

For the next week, we'll focus on TCP, since it's by far the most popular protocol for doing anything on the internet.  For example, the following all use TCP:

Writing TCP Code

One can imagine lots of programming interfaces for talking to the network, and there are in fact lots of totally different interfaces for talking via NetBIOS, AppleTalk, etc.  But surprisingly there's basically only one major programming interface used for talking on a TCP/IP network, and that's "Berkeley sockets", the original UNIX interface as implemented by the good folks at UC Berkeley.

The Berkeley sockets interface is implemented in:
Brian Hall, or "Beej", maintains the definitive readable introduction to Berkeley sockets programming, Beej's Guide to Network Programming.  He's got a zillion examples and a readable style.  Go there.

Bare Berkeley sockets are pretty tricky and ugly, especially for creating connections.  The problem is Berkeley sockets support all sorts of other protocols, addressing modes, and other features like "raw sockets" (that have serious security implications!).  But when I write TCP code, I find it a lot easier to use my own little library of public domain utility routines called "socket.h".  It's way too nasty to write portable Berkeley code for basic TCP, so I'll give examples using my library. 

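My library uses a few funny datatypes.  The two that show up in the example below are skt_ip_t, which holds an IP address, and SOCKET, which represents an open connection.
To connect to a server "serverName" at TCP port 80, and send some data to it, you'd look up the server's IP address with skt_lookup_ip, connect with skt_connect, send the data with skt_sendN, and finish up with skt_close.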
Here's an example in NetRun:
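#include "osl/socket.h" /* <- Dr. Lawlor's funky networking library */
#include "osl/socket.cpp"

int foo(void) {
    skt_ip_t ip=skt_lookup_ip("127.0.0.1"); /* look up the server's IP address */
    unsigned int port=80; /* TCP port to connect to */
    SOCKET s=skt_connect(ip,port,2); /* open the connection (the 2 is presumably a timeout) */
    skt_sendN(s,"hello",5); /* send exactly 5 bytes */
    skt_close(s); /* hang up */
    return 0;
}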

Easy, right?  The same program is a great deal longer in pure Berkeley sockets, since you've got to deal with error handling (and not all errors are fatal!), a long and complicated address setup process, etc.
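For comparison, here's a sketch of roughly the same connect-and-send sequence written against the Berkeley sockets calls directly.  It assumes a POSIX system (Windows needs winsock2.h, WSAStartup, and closesocket instead), it only accepts a dotted-decimal IPv4 address, and it leaves out the connection timeout.  The function name send_hello is made up for this example.

```cpp
#include <arpa/inet.h>   /* htons, inet_pton */
#include <netinet/in.h>  /* sockaddr_in */
#include <sys/socket.h>  /* socket, connect, send */
#include <unistd.h>      /* close */
#include <cerrno>
#include <cstring>

/* Connect to ip:port over TCP and send "hello".
   Returns 0 on success, -1 on any error. */
int send_hello(const char *ip, unsigned short port) {
    /* Address setup: fill in an IPv4 socket address by hand. */
    struct sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);              /* network byte order */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) return -1;

    int s = socket(AF_INET, SOCK_STREAM, 0);  /* SOCK_STREAM means TCP */
    if (s < 0) return -1;

    if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(s);
        return -1;
    }

    /* send() may accept only part of the data, so loop until it's all gone.
       Not all errors are fatal: EINTR just means "try again". */
    const char *data = "hello";
    size_t left = 5;
    while (left > 0) {
        ssize_t n = send(s, data, left, 0);
        if (n < 0) {
            if (errno == EINTR) continue;
            close(s);
            return -1;
        }
        data += n;
        left -= (size_t)n;
    }

    close(s);
    return 0;
}
```

Calling send_hello("127.0.0.1", 80) does about what the NetRun example does, minus the hostname lookup and the timeout, and it's already several times longer.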

This same code works in Windows, too.  On NetRun, "Download this file as a .tar archive" to get the socket.h and socket.cpp files, or download them here.