Intruder Alert - Tracking down a rogue connection
The hosts I administer look through their logfiles each hour, looking for things that are out of the ordinary, and mailing them to me. This is where my uncanny ability to remember leap-seconds comes from; it's not because I actually care about things, but because I'll get an e-mail saying that a host suddenly found itself a whole second out of sync with some incredibly accurate clock somewhere. Lucky me.
Most things don't end up in my log digests, because most things are boring. The things I do see are things that I either care about, or have never seen before.
A few days ago I got an e-mail that one machine was trying to contact a particular address, 172.16.45.35. What made this noteworthy is that address is unroutable. It's a reserved, private address space. It doesn't go anywhere on the Internet, and it's not used by us internally, so there should be no reason to try and contact it. The connection indicated it would have been an outgoing web request, and since I was busy working on other things, I assumed that some other fool had set up their system incorrectly, and thought nothing of it. People leave references to their own internal sites in documents all the time.
A few days later I got another e-mail, same result. And then another, the next day, and another. Each time I looked a little closer. About the same time each day, a few attempts to contact this address, and then nothing.
Today, this bothered me. What if we're seeing these packets because there's something running on this machine that shouldn't be? So I go to my proxy logs, and do a search for the address. Nothing matches.
Hmm, that's odd. Let's see what's in our name-server cache, since the address is probably the result of a name lookup. kill -INT on your named will let you see its memory cache, a great trick to remember. Nothing in here, either, but it's now been hours since I got the mail, so the record may well have expired.
What's odd about this connection is that it seems to happen around lunchtime, but not every day, and not always exactly the same time, and sometimes it misses days, so I don't really know if or when I'll ever see it again. Rather than trying to futilely trying to find it minutes after it occurs, I figure that I'll set something running to catch it in the act:
lsof -r1 +c0 -ni@172.16.45.35 | grep --line-buffered -v ======= | tee 172.16.45.35-watch
Every second I'll look for anything involving that address using lsof, remove the boring separators, and chuck it into a file and STDOUT. The --line-buffered option is important so that grep doesn't delay the flow of data.
While I've got that running, I whip up a bit of perl to watch the output, and grab fingerprints of the process when we spot it; most of the contents of the relevant entry in /proc will do nicely.
I leave my tools to run in a screen session (which will preserve them even if my terminal goes away), make dinner, have an extremely nice home-brew beer, and watch some TV. In this case, House MD.
As is common in many episodes of House, the patient is sick with a disease that should have been obvious, since they test for that particular disease in every single episode (expect this one). Not long afterwards, it dawns upon me what my problem may be.
I go to the proxy logs, and I look up all the addresses of all the sites accessed today:
perl -pe'my @f = split(/\s+/,$_); $_ = $f[6]; s{^http://}{}; s{[:/].*}{}; $_ .= "\n"' access.log | sort | uniq | perl -MIO::Handle -MNet::DNS -e'my $dns = Net::DNS::Resolver->new; STDOUT->autoflush(1); while (<>) { chomp; print "== $_ ==\n"; my $q = $dns->search($_); if ($q) { foreach my $rr ($q->answer) { print $rr->address,"\n" if $rr->type eq "A" } } else { warn $dns->errorstring; } }' | tee /tmp/squid-dns
Sure enough, after a few moments my answer pops out! The address proof.smh.com.au resolves to 172.16.45.35. SMH is the Sydney Morning Herald, and it looks like some of their pages refer to an internal machine, which my reporting system is dutifully reporting as odd. Sure enough, all the dates and times in the squid logfiles match up nicely with the packet intercept reports I was sent.
The reason it wasn't showing up when I first searched is that squid only records IP addresses when it needs to make a direct connection. In this case, it tried to make the connection and failed, but rather than giving up it then proceeds to try and hand the problem off to an upstream cache, and that's what gets reporting in the logfile.
This is good for me, since it means that the odd connections really are what I thought they would be, and not a sign of unwanted activity on one of my machines. I'm doubly glad because it would be especially embarrassing to find a rootkit that's so badly written that it tries connecting to a completely unroutable address.