The Russian Spam Mafia Wants Me Probed

scrottie on 2004-11-14T07:30:54

I'm not sure exactly what I did - whether I reported some fraudulent spam and they figured it out, or (more likely) I clogged a formmail.pl hole while working for a client, or they're just convinced that my machine is a major ISP with thousands of users - but thousands and thousands of Russian and non-Russian machines are mounting a massive, coordinated attack to probe me. That is, they're trying to deliver mail to every user name anyone has seen anywhere - at @illogics.org.

I'm not just talking about a lot of machines probing me - I'm talking about a massive, carefully coordinated probing. About 5 machines go at it at once - as soon as one of them is firewalled, another immediately takes its place. After putting up with this for a while it started to clog incoming mail, so I wrote a little script to go through the postfix log (as soon as I dumped sendmail, which only logs useless information), find repeated attempts from hosts to deliver to nonexistent users, and write out firewall rules I can add to my firewall config file for NetBSD. I would run this every few days when the mood hit me, but it was completely ineffective - the firewalled machines were immediately replaced with new ones.

Hundreds of machines went by this way before I set cron to run a version of the script on the hour that automatically added the new rules to the firewall. Now hundreds of thousands of machines have been firewalled for this reason. To make the grade, 3 delivery attempts to nonexistent users (and postfix specifies nonexistent users as a fatal error so there is no excuse) in an hour gets you firewalled on port 25. With hundreds of thousands of firewall rules, the machine was spending most of its time in the kernel processing firewall rules before I set it to only do that on connections to port 25 (which are, by definition, new connections), and that helped greatly.

And there's no end in sight. I'll have to start running the script more often - perhaps daemonize it and make it follow the log and ban things in real time. At this point, I'm curious how many zombie machines these Russians have. Anyway, for your reading pleasure, here's the portion of my firewall that's automatically generated from this anti-user-probing script. Let me know if you want the script and I'll post a copy (too lazy). Oh - I almost forgot the kicker - I'm running fingerd, so anyone could easily finger the machine and see who the users are.

-scott


Use smarter masks....

Limbic Region on 2004-11-14T14:33:02

other than /32

I know you said you have hundreds of thousands, but the sample spamreport.txt only has a little over 20K. Even still, doing just a cursory glance you would be a lot better off not blocking individual IPs.

#!/usr/bin/perl
use strict;
use warnings;

my $spam = $ARGV[0] || 'spamreport.txt';
open (SPAM, '<', $spam) or die "Unable to open $spam for reading : $!";

print
map { $_->[4] }
sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1]
                           ||
       $a->[2] <=> $b->[2] || $a->[3] <=> $b->[3] }
map  { [ m|(\d+)\.(\d+)\.(\d+)\.(\d+)/|, $_ ] } <SPAM>;

I think if you used better masks, perhaps by checking http://www.arin.net/ to see who owns the block and how big it is, you would be able to reduce the total number of rules quite dramatically.
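As a rough illustration of what wider masks buy you, here is a minimal sketch that collapses listed hosts into a /24 whenever enough of them share one. It only groups by prefix, which is a cruder stand-in for looking up the real allocation at the registry. The `aggregate` function, the threshold of 3 hosts, and every sample address except 66.94.77.249 are assumptions made for the example; the rule syntax is the one the generated firewall rules use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical threshold: collapse a /24 once this many of its hosts are listed.
my $MIN_HOSTS = 3;

# Takes a threshold and a list of dotted-quad IPs, returns sorted CIDR blocks.
sub aggregate {
    my ($min_hosts, @ips) = @_;
    my %by_net;
    for my $ip (@ips) {
        my ($net, $host) = $ip =~ /^(\d+\.\d+\.\d+)\.(\d+)$/ or next;
        $by_net{$net}{$host} = 1;    # a hash drops duplicate addresses
    }
    my @blocks;
    for my $net (keys %by_net) {
        my @hosts = keys %{ $by_net{$net} };
        if (@hosts >= $min_hosts) {
            push @blocks, "$net.0/24";
        } else {
            push @blocks, map { "$net.$_/32" } @hosts;
        }
    }
    # Sort by packed address so that 9.x comes before 10.x.
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ pack('C4', /(\d+)\.(\d+)\.(\d+)\.(\d+)/), $_ ] } @blocks;
}

# Sample input (made up, apart from the first address).
my @sample = qw(66.94.77.249 66.94.77.10 66.94.77.11 10.1.2.3 9.8.7.6);
print "block in quick proto tcp from $_ to any port = 25\n"
    for aggregate($MIN_HOSTS, @sample);
```

The trade-off is that a /24 rule also blocks any innocent neighbours in that range, so a low threshold is only sensible for port 25 and for ranges you would not expect legitimate mail from.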

clarification and thanks

scrottie on 2004-11-14T16:34:27

Ah, yes, I should have been more careful with my numbers. And language. Hundreds of thousands of IPs have been banned but not by that script and thus not in that output; they were banned by another script that (guess what) hit ARIN and blocked the entire netblock. I don't want to do this too hastily (in other words, I don't want to do this automatically) so the automatic script is doing /32's. Since there is some passing interest from someone (even if that interest is just stimulating a clarification), here's code (whee!).

This tallies how many people, and how much of the 'net, I've firewalled:
#!/usr/bin/perl

use bigint;

my $count = 0;

while(my $ip = <>) {

chomp $ip; my $mask;

($ip, $mask) = $ip =~ m{block in quick from ([0-9.]+)/([0-9]+) to any};

next unless $ip and $mask;

my $numhosts = 1
Wasn't that fun? Okay, the hits on RIPE are regexes against HTML so I'll not post that thank me very much to avoid corrupting the young. And here's the thing that runs from cron and looks at postfix's log:


#!/usr/bin/perl

use IO::Handle; use POSIX;

# process all bans for the recently passed out (10 minutes ago)

my $timestamp = strftime "%b %e %H:", localtime(time() - 600); # eg, "Oct 7 02:"

my $recv; my $count = 0;

open my $spam, ' 10000*80) { # if longer than about 10,000 "lines", seek relative the end print "Seeking relative the end - long file\n"; seek $spam, - 10000*80, 2; ; }

while(my $log = <$spam>) { last if $timestamp eq substr $log, 0, length $timestamp; }

while(my $log = <$spam>) { # Aug 30 11:09:35 straylight postfix/smtpd[17179]: NOQUEUE: reject: RCPT from mail.marvelconsultants.com[66.94.77.249]: 450 : Recipient address rejected: User unknown in local recipient table; from= to= proto=ESMTP helo=

next unless $log =~ m/User unknown in local recipient table/;

(my $rechost, my $recip) = $log =~ m/reject: RCPT from ([a-z0-9.-]+)\[([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\]/i; next unless $recip;

$spammers{$recip}->[0]++; $spammers{$recip}->[1] ||= $rechost; $spammers{$recip}->[2] ||= $recip;

$count++;

}

print("processed $count messages\n");

my @spammers = sort { $b->[0] <=> $a->[0] } values %spammers; foreach my $spammer (@spammers) { last if $spammer->[0] print(sprintf "block in quick proto tcp from %s/32 to any port = 25 # %d: %s\n", $spammer->[2], $spammer->[0], $spammer->[1]); $pipe->flush; $pipe->close; printf("block in quick proto tcp from %s/32 to any port = 25 # %d: %s\n", $spammer->[2], $spammer->[0], $spammer->[1]); }
That's pretty damn banal. The story itself is a lot more interesting. Here are my insipid full firewall rules which don't include most of the things firewall rules usually do... so back to that! Running the first script on the firewall rules, I get these figures:


total: 2662472 banished hosts that happens to be about 1 in every 1613 hosts that are banished
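Since the tally script is cut off after `my $numhosts = 1`, here is a minimal sketch of the arithmetic it appears to be doing: each rule with a /N mask covers 2^(32-N) addresses, and the "1 in every" figure is 2^32 divided by the total (2^32 / 2662472 is about 1613, which matches the output above). The `count_banished` function and the sample rules are assumptions. Plain floating point stands in for `bigint`, which is exact for totals this size, and the optional `proto tcp` is matched after `quick`, the way the cron script prints its rules.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns the number of IPv4 addresses covered by the "block in" rules given.
# Overlapping rules are counted twice; this sketch makes no attempt to merge them.
sub count_banished {
    my @rules = @_;
    my $total = 0;
    for my $rule (@rules) {
        my ($ip, $mask) = $rule =~
            m{block in quick (?:proto tcp )?from ([0-9.]+)/([0-9]+) to any}
            or next;
        next if $mask > 32;              # not a valid IPv4 prefix length
        $total += 2 ** (32 - $mask);     # a /N mask covers 2^(32-N) addresses
    }
    return $total;
}

# Sample rules (made up for the example).
my @sample = (
    'block in quick from 10.0.0.0/8 to any',
    'block in quick proto tcp from 66.94.77.249/32 to any port = 25',
);
my $total = count_banished(@sample);
printf "total: %d banished hosts, about 1 in every %d IPv4 addresses\n",
    $total, 2 ** 32 / $total if $total;
```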



I wouldn't mind knowing how to firewall off all of Russia if you have any thoughts on abusing ARIN ;)

Re:clarification and thanks

scrottie on 2004-11-14T16:43:58

Okay, the code tags didn't do what I wanted... let's try pre!

#!/usr/bin/perl

use bigint;

my $count = 0;

while(my $ip = <>) {

chomp $ip; my $mask;

($ip, $mask) = $ip =~ m{block in (?:proto tcp )?quick from ([0-9.]+)/([0-9]+) to any} or print "can't parse: $ip\n";

next unless $ip and $mask;

my $numhosts = 1

And then...

#!/usr/bin/perl

use IO::Handle; use POSIX;

# process all bans for the recently passed out (10 minutes ago)

my $timestamp = strftime "%b %e %H:", localtime(time() - 600); # eg, "Oct 7 02:"

my $recv; my $count = 0;

open my $spam, ' 10000*80) { # if longer than about 10,000 "lines", seek relative the end print "Seeking relative the end - long file\n"; seek $spam, - 10000*80, 2; ; }

while(my $log = <$spam>) { last if $timestamp eq substr $log, 0, length $timestamp; }

while(my $log = <$spam>) { # Aug 30 11:09:35 straylight postfix/smtpd[17179]: NOQUEUE: reject: RCPT from mail.marvelconsultants.com[66.94.77.249]: 450 : Recipient address rejected: User unknown in local recipient table; from= to= proto=ESMTP helo=

next unless $log =~ m/User unknown in local recipient table/;

(my $rechost, my $recip) = $log =~ m/reject: RCPT from ([a-z0-9.-]+)\[([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\]/i; next unless $recip;

$spammers{$recip}->[0]++; $spammers{$recip}->[1] ||= $rechost; $spammers{$recip}->[2] ||= $recip;

$count++;

}

print("processed $count messages\n");

my @spammers = sort { $b->[0] <=> $a->[0] } values %spammers; foreach my $spammer (@spammers) { last if $spammer->[0] print(sprintf "block in quick proto tcp from %s/32 to any port = 25 # %d: %s\n", $spammer->[2], $spammer->[0], $spammer->[1]); $pipe->flush; $pipe->close; printf("block in quick proto tcp from %s/32 to any port = 25 # %d: %s\n", $spammer->[2], $spammer->[0], $spammer->[1]); }
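The script as posted lost pieces to the markup, including the log file path and the pipe that feeds the firewall, so here is a minimal sketch of just the counting step. It works on lines already in memory and reuses the two regexes and the rule format from the post. The `rules_from_log` function, the threshold variable (set to the 3 attempts mentioned in the original entry), and the `host.example` sample line are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Threshold from the original entry: 3 rejected deliveries gets a host banned.
my $THRESHOLD = 3;

# Takes a threshold and log lines, returns rules for hosts at or over it.
sub rules_from_log {
    my ($threshold, @lines) = @_;
    my %spammers;    # ip => [ count, hostname ]
    for my $log (@lines) {
        next unless $log =~ m/User unknown in local recipient table/;
        my ($host, $ip) = $log =~
            m/reject: RCPT from ([a-z0-9.-]+)\[([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\]/i
            or next;
        $spammers{$ip}[0]++;
        $spammers{$ip}[1] ||= $host;
    }
    my @rules;
    # Worst offenders first; ties broken by IP string so the output is stable.
    for my $ip (sort { $spammers{$b}[0] <=> $spammers{$a}[0] || $a cmp $b }
                keys %spammers) {
        my ($count, $host) = @{ $spammers{$ip} };
        last if $count < $threshold;
        push @rules, sprintf "block in quick proto tcp from %s/32 to any port = 25 # %d: %s",
            $ip, $count, $host;
    }
    return @rules;
}

# Sample lines modelled on the log excerpt in the script's comment.
my $line = 'Aug 30 11:09:35 straylight postfix/smtpd[17179]: NOQUEUE: reject: '
         . 'RCPT from %s[%s]: 450 : Recipient address rejected: '
         . 'User unknown in local recipient table; proto=ESMTP';
my @sample = (
    (sprintf($line, 'mail.marvelconsultants.com', '66.94.77.249')) x 3,
    sprintf($line, 'host.example', '192.0.2.7'),
);
print "$_\n" for rules_from_log($THRESHOLD, @sample);
```

Keeping the counting separate from the file handling and the firewall reload makes it easy to try against a saved chunk of log before letting cron act on the result.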

Also also, I should mention I have an ulterior motive: I have a concept I'm playing with for pre-emptive blacklisting based on the idea of aggregates, sort of like Google sets, or "people who bought that also bought these". I posted a list of the top spammers for Phoenix Perl Mongers, as part of a presentation on spam filtering, and Google brings in a lot of hits - a surprising number of hits - for this document. Sysadmins reportedly search for their own domains; people fix open relays then seek to have them removed from blacklists; etc. I had another idea - what if I Google for the IP of a known spam source and then suck down every hit, tally up occurrences of other IPs in all of those documents, and then assume that other IPs that tend to appear in proximity to the spammers are also spammers? Voila, instantly distributed mail-abuse.org style black hole! People need only run some stats, however they see fit, on spammers spamming their domains, and other people (or the same people) can suck this down, process it, and use it. I'll play with quotations the same way - Google for an exact quote to see what other people who like that quote have in their quote collections. So, to this end, I just wanted to generate some juice for my updated, automatically generated list >=)
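The tallying half of that idea can be sketched without touching Google at all. The following is a minimal sketch, assuming the search hits have already been fetched into strings: it counts, for every other IP, how many of the documents that mention the known spam source also mention it. The `cooccurring_ips` function and the sample documents (which use reserved documentation addresses) are made up for the example, and no search API is involved.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Counts in how many documents each other IP appears alongside $known.
sub cooccurring_ips {
    my ($known, @documents) = @_;
    my %seen_in;
    for my $doc (@documents) {
        # Distinct dotted quads in this document.
        my %ips = map { ($_ => 1) } $doc =~ /\b(\d{1,3}(?:\.\d{1,3}){3})\b/g;
        next unless delete $ips{$known};    # skip documents without the known source
        $seen_in{$_}++ for keys %ips;
    }
    return %seen_in;
}

# Stand-ins for fetched search hits.
my @docs = (
    'blacklist: 192.0.2.1 198.51.100.7 203.0.113.9',
    'top spammers: 192.0.2.1, 198.51.100.7',
    'unrelated page mentioning 203.0.113.50',
);
my %score = cooccurring_ips('192.0.2.1', @docs);
printf "%-15s seen in %d document(s) with the known source\n", $_, $score{$_}
    for sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;
```

Counting documents rather than raw mentions keeps one long blacklist page from dominating the score; how high a count has to be before an IP is treated as a spammer is left open here.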

-scott

Re:clarification and thanks

bart on 2004-11-14T23:28:04

Okay, the code tags didn't do what I wanted... let's try pre!

Yuck! Don't people have a "preview" button any more? Or read the help text under the textarea, when entering their post?

Try "<ecode>", it'll preserve your formatting.