I'm not sure exactly what I did - if I reported
some fraudulent spam and they figured it out,
or (more likely) while working for a client,
I clogged a formmail.pl hole, or if they're
just convinced that my machine is a major ISP
with thousands of users, but thousands and
thousands of Russian and non-Russian machines
are mounting a massive, coordinated attack to
probe me. That is, they're trying to
deliver mail to every user name anyone has
seen anywhere - at @illogics.org.
I'm not just talking about a lot of machines
probing me - I'm talking about a massive,
carefully coordinated probing. About 5 machines
go at it at once - as soon as one of them is
firewalled, another immediately takes its place.
After putting up with this for a while it
started to clog incoming mail, so I wrote a little
script to go through the postfix log
(as soon as I dumped sendmail, which only
logs useless information), find repeated
attempts from hosts to deliver to non-existent
users, and write out firewall rules I can
add to my firewall config file for NetBSD.
I would run this every few days when the
mood hit me, but it was completely ineffective -
the firewalled machines were immediately
replaced with new ones. Hundreds of machines
went by this way before I set cron to run
a version of the script on the hour that
automatically added the new rules to the
firewall. Now hundreds of thousands
of machines have been firewalled for this
reason. To make the grade, 3 delivery attempts
to non-existent users (and postfix specifies
non-existent users as a fatal error so there
is no excuse) in an hour gets you firewalled
on port 25. With hundreds of thousands of
firewall rules, the machine was spending most
of its time in the kernel processing firewall
rules before I set it to only do that on
connections to port 25 (which are, by
definition, new connections), and that helped
greatly. And there's no end in sight. I'll
have to start running the script more often -
perhaps daemonize it and make it follow the
log and ban things in real time. At this
point, I'm curious how many zombie machines
these Russians have. Anyway, for your
reading pleasure, here's the portion of
my firewall that's automatically generated
from this anti-user-probing script. Let me know if you want
the script and I'll post a copy (too lazy).
Oh - I almost forgot the kicker - I'm running
fingerd, so anyone could easily finger the
machine and see who the users are.
-scott
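For the "follow the log and ban things in real time" idea, here is a minimal sketch. It is not the script described above. It assumes reject events arrive one at a time (for example, parsed from lines fed by `tail -F` on the postfix log) and keeps a sliding one-hour window per address. The helper name `make_tracker` and the demo addresses are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: remember the times of recent rejects per IP and
# report when an address reaches $limit strikes inside $window seconds.
sub make_tracker {
    my ($limit, $window) = @_;
    my %seen;    # ip => [ epoch seconds of recent rejects, oldest first ]
    return sub {
        my ($ip, $now) = @_;
        my $times = $seen{$ip} ||= [];
        push @$times, $now;
        # drop strikes that have aged out of the window
        shift @$times while @$times && $times->[0] <= $now - $window;
        return @$times >= $limit;
    };
}

# Demo with made-up events: [ address, seconds since start ].
my $strike = make_tracker(3, 3600);
my @events = (
    ['192.0.2.7',    0],
    ['192.0.2.7',    10],
    ['198.51.100.9', 20],
    ['192.0.2.7',    30],
);
for my $event (@events) {
    my ($ip, $when) = @$event;
    print "block in quick proto tcp from $ip/32 to any port = 25\n"
        if $strike->($ip, $when);
}
```

In a real daemon the tracker would also need to remember which addresses it has already banned, since this sketch reports every strike at or past the limit.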
I know you said you have hundreds of thousands, but the sample spamreport.txt only has a little over 20K. Even still, doing just a cursory glance you would be a lot better off not blocking individual IPs.
#!/usr/bin/perl
use strict;
use warnings;
my $spam = $ARGV[0] || 'spamreport.txt';
open (SPAM, '<', $spam) or die "Unable to open $spam for reading : $!";
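# Sort the rules numerically by octet so neighbouring addresses line up,
# then print the original lines back out.
print map { $_->[4] }
sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1]
||
$a->[2] <=> $b->[2] || $a->[3] <=> $b->[3] }
map { [ m|(\d+)\.(\d+)\.(\d+)\.(\d+)/|, $_ ] } <SPAM>;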
I think if you used better masks, perhaps by checking http://www.arin.net/ to see who owns the block and how big it is, you would be able to reduce the total number of rules quite dramatically.
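A rough sketch of the rule-count reduction, assuming a much cruder heuristic than an ARIN lookup: when enough banned hosts share a /24, replace them with a single /24 rule. The function name `collapse_to_24` and the threshold are hypothetical, and a /24 picked this way may not match who actually owns the block.

```perl
use strict;
use warnings;

# Collapse /32 entries into one /24 entry when at least $min banned
# hosts fall inside the same /24; otherwise keep the individual hosts.
sub collapse_to_24 {
    my ($min, @ips) = @_;
    my %by_net;    # "a.b.c" => [ full addresses seen in that /24 ]
    for my $ip (@ips) {
        my ($net) = $ip =~ /^(\d+\.\d+\.\d+)\.\d+$/ or next;
        push @{ $by_net{$net} }, $ip;
    }
    my @blocks;
    for my $net (sort keys %by_net) {
        my @hosts = @{ $by_net{$net} };
        push @blocks, @hosts >= $min ? "$net.0/24" : map("$_/32", @hosts);
    }
    return @blocks;
}

# Demo with made-up addresses: two hosts share 192.0.2.0/24, one stands alone.
print "block in quick proto tcp from $_ to any port = 25\n"
    for collapse_to_24(2, '192.0.2.10', '192.0.2.77', '198.51.100.5');
```

The trade-off is collateral damage: a low threshold bans innocent neighbours, which is why looking up the real allocation is the safer way to choose a mask.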
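#!/usr/bin/perl
use bigint;
my $count = 0;
while(my $ip = <>) {
chomp $ip;
my $mask;
($ip, $mask) = $ip =~ m{block in quick from ([0-9.]+)/([0-9]+) to any};
next unless $ip and $mask;
# The post cut this script off after "1"; the rest is rebuilt from the
# figures it printed (a total, and 1 in every N hosts).
# A /$mask rule covers 2**(32 - $mask) addresses.
my $numhosts = 1 << (32 - $mask);
$count += $numhosts;
}
print "total: $count banished hosts\n";
print "that happens to be about 1 in every ", 2**32 / $count, " hosts that are banished\n";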
Wasn't that fun? Okay, the hits on RIPE are regexes against HTML, so I'll not post that (thank me very much) to avoid corrupting the young.
And here's the thing that runs from cron and looks at postfix's log:
#!/usr/bin/perl
use IO::Handle;
use POSIX;
# The log path and the command each rule is piped to were lost from this
# post, so here they come from the command line: script LOGFILE COMMAND...
my ($logfile, @loader) = @ARGV;
# process all bans for the recently passed hour (10 minutes ago)
my $timestamp = strftime "%b %e %H:", localtime(time() - 600); # eg, "Oct 7 02:"
my $recv;
my $count = 0;
open my $spam, '<', $logfile or die "can't open $logfile: $!";
if(-s $spam > 10000*80) {
# if longer than about 10,000 "lines", seek relative the end
print "Seeking relative the end - long file\n";
seek $spam, - 10000*80, 2;
<$spam>; # throw away the partial line we landed in
}
# skip ahead to the first entry from the hour being processed
while(my $log = <$spam>) {
last if $timestamp eq substr $log, 0, length $timestamp;
}
while(my $log = <$spam>) {
# Aug 30 11:09:35 straylight postfix/smtpd[17179]: NOQUEUE: reject: RCPT from mail.marvelconsultants.com[66.94.77.249]: 450 : Recipient address rejected: User unknown in local recipient table; from= to= proto=ESMTP helo=
next unless $log =~ m/User unknown in local recipient table/;
(my $rechost, my $recip) = $log =~ m/reject: RCPT from ([a-z0-9.-]+)\[([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\]/i;
next unless $recip;
$spammers{$recip}->[0]++;
$spammers{$recip}->[1] ||= $rechost;
$spammers{$recip}->[2] ||= $recip;
$count++;
}
print("processed $count messages\n");
my @spammers = sort { $b->[0] <=> $a->[0] } values %spammers;
foreach my $spammer (@spammers) {
# threshold rebuilt from the "3 delivery attempts in an hour" rule
last if $spammer->[0] < 3;
open my $pipe, '|-', @loader or die "can't run @loader: $!";
$pipe->print(sprintf "block in quick proto tcp from %s/32 to any port = 25 # %d: %s\n", $spammer->[2], $spammer->[0], $spammer->[1]);
$pipe->flush;
$pipe->close;
printf("block in quick proto tcp from %s/32 to any port = 25 # %d: %s\n", $spammer->[2], $spammer->[0], $spammer->[1]);
}
That's pretty damn banal.
The story itself is a lot more interesting.
Here are my insipid full firewall rules which don't include most of the things firewall rules usually do... so back to that! Running the first script on the firewall rules, I get these figures:
total: 2662472 banished hosts
that happens to be about 1 in every 1613 hosts that are banished
I wouldn't mind knowing how to firewall off all of Russia if you have any thoughts on abusing ARIN ;)
Re:clarification and thanks
scrottie on 2004-11-14T16:43:58
Okay, the code tags didn't do what I wanted... let's try pre!
#!/usr/bin/perl
use bigint;
my $count = 0;
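while(my $ip = <>) {
chomp $ip; my $mask;
($ip, $mask) = $ip =~ m{block in (?:proto tcp )?quick from ([0-9.]+)/([0-9]+) to any} or print "can't parse: $ip\n";
next unless $ip and $mask;
# The post cut this script off after "1"; the rest is rebuilt from the
# figures it printed (a total, and 1 in every N hosts).
# A /$mask rule covers 2**(32 - $mask) addresses.
my $numhosts = 1 << (32 - $mask);
$count += $numhosts;
}
print "total: $count banished hosts\n";
print "that happens to be about 1 in every ", 2**32 / $count, " hosts that are banished\n";
And then...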
#!/usr/bin/perl
use IO::Handle; use POSIX;
# The log path and the command each rule is piped to were lost from this
# post, so here they come from the command line: script LOGFILE COMMAND...
my ($logfile, @loader) = @ARGV;
# process all bans for the recently passed hour (10 minutes ago)
my $timestamp = strftime "%b %e %H:", localtime(time() - 600); # eg, "Oct 7 02:"
my $recv; my $count = 0;
open my $spam, '<', $logfile or die "can't open $logfile: $!";
if(-s $spam > 10000*80) {
# if longer than about 10,000 "lines", seek relative the end
print "Seeking relative the end - long file\n";
seek $spam, - 10000*80, 2;
<$spam>; # throw away the partial line we landed in
}
# skip ahead to the first entry from the hour being processed
while(my $log = <$spam>) { last if $timestamp eq substr $log, 0, length $timestamp; }
while(my $log = <$spam>) {
# Aug 30 11:09:35 straylight postfix/smtpd[17179]: NOQUEUE: reject: RCPT from mail.marvelconsultants.com[66.94.77.249]: 450 : Recipient address rejected: User unknown in local recipient table; from= to= proto=ESMTP helo=
next unless $log =~ m/User unknown in local recipient table/;
(my $rechost, my $recip) = $log =~ m/reject: RCPT from ([a-z0-9.-]+)\[([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\]/i;
next unless $recip;
$spammers{$recip}->[0]++; $spammers{$recip}->[1] ||= $rechost; $spammers{$recip}->[2] ||= $recip;
$count++;
}
print("processed $count messages\n");
my @spammers = sort { $b->[0] <=> $a->[0] } values %spammers;
foreach my $spammer (@spammers) {
# threshold rebuilt from the "3 delivery attempts in an hour" rule
last if $spammer->[0] < 3;
open my $pipe, '|-', @loader or die "can't run @loader: $!";
$pipe->print(sprintf "block in quick proto tcp from %s/32 to any port = 25 # %d: %s\n", $spammer->[2], $spammer->[0], $spammer->[1]);
$pipe->flush;
$pipe->close;
printf("block in quick proto tcp from %s/32 to any port = 25 # %d: %s\n", $spammer->[2], $spammer->[0], $spammer->[1]);
}
Also also, I should mention I have an ulterior motive: I have a concept I'm playing with for pre-emptive blacklisting based on the idea of aggregates, sort of like Google Sets, or "people who bought that also bought these". As part of a presentation on spam filtering for Phoenix Perl Mongers, I posted a list of the top spammers, and Google brings in a lot of hits - a surprising number of hits - for this document. Sysadmins reportedly search for their own domains; people fix open relays and then seek to have them removed from blacklists; etc. I had another idea - what if I Google for the IP of a known spam source and then suck down every hit, tally up occurrences of other IPs in all of those documents, and then assume that other IPs that tend to appear in proximity to the spammers are also spammers? Voila, an instantly distributed mail-abuse.org style black hole! People need only run some stats, however they see fit, on spammers spamming their domains, and other people (or the same people) can suck this down, process it, and use it. I'll play with quotations the same way - Google for an exact quote to see what other people who like that quote have in their quote collections. So, to this end, I just wanted to generate some juice for my updated, automatically generated list >=)
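The tallying step might look something like this minimal sketch. It assumes the search hits have already been fetched into plain strings (the fetching is left out), and it counts how many documents mention each other IPv4 address alongside the known spam source. The function name `cooccurring_ips` and the sample documents are made up for illustration.

```perl
use strict;
use warnings;

# For every other IPv4-looking string, count how many documents mention
# it together with the known spam source.
sub cooccurring_ips {
    my ($known, @documents) = @_;
    my %tally;
    for my $doc (@documents) {
        # unique addresses in this document
        my %in_doc = map { $_ => 1 } $doc =~ /\b(\d{1,3}(?:\.\d{1,3}){3})\b/g;
        # only documents that mention the known source count
        next unless delete $in_doc{$known};
        $tally{$_}++ for keys %in_doc;
    }
    return %tally;
}

# Demo with made-up documents and documentation-range addresses.
my %tally = cooccurring_ips(
    '192.0.2.1',
    'spam from 192.0.2.1 and 198.51.100.7 today',
    'blacklist: 192.0.2.1, 198.51.100.7, 203.0.113.5',
    'unrelated page mentioning 203.0.113.5',
);
print "$_ appears with the known source in $tally{$_} document(s)\n"
    for sort { $tally{$b} <=> $tally{$a} || $a cmp $b } keys %tally;
```

How many shared documents it takes before an address counts as guilty by association is exactly the kind of stat each person would tune however they see fit.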
-scott
Re:clarification and thanks
bart on 2004-11-14T23:28:04
"Okay, the code tags didn't do what I wanted... let's try pre!"
Yuck! Don't people have a "preview" button any more? Or read the help text under the textarea, when entering their post?
Try "<ecode>", it'll preserve your formatting.