HTML-only spam

dws on 2002-11-27T05:37:06

Looking around for low-overhead ways of catching spam (no point in going heavyweight if lightweight will do), I came across the following simple procmail trick:

:0
* ^Content-type:.*html
htmlspam

A quick check showed that nobody was sending HTML-only email that I cared about, so I gave it a try. It caught 50 spams in 2 days, or about 25% of my daily volume.

Next into the bit bucket: base64-encoded text/plain entities. That's a good excuse to play with MIME::Parser and MIME::Entity.
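
For anyone who wants to try the base64 idea before wiring up MIME::Parser, here is a rough sketch of the check in Python's standard email package (not the Perl modules mentioned above). The function name has_base64_plain_text is made up for illustration, and the assumption is that "base64-encoded text/plain" means any text/plain part whose Content-Transfer-Encoding header says base64.

```python
from email import message_from_string


def has_base64_plain_text(raw_message: str) -> bool:
    """Return True if any text/plain part declares base64 transfer encoding.

    Hypothetical helper for illustration; it only inspects headers and
    never decodes the body.
    """
    msg = message_from_string(raw_message)
    # walk() yields the message itself and then every nested MIME part.
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        encoding = part.get("Content-Transfer-Encoding", "")
        if encoding.strip().lower() == "base64":
            return True
    return False
```

Note that a part with no Content-type header at all is treated as text/plain here, which is the MIME default.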

Now I should have been cleaning my study...


SpamAssassin tests for this

merlyn on 2002-11-27T13:50:44

You can assign this test an arbitrary "likelihood of spam" value, and have it spam-bucket anything that arrives that way.
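
To make the scoring idea concrete, here is a toy sketch in Python. It is not SpamAssassin's code, rule names, or real scores: the two tests, their weights, the threshold, and the folder names are all invented for illustration. The point is only that each test contributes a weight, and mail goes to a folder you review rather than to /dev/null once the total crosses a threshold.

```python
from email import message_from_string
from email.message import Message


def html_only(msg: Message) -> bool:
    """True when the top-level body is text/html (no plain-text alternative)."""
    return msg.get_content_type() == "text/html"


def no_subject(msg: Message) -> bool:
    """True when the Subject header is missing or blank."""
    return not (msg.get("Subject") or "").strip()


# Hypothetical tests and weights, chosen only for this example.
TESTS = [(html_only, 2.0), (no_subject, 1.5)]
THRESHOLD = 3.0  # hypothetical cutoff


def spam_score(raw_message: str) -> float:
    """Sum the weights of every test that fires on the message."""
    msg = message_from_string(raw_message)
    return sum(weight for test, weight in TESTS if test(msg))


def folder_for(raw_message: str) -> str:
    """Pick a folder name; suspected spam is kept for review, not deleted."""
    return "spam-review" if spam_score(raw_message) >= THRESHOLD else "inbox"
```

With these made-up weights an HTML-only message alone stays under the threshold, so it takes a second hit to land in the review folder.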

And although you haven't had a false positive on such email, with the volume of email I get I've found that I can't just delete those: an occasional legitimate message gets sent that way. So I'm forced to check that folder about once every other day to see whether someone is asking me something important.

That's the problem with spam filters for me: I can't afford to ignore legitimate mail, because I run a small business.

Re:SpamAssassin tests for this

dws on 2002-11-27T17:07:36

I checked a 5+ year email archive, and found no HTML-only messages that weren't spam. But you're right, it could happen that a legitimate email arrives without a text/plain part.

Are you seeing particular MUAs that send HTML-only?
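
One way to answer that from an archive is to tally the mailer headers on HTML-only messages. Below is a minimal Python sketch, assuming the archive is in mbox format and that the MUA identifies itself in X-Mailer or User-Agent (many do, but neither header is required). The function names are made up for illustration.

```python
import mailbox
from collections import Counter


def html_only_mailers(messages) -> Counter:
    """Count mailer strings across messages whose top-level type is text/html."""
    counts = Counter()
    for msg in messages:
        if msg.get_content_type() != "text/html":
            continue
        # Either header may be absent; fall back to a placeholder.
        mailer = msg.get("X-Mailer") or msg.get("User-Agent") or "(unknown)"
        counts[mailer.strip()] += 1
    return counts


def mailers_in_mbox(path: str) -> Counter:
    """Run the tally over an existing mbox file (never creates one)."""
    return html_only_mailers(mailbox.mbox(path, create=False))
```

Calling mailers_in_mbox on an archive path and then most_common() on the result would list the worst offenders first.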