MIT Spam Conference

johnseq on 2004-02-05T01:27:10

I wasn't able to attend the Spam Conference over at MIT, but I did catch the webcast. I found most of it quite interesting. It was great to hear from Yahoo, Microsoft and Brightmail about the scaling issues (and opportunities) that come with having billions of spam attacks per day. They're all beginning to leverage Cloudmark's collaborative filtering to some degree, but all hit the same issue: what people consider spam varies quite a bit. The Microsoftie also noted that 60% of spam offers require a domestic presence (e.g. financial services). These cannot be off-shored and are therefore vulnerable to legal remedies. The rest (software, porn, Nigerian 419 scams, etc.) can operate from anywhere, and its share will probably go up as the laws are applied.

The first lawyer to present, last year's "Hi I'm Jon Praed, and I sue spammers", made these points:

  • Spam laws _could be_ a good start
  • Identity, jurisdiction required to pursue legal cases
  • CAN-SPAM is good because spammers have long argued that what they were doing was not against the law. That's no longer true (in the U.S. at least).
  • Important provision of CAN-SPAM attaches liability to businesses profiting from spam. He thinks this is greatly under-appreciated.

    The Tar Proxy talk wasn't all that interesting, but clearly making spammers pay (in CPU time at least) was very gratifying to the author.

    The Brightmail speaker mentioned that they're implementing Paul Graham's filters that fight back, sort of. He didn't use that phrase, but his company is following links in email to see what is on the other side, and using factors from that to determine spamminess. They can leverage this inspection over a huge user base, so they don't risk slashdotting innocent joe-job victims. One challenge in identifying spammers' URLs by domain alone is the number of open redirect scripts web-wide (rd.yahoo.com being the most often abused) that can disguise the ultimate destination of a spam offer. I winced when I realized that I've probably contributed two or three to the pool spammers can use. Also, I had a thought that 'Boy it'd be great if they shared the list of spam URLs' and shortly after he mentioned that they were considering some way of sharing.

    Eric Kidd spoke on real-world experience with sender-pays/e-postage in the camram project. Folks like me (and Matts) often dismiss the sender-pays idea, either because of joe-jobs launched from virus-compromised computers or because you typically have to upgrade the whole internet at once to make it work. None of these concerns was news to Eric, and his presentation did not sidestep them. His points were:
  • Sender-pays works great if you redesign the entire internet. Obviously not practical.
  • Hybrid sender-pays works well, with filtering s/w accommodating the metric for postage
  • What can be used as a stamp? This is a big issue that will likely evolve.
  • Money stamps don't work (centralization, theft, regulation)
  • Hash collision is very popular now, but memory-based problems are probably more appropriate than anything CPU based because of Moore's law, spammers building custom h/w, etc.
  • Whitelist someone who sends you stamped mail so future correspondence can be verified w/ signatures. "Strangers cost, friends fly free."
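
To make the "hash collision" stamp and the "strangers cost, friends fly free" rule concrete, here is a minimal sketch. It is not camram's actual stamp format: the `recipient:date:counter` layout, the function names, the choice of SHA-256 and the default difficulty are all assumptions made for illustration. It is also a CPU-bound puzzle, which is exactly the kind Eric suggested memory-based problems may eventually replace, and it whitelists by plain sender address where the talk described signatures.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint_stamp(recipient: str, date: str, bits: int = 12) -> str:
    """Sender side: burn CPU until the stamp hashes to `bits` leading zeros."""
    for counter in count():
        stamp = f"{recipient}:{date}:{counter}"
        digest = hashlib.sha256(stamp.encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return stamp


def verify_stamp(stamp: str, recipient: str, bits: int = 12) -> bool:
    """Receiver side: a single hash checks the sender's work."""
    if not stamp.startswith(recipient + ":"):
        return False  # stamp was minted for somebody else
    digest = hashlib.sha256(stamp.encode()).digest()
    return leading_zero_bits(digest) >= bits


def accept(sender, recipient, stamp, whitelist, bits=12):
    """Strangers need a valid stamp; whitelisted senders do not."""
    if sender in whitelist:
        return True
    if stamp is not None and verify_stamp(stamp, recipient, bits):
        whitelist.add(sender)  # hypothetical stand-in for signature-based trust
        return True
    return False
```

Minting costs the sender roughly 2^bits hash attempts on average, while verifying costs the receiver one, which is the asymmetry the scheme relies on. A real system would also need to check the date and remember spent stamps, which this sketch leaves out.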

    The best presentation was from Peter Kay of Titan Key - http://titankey.com/mit/ . The technology was an elegant combination of simple concepts, but I liked it most because the speaker (a Chicago-born Hawaiian transplant) was by far the most dynamic and convincing. He proselytized much more than he spoke, and the audience really bought it.

    He described his company's product called KeyMail. Instead of disposable email addresses (accountimostlyignore@hotmail.com), you have programmable addresses -- ones that auto-whitelist in various ways (based on domain, exact email address, etc.). So you give out an address like 'johnseq-UNIQUEID@mymail.com' to each person, company or mailing list that you want to correspond with. One typical rule would be that the email address or domain that first responds to the address is whitelisted for it. Mail sent to that address by anyone else would then be put in a challenge/response queue.
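
The "first responder gets whitelisted" rule is simple enough to sketch in a few lines. This is only my reading of the rule as described in the talk, not KeyMail's implementation; the class name `KeyedAddress`, the `match_domain` flag and the return values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KeyedAddress:
    """One address handed out to one correspondent (hypothetical model)."""
    address: str
    match_domain: bool = False          # whitelist the whole domain, or one sender
    allowed: set = field(default_factory=set)

    def _key(self, sender: str) -> str:
        sender = sender.lower()
        return sender.rsplit("@", 1)[-1] if self.match_domain else sender

    def decide(self, sender: str) -> str:
        key = self._key(sender)
        if not self.allowed:
            # First sender to use this address is whitelisted for it.
            self.allowed.add(key)
            return "accept"
        # Anyone else falls through to challenge/response.
        return "accept" if key in self.allowed else "challenge"


shop = KeyedAddress("johnseq-shop17@mymail.com", match_domain=True)
print(shop.decide("orders@shop.example"))    # first use of the address
print(shop.decide("news@shop.example"))      # same domain
print(shop.decide("deals@spammer.example"))  # address has leaked
```

With `match_domain=True` the second sender passes because it shares the first sender's domain, while the third lands in the challenge queue, which also tells you which correspondent leaked the address.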

    One key differentiator for KeyMail is that it's implemented at the SMTP/MTA level. The whitelisting rules implemented are simple enough that you can reject spam before it is delivered, saving a lot of CPU, bandwidth and disk space in the process.
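
Here is a rough sketch of why doing this at the SMTP level is cheap: the decision needs only the envelope sender and recipient, so the server can answer at RCPT TO time, before any message body is transferred. The table contents and the function name `rcpt_reply` are made up for illustration, and for brevity this version rejects non-whitelisted senders outright rather than routing them to a challenge queue.

```python
# Hypothetical rule table: keyed address -> envelope senders allowed to use it.
KEYED_ADDRESSES = {
    "johnseq-list42@mymail.com": {"announce@example.org"},
}


def rcpt_reply(mail_from: str, rcpt_to: str, table=KEYED_ADDRESSES):
    """Return the (code, text) an MTA could send in response to RCPT TO."""
    allowed = table.get(rcpt_to.lower())
    if allowed is None:
        return 550, "5.1.1 No such address"
    if mail_from.lower() in allowed:
        return 250, "2.1.5 OK"
    # Rejected before DATA, so the message body is never received or stored.
    return 550, "5.7.1 Sender not whitelisted for this address"


print(rcpt_reply("announce@example.org", "johnseq-list42@mymail.com"))
print(rcpt_reply("deals@spammer.example", "johnseq-list42@mymail.com"))
```

A set lookup per recipient is far less work than running a content filter over a delivered message, which is where the CPU, bandwidth and disk savings come from.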

    Peter mentioned that there's always a need for a general purpose email address (like generic sales addresses on a corporate web site), so filters don't really go away. But he brought up his Outlook inbox and said "Look at this. No filters. No spam. For the last year". I think that's a less challenging result if you're committed to C/R, but the neat thing is that he mostly wasn't. C/R is just used as the filter of last resort, and rarely at that.

    I see user retraining (having to pre-generate an email address for folks you meet on the street seems a drag) and ISP lock-in as the two biggest problems with KeyMail. For the latter, there are a couple of solutions. The rules seem simple enough that they could be as portable as mail filtering rules (http://www.cyrusoft.com/sieve/). Also, the IETF is working on making challenge/response interactions automated, so that you never feel that particular pain. Of course, if you had interoperable C/R, KeyMail's raison d'être might largely disappear.

    [Aside: I would love it if my email forwarding service pobox.com implemented this. I'm pretty locked into them anyway, and it doesn't bind me to an ISP.]

    In summary, from the KeyMail talk and the spam conference in general, I think two themes came through: any spam solution needs to interoperate painlessly with the situation we have today (duh), and no single solution will really solve the problem. The 'drug cocktail' metaphor was used more than once, and I think it is appropriate on more than one level.


    Sharing spam URLs...

    bart on 2004-02-05T10:45:36

    Also, I had a thought that 'Boy it'd be great if they shared the list of spam URLs' and shortly after he mentioned that they were considering some way of sharing.
    Hmm... I thought that was implied by something you quoted earlier:
    They can leverage this inspection over a huge user base, so they don't risk slashdotting innocent joe-job victims.
    "Spreading over a user base" implies some form of central repository to me.

    Re:Sharing spam URLs...

    johnseq on 2004-02-05T14:07:37

    It's a business issue, not a technical one. The service they're selling is valuable to a great degree because of this central repository, and they're probably trying to figure out how to share this aspect of it without losing its value.