Putting MIME::Lite on a diet

samtregar on 2005-12-03T22:35:58

There's nothing quite like some justified optimization to get the heart pumping. I've been working on a system which needs to produce multipart MIME messages very quickly. It's using MIME::Lite, which, it turns out, isn't actually all that lite. After solving the other bottlenecks in my code I was left with around 30% of my processing time locked up in MIME::Lite's code.

My first stop was MIME::Fast. It certainly is fast, but unfortunately it also has some large memory leaks. Running at full speed it was losing around 5MB per second! I spent a few hours confirming that the leaks are somewhere in the huge XS codebase of MIME::Fast and not in the underlying gmime library. I gave up and filed a bug with the author.

That left me back with MIME::Lite, so I decided to see if I could speed it up. After a few hours of work I've come up with a patch which offers a 50% speedup for my use-case (creating two-part messages from parts in memory). For the curious, here's my work-log:

  • Starting work => MIME::Lite is building messages at 750/s
  • Use direct hash access for Attrs => 900/s
  • Change how Attrs are stored, moved sub-attrs into a separate structure. This avoids the {''} access for the vastly more common case of attribs without sub-attribs. => 980/s
  • Did some work optimizing fields() which is a hot method. => 1046/s
  • Tightened-up fields_as_string a bit. Might be worth making the pretty-printing optional if the spec doesn't require it. => 1057/s
  • Started testing with more realistic part sizes, new base => 910/s
  • Inlined the popular routine known_types() => 930/s
  • Played with speeding up IO_* to little effect. Removing the wrap() indirection entirely would no doubt help, but it's a big project.
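
The "direct hash access" step in the work-log can be illustrated with a small, self-contained comparison. This is only a sketch: Toy::Message and its attr() method are hypothetical stand-ins, not MIME::Lite's real internals, and the numbers it prints will vary by machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for a message object (NOT MIME::Lite's real layout).
package Toy::Message;

sub new {
    my ($class) = @_;
    return bless { Attrs => { 'content-type' => 'text/plain' } }, $class;
}

# Accessor: pays for a method dispatch, argument unpacking and lc() per call.
sub attr {
    my ($self, $name) = @_;
    return $self->{Attrs}{ lc $name };
}

package main;

my $msg = Toy::Message->new;

# Run each variant for roughly one CPU second and print a comparison table.
cmpthese(-1, {
    accessor => sub { my $v = $msg->attr('content-type') },
    direct   => sub { my $v = $msg->{Attrs}{'content-type'} },
});
```

The trade-off is the usual one: reaching into the hash skips the call overhead in a hot path, at the cost of coupling the caller to the object's layout.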

I sent the patch to the MIME::Lite maintainer, so hopefully this code will be available to all someday soon.

-sam


I'd be interested in that patch

TeeJay on 2005-12-05T11:08:41

Hi Sam,

I'd be interested in that patch as at my previous job we had a system that sent thousands of big multipart MIME emails at a time and that could improve things greatly.

I'd also possibly use it in my new job too.

Re:I'd be interested in that patch

samtregar on 2005-12-05T19:21:34

Be my guest: http://sam.tregar.com/mime_liter.diff

-sam

Maintaining MIME::Lite

bart on 2005-12-07T06:32:50

As plans are that I'm taking over maintenance of MIME::Lite, I'm taking a special interest in what you have done. I've received a more recent developer version of MIME::Lite from the current/old maintainer some time ago. I'll have to see how these things merge.

First of all, as I'd like to duplicate your benchmarks, I wonder how exactly you measured things? What exactly did you test?

Second, I have my doubts on how $sub_attrs->{'content-disposition'}{'filename'} would ever be faster than $sub_attrs->{'content-disposition.filename'} — but maybe there's more to it than I see at first look. I'd like to understand your rationale for that change.

Re:Maintaining MIME::Lite

samtregar on 2005-12-07T07:41:51

1) I'll send you the benchmarks I sent to the current maintainer. I used Benchmark.pm, of course.
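
For readers who want to try something similar, a measurement along these lines might look like the sketch below. It is not the benchmark that was actually sent: the addresses, part contents, part sizes and iteration count are all placeholder assumptions, and MIME::Lite must be installed from CPAN.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);
use MIME::Lite;    # CPAN module, not part of core Perl

# Placeholder in-memory parts; the sizes are assumptions, not the original data.
my $text = "Hello, world.\n" x 50;
my $html = "<p>Hello, world.</p>\n" x 50;

# Build one two-part message from parts in memory and serialize it.
sub build_message {
    my $msg = MIME::Lite->new(
        From    => 'sender@example.com',
        To      => 'recipient@example.com',
        Subject => 'Benchmark message',
        Type    => 'multipart/alternative',
    );
    $msg->attach(Type => 'text/plain', Data => $text);
    $msg->attach(Type => 'text/html',  Data => $html);
    return $msg->as_string;
}

# Time a fixed number of builds; timestr() includes the rate per second.
my $count = 2000;
my $t     = timeit($count, \&build_message);
print "$count messages: ", timestr($t), "\n";
```

Running the same script against the patched and unpatched module gives two rates that can be compared directly.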

2) The code needs to be able to loop through all the sub-attrs for a given attr in fields(). That would be rather complicated with the scheme you suggest, although it's possible it would be faster (and there's only one way to find out!). However, this isn't a change I made: the old code looked up every single attribute like $attrs->{$attr}{$sub}, where $sub was '' for the "main" attribute. My change removed this for "main" attributes, which are by far the most common case, and left the sub-attribute storage scheme as-is, albeit moved to a separate structure.
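
A rough sketch of the two storage layouts described above, for illustration only. The variable names, sample headers and the field_old/field_new helpers are hypothetical and are not taken from the patch.

```perl
use strict;
use warnings;

# Old layout as described: every attribute is a hash of sub-attributes,
# with the main value stored under the empty-string key.
my %old = (
    'content-type'        => { '' => 'text/plain', 'charset'  => 'us-ascii' },
    'content-disposition' => { '' => 'attachment', 'filename' => 'a.txt' },
    'content-length'      => { '' => 42 },
);

# Split layout: main values are plain scalars; sub-attributes live in a
# separate structure that is only consulted when an attribute has any.
my %attrs = (
    'content-type'        => 'text/plain',
    'content-disposition' => 'attachment',
    'content-length'      => 42,
);
my %sub_attrs = (
    'content-type'        => { 'charset'  => 'us-ascii' },
    'content-disposition' => { 'filename' => 'a.txt' },
);

# Old layout: always a two-level lookup, even with no sub-attributes.
sub field_old {
    my ($old, $name) = @_;
    my $subs   = $old->{$name};
    my @params = map { qq{$_="$subs->{$_}"} } grep { $_ ne '' } sort keys %$subs;
    return join '; ', $subs->{''}, @params;
}

# Split layout: the common case is a single hash lookup.
sub field_new {
    my ($attrs, $sub_attrs, $name) = @_;
    my $value = $attrs->{$name};
    my $subs  = $sub_attrs->{$name} or return $value;
    return join '; ', $value, map { qq{$_="$subs->{$_}"} } sort keys %$subs;
}

print field_new(\%attrs, \%sub_attrs, $_), "\n" for sort keys %attrs;
```

With flat keys such as 'content-disposition.filename', collecting every sub-attribute of one attribute would mean scanning all keys for a matching prefix, which is the complication mentioned for fields().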

-sam

Re:Maintaining MIME::Lite

bart on 2005-12-07T21:37:26

RE #2: You're right, judging by your explanation, the original looks very ugly indeed. I hadn't caught that in the casual reading of your patch.