I mumbled something about Email::Folder hating me today, but I was too busy to explain, so I promised to write down my annoyances later. I'd love to fix these problems soon, but for now it's easier to just grumble about them, and it will make me feel better.
To print all threads in a maildir, very naively, I might write something like this:
my $maildir = Email::Folder->new('./Maildir/');

while (my $email = $maildir->next_message) {
  my $subject = $email->header('subject');
  next if $subject =~ /^re:/i;
  print "$subject\n";
}
Great! There are all the non-reply subjects, more or less. They're not in order, though, and I want to see them in order. Email::Folder's iterator is not ordered, and there is no uniform way to request that it be ordered. To get messages in order, we'll need to get them all and then sort. That's not such a bad obstacle, really.
my $maildir = Email::Folder->new('./Maildir/');

# the sort isn't interesting
my @emails = sort { ... } $maildir->messages;

for my $email (@emails) {
  my $subject = $email->header('subject');
  next if $subject =~ /^re:/i;
  print "$subject\n";
}
The problem is that we've now loaded every email at once. They're loaded as Email::Simple objects, which means the entire content of every message is in memory at the same time, so if I had a huge maildir, I now have a huge perl process.
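One way to sidestep the memory problem, as a rough sketch rather than anything Email::Folder offers itself: iterate with next_message and keep only a small summary of each message, so the full objects never pile up. The name thread_starters and the summary hash layout are made up for illustration, and the sort is still left to the caller.

```perl
use strict;
use warnings;

# Walk a folder one message at a time and keep only a small summary of each
# non-reply. $folder is anything with a next_message method whose messages
# answer ->header($name), such as an Email::Folder yielding Email::Simple.
sub thread_starters {
    my ($folder) = @_;

    my @summaries;
    while (my $email = $folder->next_message) {
        # Scalar assignment: header() in list context returns every value.
        my $subject = $email->header('subject');
        $subject = '' unless defined $subject;
        next if $subject =~ /^re:/i;

        my $date = $email->header('date');
        push @summaries, { subject => $subject, date => $date };
        # $email goes out of scope here, so the full message can be freed.
    }

    return @summaries;
}
```

Called as thread_starters(Email::Folder->new('./Maildir/')), this returns one small hash per thread starter, which can then be sorted on whatever the date header parses to. Each message is still read and copied as a string on the way in; this only avoids holding them all at once.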
Email::Folder provides a bless_message method, which is used to create the Email::Simple objects. Each time the Email::Folder object's next_message method is called, the Email::Folder::Reader (subclassed for the storage medium) gets the message content from the underlying storage and returns it as a string. Email::Folder then passes it to bless_message, which by default passes it to Email::Simple. It's being passed around as a string, meaning that we're copying the full text of each (possibly huge) message a few times before returning the object and throwing away the raw string.
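To make the hook concrete, here is a minimal sketch of a subclass that overrides bless_message and hands back the raw string untouched, skipping Email::Simple entirely. The class name My::RawFolder is invented for this example; the method takes the invocant and the raw message text.

```perl
package My::RawFolder;    # hypothetical name, for illustration only
use strict;
use warnings;
use parent 'Email::Folder';

# Called with each raw message string the reader produces.
# Returning it unchanged means next_message yields plain strings.
sub bless_message {
    my ($self, $message) = @_;
    return $message;
}

1;
```

This doesn't avoid the string being built in the first place; it only shows where a replacement for the default Email::Simple construction would plug in.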
It would be easy to make the Maildir reader return filehandles, but bless_message also needs to be replaced to handle them. Then the problem is that if you try to do this:
my $folder = Email::Folder::MessagesFromFH->new('mbox');
...you will be hosed, because you will get an Email::Folder::Mbox, which reads messages out as strings. You need to either write a bless_message that handles strings and filehandles, or you need to override new to refuse any folder that won't use the right reader.
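The first option might look something like the sketch below. It is only a guess at the shape: the class name comes from the example above, but the body is an assumption, as is the idea that a filehandle-returning reader would pass an open handle positioned at the start of the message. It also assumes headers are all that's wanted, so it stops reading at the end of the header block.

```perl
package Email::Folder::MessagesFromFH;    # body is a hypothetical sketch
use strict;
use warnings;
use parent 'Email::Folder';

use Email::Simple;
use Scalar::Util qw(openhandle);

# Accept either a raw message string (what the stock readers return) or an
# open filehandle (what a filehandle-returning reader would presumably give).
sub bless_message {
    my ($self, $message) = @_;

    if (ref $message and openhandle($message)) {
        # Assumption: only headers are needed, so read through the blank
        # line that ends the header block and leave the body unread.
        my $head = '';
        while (defined(my $line = readline $message)) {
            $head .= $line;
            last if $line =~ /\A\r?\n\z/;
        }
        $message = $head;
    }

    return Email::Simple->new($message);
}

1;
```

With this, an mbox folder that still hands over strings keeps working, while a reader that hands over filehandles never pulls a large body into memory. The cost is that messages built from a filehandle have an empty body, which is fine for an frm-like tool and wrong for most other uses.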
All I wanted to do was implement a cooler version of frm!
Hopefully I will wake up fresh in the morning and feel energized to actually do something constructive, rather than just whine.
Re:Dude...
rjbs on 2007-11-28T02:14:44
My example was *radically* simplified for the point of demonstrating the headache, it was not the entire program I wanted to produce.
Re:Dude...
educated_foo on 2007-11-28T02:34:08
Sorry, I was just guessing from what you posted and your description of a "better 'frm'." But still, so long as you only care about headers, it seems like Email::Foo would be more trouble than it's worth.
Re:Dude...
rjbs on 2007-11-28T03:22:47
You are wrong.
I don't want to match header-like content in bodies, or in the headers of subparts. I need to match wrapped headers. I will need to decode MIME-encoded headers. I will need to parse RFC822 date fields.
Email isn't simple.
Re:Dude...
dug on 2007-11-28T04:29:24
Email isn't simple.
That said, I just today had to install some code that used your helpful modules in order to make it more simple. Thanks a whole bunch. You make email easier.
-- Douglas
Re:Dude...
rjbs on 2007-11-28T12:19:39
Thanks! While I am more a maintainer than an author on many or most of the email modules under my name on the CPAN, knowing that they save people work is a nice motivator to keep doing my own work on them.
Re:Dude...
educated_foo on 2007-11-29T04:46:11
Just trying to make a helpful suggestion, not question your intelligence or piss in your oatmeal. Ah, well...
Re:Dude...
rjbs on 2007-11-29T11:41:14
I'm sorry, I don't mean to come off crabby, but "Can't You Just?" is a common refrain around email programming, and basically always leads to horrible problems due to the mistaken belief that email is just some headers and maybe a body. I have grown bitter and grumpy whenever someone says to use something non-email-specific to do email stuff.
Maybe this is my brain's way of telling me that I'm done with email and should move on to something that's always fun, like the web.
Re:Dude...
educated_foo on 2007-11-30T02:42:00
My bad. I've spent a bit of time with email (but obviously not as much as you have), which probably made me overconfident. Plus, for personal projects like this, I have a strong bias toward 80% or 90% solutions.
Anyways, enjoy that always-fun web. (X|HT)ML seems just the thing to make email look simple.