Non-blocking IMAP

Matts on 2006-08-30T17:42:39

In thinking about things to ship with AxKit2 I thought it would be cool to have a webmail that doesn't suck written in Perl. You know - drag and drop with AJAX, HTML editor, decent speed, etc.

Of course all the perl IMAP libraries are blocking - they don't drop in well to a non-blocking framework. So I set out to write a non-blocking library for IMAP.

Turns out it was easier than I hoped. Net::IMAP::Simple (which is probably "Simple" because it doesn't support much of IMAP, but I'll probably figure that out in a few weeks) is mostly really well written, and was fairly easy to extend. I had to re-write quite a few of the methods, though some of the changes can probably be incorporated into the base class as they make for a nice improvement to the design.

Result is I can now do this:

use Net::IMAP::Simple::NB;

Danga::Socket->AddTimer(0, sub {
    # Create the object
    my $imap = Net::IMAP::Simple::NB->new('felony:143') ||
       die "Unable to connect to IMAP: $Net::IMAP::Simple::errstr\n";
    
    print "IMAP: $imap\n";
    
    $imap->login('matt','xxxxxxx', sub {
        my $login_ok = shift;
        if ($login_ok) {
            print "Login OK\n";
            $imap->select("INBOX", sub {
                my $nm = shift;
                print "Got $nm Messages\n";
                my $i = 1;
                my $sub;
                $sub = sub {
                    my $headers = shift;
                    print grep { /^Subject:/ } @$headers;
                    $i++;
                    $imap->top($i, $sub) unless $i == $nm;
                };
                $imap->top($i, $sub);
            });
        }
    });
});

Danga::Socket->EventLoop;

Where the "sub" callbacks are called when the Net::IMAP::Simple methods return with their responses, and in the meantime the event loop is not blocked.

One of the problems with building IMAP based webmail is you don't want to be re-establishing connections to the IMAP server on every hit. AxKit2 can get around that problem by just keeping the connection open and hooking the request up to the right connection using a cookie/session-id.

Lots of work still to do, but now it's just a small matter of programming.

closure-callback model rocks

jmason on 2006-08-30T18:04:46

that use of sub { } closures to handle event-driven programming rocks so hard -- I used it a lot in SA 3.2 in the Mail::SpamAssassin::AsyncLoop class, too.

I'd hate to have to go back to the old model of event-driven programming, with objects to hold all state between events, etc.

Re:closure-callback model rocks

Matts on 2006-08-31T05:31:22
Closures are such an incredibly powerful programming paradigm, especially for this kind of app - it's hard to believe that some languages get by without them.

What's really cool is that closures in perl just do what you expect them to do. The hard thing to do is explain closures and make them look like they might not be what's expected - because when you look at the syntax you just think "yep, that's normal".

Leaky?

runrig on 2006-08-31T23:24:03

Don't those kinds of closures leak (anonymous recursive subs, that is), unless maybe you use maybe Scalar::Util::weaken or one of those combinator -type thingies (or some Devel::* module I think or something else)? Or do you really even care in this instance? :-)

Re:Leaky?

Matts on 2006-09-01T16:09:08
I don't think they leak - there were some bugs with eval blocks and anonymous subs, but I think all those leaks have been closed now.

This design is used a lot by IO::AIO (which is used in some people's production apps) and thus if it leaked we would find out about it darn fast.

Re:Leaky?

runrig on 2006-09-01T18:37:57
I thought it happened whenever you had the my $sub; $sub = sub { ...$sub... }; construct, and that whenever that code got called, the reference count for $sub would never go to zero. Is what you're doing any different than this (leaks in 5.8), or did this sort of thing recently get fixed?:

my $foo = sub { my $sub; $sub = sub { $sub; }; }; my $i; print "Start: "; <STDIN>; while (++$i) { my $f = $foo->(); unless ( $i % 100_000 ) { print "$i: "; <STDIN>; } }

Re:Leaky?

Matts on 2006-09-01T18:51:55
That does indeed seem to leak, but it's fixed by adding weaken($sub) inside the anonymous sub, so that's no big deal to me. Thanks for the heads up though.

Re:Leaky?

runrig on 2006-09-01T23:57:00
I know weaken() was one of the options, but I'm trying to become a functional weenie, so I'm starting to like the combinator approach (this example ripped and tweaked from the Moose source):

sub U { my $f = shift; sub { $f->($f, @_) }; } sub Y { my $f = shift; U(sub { my $h = shift; sub { $f->(U($h)->())->(@_) } })->(); } # Now we need a recursive anonymous function my $fact = Y(sub { my $f = shift; sub { my $n = shift; return 1 if $n < 2; return $n * $f->($n-1); } }); print $fact->(5), "\n";

Re:Leaky?

Matts on 2006-09-02T01:03:42
Yeah, it's just a shame that code is so cryptic. No wonder perl gets a bad rap (though I guess it's copied from some lamda calculus stuff). You'd think they could use more than single character variables and function names for it all.

Re:Leaky?

runrig on 2006-09-02T07:41:08

I agree that the Y() is cryptic (and you can't blame perl too much), but the U() is not that hard to understand, and if they're tucked away in a library (as they are in Moose::Autobox albeit in an OO form), then you don't have to look at them, and they're pretty simple to use. In other words, "don't look behind the curtain, just use the Y-combinator" :-)

If you have freedom in what args your recursive sub accepts (which I assume you don't in the above instance), then you can let the first arg be the sub itself, and just use the U() by itself:

my $fact = U( sub { my ($f,$n) = @_; return 1 if $n < 2; return $n*$f->($f,$n-1); } ); print $fact->(5),"\n";

Or my favorite:

sub Y { goto $_[0] } my $fact = sub { my ($f,$n) = @_; return 1 if $n < 2; return $n*$f->($f,$n-1); }; print Y($fact, 5),"\n";

Y combinator

Aristotle on 2006-09-03T23:34:22

It’s the construct that’s cryptic, not the Perl code. It’s no easier to understand in Lisp, although the core concept is actually very simple. The basic idea is that you make an anonymous function recursive by making it take a reference to itself as a parameter, so that it can call that to recurse on itself.

The actual implementation requires doing a series of code sit-ups that makes the fully built combinator very hard to follow. I think I’ll have to rewrite it for myself in Perl rather than trying to follow along in one of the Scheme-based explanations, in order to actually grok it fully.

Re:Y combinator

Matts on 2006-09-03T23:51:46
I think decently named params would make it understandable.

Re:Y combinator

Aristotle on 2006-09-04T16:25:05

I would usually agree but in this case I don’t see how. Using $func or $closure in place of $f wouldn’t do anything to clarify any of what’s going on, and I can’t think of any good name of $h. The functions are no easier to name more sensibly. It would help the legibility of the Y combinator to break it down a bit and assign subexpressions to suitably named variables, but the difference wouldn’t be anything to write home about.

These are simply near-unreadable constructs. And it doesn’t really matter, because:

Once they’re written, you can use them without having to remember exactly how they work. You just memorise how to appropriately curry your recursive closures, which is pretty trivial, and just use it.

If you want to make sure you’re not just cargo-culting, you can pull up an explanation of how to build up the combinator and derive it yourself.

In practice these things just don’t need to be readable once you get one working; you find out what the Y combinator is for and how it works, then memorise how to use it and stop sweating the details.

Re:Y combinator

Matts on 2006-09-04T16:39:10
So here's where I'm coming from: The Y combinator is useless in the code I showed, because $sub isn't called explicitly - it's a callback. So having the Y combinator re-written to be more readable, with better variable names, and comments, might allow me to write a version specifically for my code that doesn't require the use of weaken().
Re:Y combinator

chromatic on 2006-09-04T18:15:00

In practice these things just don’t need to be readable once you get one working;

That sums up my frustrations with higher order mathematics pretty effectively. I strongly suspect that the Haskell compiler rejects identifiers with more than one vowel or over eight characters, but I can't prove it, for example.

Re:Y combinator

Aristotle on 2006-09-04T19:44:57

I suppose. I can’t read the Y combinator either, y’know. I can derive it when I start from scratch, though, and for as long as that model is in my head, I can reason about it.

I just can’t build it inductively from reading the source/expression. I have to deduce it every time from scratch.

It may be a matter of practice.

Re:Leaky?

Matts on 2006-09-04T19:45:25
So here's a version that doesn't leak, that uses this approach, and isn't so darn cryptic:

# wrap an anonymous function into a function that takes $func as the first # parameter so we can make use of it from within the call. sub wrap_func { my $func = shift; return sub { $func->($func, @_) }; } ... # sometime later... $imap->message_info($i, wrap_func(sub { my $sub = shift; my $flags = shift; my $headers = shift; print "Date: $flags->{DATE}\n"; print "Flags: " . join(', ', keys %$flags) . "\n"; print grep { /^Subject:/ } @$headers; $i++; $imap->message_info($i, wrap_func($sub)) unless $i == $nm; }));

Re:Leaky?

Aristotle on 2006-09-04T21:26:44

Well yes, you get away without the Y combinator here. wrap_func is a rather nondescript name, though, don’t you agree? A better one might be something like curry_self or apply_self_maker or some such, although it’s probably not obvious to someone without basic knowledge of functional programming why these names are in fact better. All of them sound similarly meaningless, though the better ones might even be a little scary. And to someone with functional programming knowledge, U would be just as descriptive anyway; in fact, to such a programmer your choice of name is much worse than using U. So it’s not clear to me that you can gain actual clarity by “better” naming.

Note that you get away with the U combinator here, which is self-explanatory anyway. Try doing the same with the Y combinator and you’ll have a lot less luck.

If it’s not clear yet, I happen to think that understanding the Y combinator (what it’s for and how it’s derived) is an important part of any good programmer’s education. For a demonstration of why, just take note that your recursive function is broken. To fix it, change the last line:

$imap->message_info($i, $sub) unless $i == $nm; # no wrap_func

Applying the U combinator to itself does not yield the U combinator. So if you call wrap_function on the return of wrap_function, you will get an ever deeper onion of closures that apply the previous wrapper to itself plus their arguments, accumulating wrappers going down the layers, until finally the last wrapper applies the original function to itself plus all the wrappers plus the passed parameters. Ie. after four applications you’d have a wrapped wrapped wrapped wrapper equivalent to this:

sub { my $func = shift; my $f_inner1 = sub { $func->($func, @_) }; my $f_inner2 = sub { $f_inner1->($f_inner1, @_) }; my $f_inner3 = sub { $f_inner2->($f_inner2, @_) }; return sub { $f_inner3->($f_inner3, @_) }; }

That’s obviously not going to work.

Re:Leaky?

Matts on 2006-09-04T21:34:31
Your "fix" doesn't work.

wrap_func() took a function, and returned a new function that took the given $func as the first argument. Therefore, in my callback $sub is not "wrapped", so needs to be re-wrapped.

Re:Leaky?

Aristotle on 2006-09-04T22:01:22

Oh d’uh. I got confused between properties of the Y and U combinators.

Re:Leaky?

Matts on 2006-09-04T22:07:54
I'm shocked. How could you confuse the two!

Re:Leaky?

Aristotle on 2006-09-04T22:23:33

Heh. I’d read about the Y combinator before but I only started studying it in anger yesterday because I realized I wouldn’t actually understand it on any deeper level if I didn’t sit down to derive it from first principles for myself.

My grasp is strengthening by the hour, but you can see I’m not quite there yet. :-)