Job queues

kappa on 2010-03-11T22:45:46

I need a persistent reliable distributed job queue with a good Perl interface.

TheSchwartz would be great but it looks dead. Brad Fitzpatrick is in the valhalla of Google, the last release was in 2008, and CPAN Testers reports more failures than successes.

Gearman is not reliable. It won't retry jobs on another worker, nor can workers return failed jobs.

Resque is too Ruby-specific.

beanstalkd looks almost fine, but it does not distinguish between job types. You cannot easily register one worker for sending emails and another worker for converting images.

Should I start looking into various Java-based monstrosities like ActiveMQ?


Nope

miyagawa on 2010-03-12T01:07:11

TheSchwartz isn't dead, it's just very stable. We use it heavily every day on our production sites such as typepad.com. We own the maintenance bit on that module, so we can ship a release if there are critical bugs etc., and we haven't heard of any yet.

Gearman can be made reliable if you use Brian's C Gearman server, and you can still use the Gearman Perl binding: http://gearman.org/

Re:Nope

chorny on 2010-03-12T09:03:54

TheSchwartz: CPAN Testers PASS (2) FAIL (136)

Patch was available in 2008: https://rt.cpan.org/Public/Bug/Display.html?id=38570

Re:Nope

miyagawa on 2010-03-12T09:14:51

Yes, that's a shame and I will try to upload the new version soon.

But as you can see from the patches, that's not a code fix, just a bad test. It's not a sign that the module is dead or unused.

Re:Nope

miyagawa on 2010-03-15T21:19:25

It was fixed in the SVN repo more than a year ago, but no CPAN release had been made. That has now been done, a few minutes ago.

Re:Nope

kappa on 2010-03-12T10:05:30

Thanks for the information!

The C Gearman server does not retry jobs if a worker returns failure.

But it looks like we can trick it into retrying by dropping the connection from the worker.

Re:Nope

miyagawa on 2010-03-12T13:09:36

> C Gearman server does not retry jobs if a worker returns failure.

Because returning a failure is not an error :)

RabbitMQ

notbenh on 2010-03-12T22:20:21

I have no experience with it, but I've considered it a few times. If I remember correctly, it speaks HTTP/JSON like CouchDB, so accessing it from Perl (or anything) is easy.

Beanstalk

gurunandan on 2010-03-13T18:55:08

If you need separate workers, do look at "tubes" in Beanstalk. I have several workers, each watching a separate tube. For example, email jobs go into the email tube and image conversion jobs into the imagecon tube.
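To make the tube idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not a beanstalkd client: the TubeQueue class and its put/reserve methods are hypothetical names invented for illustration, and only the tube names (email, imagecon) come from the setup described above.

```python
from collections import defaultdict, deque


class TubeQueue:
    """Toy in-memory stand-in for a queue server with named tubes.

    Hypothetical helper for illustration only; it does not talk to beanstalkd.
    """

    def __init__(self):
        # One FIFO per tube name, created on first use.
        self._tubes = defaultdict(deque)

    def put(self, tube, body):
        """Producer side: append a job body to the named tube."""
        self._tubes[tube].append(body)

    def reserve(self, watched):
        """Worker side: take the next job from the first non-empty watched tube.

        Returns (tube, body), or None when every watched tube is empty.
        """
        for tube in watched:
            if self._tubes[tube]:
                return tube, self._tubes[tube].popleft()
        return None


queue = TubeQueue()
queue.put("email", "welcome message for alice")
queue.put("imagecon", "resize photo.png")
queue.put("email", "receipt for bob")

# The email worker watches only the email tube, so it never sees image jobs.
email_jobs = []
while (job := queue.reserve(["email"])) is not None:
    email_jobs.append(job[1])

# The image worker watches only the imagecon tube.
image_job = queue.reserve(["imagecon"])
```

With a real beanstalkd server the same split is expressed through the protocol's use command on the producer side and watch on the worker side, with each worker running as its own process.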

Another solution is to start a general-purpose worker and pass it the script that you need to run as a serialized data structure. The worker can then fork and run whatever script is passed to it, whether an email sender or an image converter.
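A rough sketch of that general-purpose worker, again in Python. Note the swap: instead of forking and running an arbitrary script, this version looks the task name up in a table of known handler functions, which keeps the example short and avoids executing whatever arrives on the queue. The JSON layout (task, args) and the function names are assumptions made up for illustration.

```python
import json


def send_email(to, subject):
    """Placeholder handler; a real one would actually send the message."""
    return f"email to {to}: {subject}"


def convert_image(path, fmt):
    """Placeholder handler; a real one would actually convert the file."""
    return f"converted {path} to {fmt}"


# Hypothetical dispatch table: task name in the job -> function to call.
HANDLERS = {
    "send_email": send_email,
    "convert_image": convert_image,
}


def run_job(payload):
    """Deserialize one job and dispatch it to the matching handler."""
    job = json.loads(payload)
    handler = HANDLERS.get(job["task"])
    if handler is None:
        raise ValueError(f"unknown task: {job['task']}")
    return handler(**job["args"])


# Producer side: serialize the job description before putting it on the queue.
payload = json.dumps(
    {"task": "convert_image", "args": {"path": "photo.png", "fmt": "jpg"}}
)

# Worker side: a single worker can handle any registered task type.
result = run_job(payload)
```

The trade-off against separate tubes is that every worker must carry the code for every job type, but you only have one kind of process to deploy and scale.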