A standard logging API

jonswar on 2007-09-06T21:52:13

It seems as if every CPAN module has its own way of logging debug information and error conditions. For example:

  • LWP - activate by use'ing LWP::Debug; outputs to STDERR
  • DBI - activate by calling DBI->trace(); outputs to STDERR or a file
  • Rose::DB - activate by setting various $Debug package variables; outputs to STDERR
  • Encode::* - activate by modifying various DEBUG subroutines to return 1; outputs using warn()
  • Apache::* - activate by setting the Apache log level and restarting; outputs to the Apache logs
In addition, there must be CPAN modules that have interesting things to say but choose not to log at all, because they don't want to invent another logging mechanism or become dependent on an existing one.

This situation is pretty much the opposite of what I want when developing a large application. I want a single way to turn logging on and off, and to control where logs get sent, for all of the modules I'm using.

This being Perl, there are many fine logging frameworks available: Log::Log4perl, Log::Dispatch, Log::Handler, Log::Agent, Log::Trivial, etc. So why do CPAN modules eschew the use of these and invent their own mechanisms that are almost guaranteed to be less powerful?

  • The very existence of so many logging modules means that there is no one standard that a CPAN author would feel comfortable binding their users to. As usual, TMTOWTDI is a double-edged sword.

  • A logging framework can be a significant dependency for a module to have, easily dwarfing the size of the module itself. For small modules that want to minimize dependencies, depending on Log4perl (for example) is a non-starter.
A Common Log API

One thing to notice is that while the logging frameworks all differ in their configuration and activation API, and the set of features they support, the API to log messages is generally quite simple. At its core it consists of

  • A set of valid log levels, e.g. debug, info, warn, error, fatal
  • Methods to log a message at a particular level, e.g. $log->debug()
  • Methods to determine if a particular level is activated, e.g. $log->is_debug()
I expect most CPAN modules would happily stick to this API, and let the application worry about configuring what's getting logged and where it's going. Therefore...

Proposed Module: Log::Any

I propose a small module called Log::Any that provides this API, with no dependencies and no logging implementation of its own. Log::Any would be designed to be linked by the main application to an existing logging framework.

A CPAN module would use it like this:

    package Foo;
    use Log::Any;
    my $log = Log::Any->get_logger(category => __PACKAGE__);

    $log->error("an error occurred");

    $log->debug("arguments are: " . Dumper(\@_))
        if $log->is_debug();
By default, methods like $log->debug would be no-ops, and methods like $log->is_debug() would return false.

As a convenient shorthand, you can use

    package Foo;
    use Log::Any qw($log);
to create the logger, which is equivalent to the first example except that $log is (necessarily) a package-scoped rather than lexical variable.

How does an application activate logging? The low-level way is to call Log::Any->set_logger_factory (better name pending) with a single argument: a subroutine that takes a log category and returns a logger object implementing the standard logging API above. The log category is typically the class doing the logging, and it may be ignored.

For example, to link with Log::Log4perl:

    use Log::Any;
    use Log::Log4perl;

    Log::Log4perl->init("log.conf");
    Log::Any->set_logger_factory
       (sub { Log::Log4perl->get_logger(@_) });
To link with Log::Dispatch, with all categories going to the screen:

    use Log::Any;
    use Log::Dispatch;

    my $dispatcher = Log::Dispatch::Screen->new(...);
    Log::Any->set_logger_factory(sub { $dispatcher });
To link with Log::Dispatch, with different categories going to different dispatchers:

    use Log::Any;
    use Log::Dispatch;

    my $dispatcher_screen = Log::Dispatch::Screen->new(...);
    my $dispatcher_file   = Log::Dispatch::File->new(...);

    sub choose_dispatcher {
        my $category = shift;
        $category =~ /DBI|LWP/ ? $dispatcher_file : $dispatcher_screen;
    }
    Log::Any->set_logger_factory(\&choose_dispatcher);
This API is a little awkward for the average user. One solution is for logging frameworks themselves to provide more convenient mixins, e.g.:

   use Log::Dispatch;   # this also defines Log::Any::use_log_dispatch
   my $d = Log::Dispatch::File->new(...);
   Log::Any->use_log_dispatch($d);  # calls set_logger_factory for you

   use Log::Log4perl;   # this also defines Log::Any::use_log4perl
   Log::Any->use_log4perl();        # calls set_logger_factory for you
set_logger_factory would be implemented so as to take effect on all existing as well as future loggers. Any $log objects already created inside modules will automatically be switched when set_logger_factory is called. (i.e. $log will probably be a thin proxy object.) This means that Log::Any need not be initialized by the time it is used in CPAN modules, and it allows set_logger_factory to be called more than once per application.

Promoting Use

For Log::Any to be useful, a substantial number of modules - especially major modules - would have to adopt its use. Fortunately, with its minimal footprint and standalone nature, authors should not find Log::Any a difficult dependency to add. Existing logging mechanisms, such as LWP::Debug and $DBI::tfh, could easily be converted to write *both* to their existing output streams and to Log::Any. This would preserve backward compatibility for existing applications, but allow new applications to benefit from more powerful logging. I would be willing to submit such patches to major module authors to get things going.

Feedback welcome. Thanks!


I'm not a fan of the syntax...

Alias on 2007-09-07T01:06:33

I'm not sure why, but I think the API lacks a certain elegance.

Personally I'd have done the name'ology differently (although tbh I'm not sure what I'd have used instead).

Re:I'm not a fan of the syntax...

Alias on 2007-09-07T01:11:26

Upon re-reading I think two things I'm not a fan of are the "Any" suffix, and the use of package-named class-level storage.

What if I want to make an logger for a single object...

Re:I'm not a fan of the syntax...

jonswar on 2007-09-07T04:23:27

As far as the package-named class-level storage: the "use Log::Any qw($log)" syntax is just a convenient shorthand that, imho, will be good enough for some modules. If you are uncomfortable with this, you can use the more standard syntax

        my $log = Log::Any->get_logger(category => '...');

and you would use this to create a lexically scoped logger or a logger contained in an object.

As far as the "Any" name - I agree, it isn't ideal, but I haven't been able to think of a better alternative. The problem is that most log frameworks already have a generic sounding name - e.g. Log::Dispatch - so somehow I have to name this in a way that distinguishes it from yet another logging framework.

Re:I'm not a fan of the syntax...

Alias on 2007-09-07T08:50:28

I don't mean for $log, I mean for "category => '...'.

That implies you are keeping inside Log::Any a hash of some sort with the class as the key and an object or log class as the value. That's what I meant by the class-level storage.

As for ::Any, I notice that Log::Abstract seems to be unused at the moment, if you do a CPAN search for "Abstract" and have a read through some of the modules there, there is a precent for this general type of thing.

Re:I'm not a fan of the syntax...

jonswar on 2007-09-07T18:02:34

> As for ::Any, I notice that Log::Abstract seems to be unused at the moment, if you do a
> CPAN search for "Abstract" and have a read through some of the modules there, there is a
> precent for this general type of thing.

Thanks, I like that name much better! "Any" has always seemed rather...indecisive. I'm going to retroactively change this in the journal entry.

> I don't mean for $log, I mean for "category => '...'.
>
> That implies you are keeping inside Log::Any a hash of some sort with the class as the key
> and an object or log class as the value. That's what I meant by the class-level storage.

So the question is whether get_logger() automatically memoizes based on the category name, or whether this is the responsibility of the underlying logging package and/or the code registered in set_logger().

I had leaned towards memoizing, because most logging packages will not handle this for you (log4perl being the exception) and it seems like the expected behavior.

I wonder what the point would be of creating multiple Log::Abstract loggers with the same category anyway, given that category is the only data upon which the application has to decide what logging package and settings will be used. That is, given two calls

      my $logger1 = Log::Abstract->get_logger("foo");
      my $logger2 = Log::Abstract->get_logger("foo");

wouldn't these pretty much have to return the same logger, unless the routine passed to set_logger() was changing its behavior over time?

Remember that set_logger can be called more than once in an application to effectively change all existing abstract loggers. So there is little reason for the routine passed to set_logger() to be dynamic across anything other than category.

I hope that makes sense.

Jon

Some nits to pick

autarch on 2007-09-07T05:14:49

Generally, I think this is a good idea, but I have some nits to pick.

First, the set_logger() method is not actually setting a logger at all, which is confusing. It's setting a callback that returns a logger, so maybe it should be set_logger_factory() or something like that. This is particularly problematic since there's a get_logger() method, which doesn't return the thing passed to set_logger()!

I'm not sure that passing in a sub ref is the best API either. My gut feeling is that "the average programmer" isn't really comfortable with subroutine references. OTOH, I see the need for being able to pass in a log making strategy, as opposed to a single log object.

Maybe there could be a super-simple API where you pass in a log object, and a more complex API where you pass in a sub ref or logger factory class.

I also wonder how this package-level globalness will work. What if I have a CPAN module that wants to set the Log::Any logger, and I want to use that module in my app but set a different logger? Is it last set wins?

Imagine the problems this could cause under mod_perl. Maybe the answer for mod_perl is that your app needs to set the logger on every request, not just once at startup time.

Re:Some nits to pick

fireartist on 2007-09-07T08:07:12

I got the impression that all CPAN modules should use the logging interface, but not set any logger themselves.
So the only place the logger would be set would be in your 'top level' application, or framework (catalyst, jifty, etc).

Re:Some nits to pick

jonswar on 2007-09-07T18:14:28

That's exactly right. Since the setting of Log::Abstract is (by design) global across the application, it would be considered bad behavior for an arbitrary CPAN module to muck with it, in the same way it would be bad behavior to unset $^W, or clear @ARGV, or read from STDIN - unless that was part of the proscribed behavior of the module.

Re:Some nits to pick

jonswar on 2007-09-07T18:34:56

> First, the set_logger() method is not actually setting a logger at all, which is
> confusing. It's setting a callback that returns a logger, so maybe it should be
> set_logger_factory() or something like that. This is particularly problematic since
> there's a get_logger() method, which doesn't return the thing passed to set_logger()!

You're right, set_logger() should be renamed, though I'm not sure I can bear to use "factory". :)

>
> I'm not sure that passing in a sub ref is the best API either. My gut feeling is that "the
> average programmer" isn't really comfortable with subroutine references. OTOH, I see the
> need for being able to pass in a log making strategy, as opposed to a single log object.
>
> Maybe there could be a super-simple API where you pass in a log object, and a more complex
> API where you pass in a sub ref or logger factory class.

Unfortunately, a super-simple API wouldn't work for log4perl.

Perhaps set_logger_factory() or whatever it is ultimately called should be the low-level interface, and logging frameworks can provide higher-level mixins to make things easy for the average user. e.g.

      use Log::Dispatch; # this also defines Log::Abstract::use_log_dispatch
      my $dispatcher = Log::Dispatch::File->new(...);
      Log::Abstract->use_log_dispatch($dispatcher);

      use Log::Log4perl; # this also defines Log::Abstract::use_log4perl
      Log::Abstract->use_log4perl(); # no argument needed

>
> I also wonder how this package-level globalness will work. What if I have a CPAN module
> that wants to set the Log::Any logger, and I want to use that module in my app but set a
> different logger? Is it last set wins?
>
> Imagine the problems this could cause under mod_perl. Maybe the answer for mod_perl is
> that your app needs to set the logger on every request, not just once at startup time.

I just think we have to trust modules not to muck with global state that they aren't meant to muck with. mod_perl and other persistent environments depend on this in a variety of ways.

And yes, I think the last set has to win, to allow applications to modify the setting in mid-stream.

Jon

Re:Some nits to pick

Aristotle on 2007-09-08T17:46:30

You’re right, set_logger() should be renamed, though I’m not sure I can bear to use “factory”. :)

The name you are looking for is register_logger.

Any

acme on 2007-09-07T08:43:03

I'm not really a fan of the ::Any namespace - JSON::Any being particularly bad as all the JSON modules have different quirks and JSON::XS is clearly way above the rest. Why couldn't we just pick one logging module as good enough and call it best practice?

Re:Any

Alias on 2007-09-07T08:53:17

I believe the optimum number of CPAN modules for any single topic is seven :)

There's also a Log::Tiny module potentially coming (not from me, I just got the email asking to review it, and it's being munged to bring it up to ::Tiny spec at the moment).

On a side note of trivia, this is the second time I've had a proposed Log::Tiny one, in the end the first person decided he wasn't willing to cripple it enough to meet the criteria, so it ended up being called something else.

Re:Any

jonswar on 2007-09-07T18:18:25

This project arose from trying to pick a logging scheme for Mason. I decided pretty quickly to standardize on log4perl, until I talked to Dave, who understandably demanded consideration for Log::Dispatch users.

I figure if a single little project can't even decide on a best logging module, it'll be damned near impossible to get the whole community to agree. :)

why the "if" clause?

thepler on 2007-09-07T21:54:57

$log->debug("a debug message") if $log->is_debug();

I'm not sure why you want the "if" clause. I'd prefer to have the debug method just do nothing unless debugging is enabled. If you're trying to avoid a method call, you'd want:

$log->debug("a debug message") if $is_debug;

though personally I'd just prefer:

$log->debug("a debug message");

Re:why the "if" clause?

jonswar on 2007-09-08T03:37:22

You're right, that was a poorly chosen example. Yes, $log->debug() won't do anything unless debugging was enabled. Here's a more appropriate use of the conditional:

        $log->debug("current args: " . Dumper(\@_))
              if $log->is_debug();

The idea is to use the conditional when you don't want to compute the arguments (in this case, Dumper()) unnecessarily.

Re:why the "if" clause?

thepler on 2007-09-11T11:49:13

I seem to recall that one of the other logging modules (I forget which) uses a sub for that.

$log->debug(sub {"current args: " . Dumper(\@_)});

Great work, and can I have stealth loggers?

Perlmeister on 2007-09-08T20:32:58

Hi Jonathan, you're addressing something really important here.

It is really unfortunate that important CPAN modules like LWP or Rose::DB don't have usable logging mechanisms in place. Once you're used to category-based logging it's hard to go back to these home-grown and less convenient solutions.

I think it's a great idea to promote standardized logging on CPAN and letting the user choose the actual implementation will definitely help to make strides towards that goal.

By the way, I don't agree that "For small modules that want to minimize dependencies, depending on Log4perl (for example) is a non-starter." The Log4perl core doesn't have dependencies other than perl's core modules.

One thing missing from your proposal is Log4perl's :easy mode. If you think about it, getting a logger and calling its method is a lot of typing, given that you just want to log something. That's why in Log4perl you can use

DEBUG "Log this!";

and you're done. No getting a logger, no method calling, no category passing. It's all automatic. Behind the scenes, it gets a 'stealth logger' with the category set to the current package and calls the appropriate logging method on it. This feature is huge. I'm using it almost exclusively in everything I write.

Any chance of adding that?

One thing to watch out for is performance. People don't want their applications to slow down just because a module has a logging capability. Ideally, it could just be de-compiled, Log4perl has an experimental feature called resurrect which does something along these lines, although it's somewhat hacky.

Keep up the great work!

-- Mike Schilli

Re:Great work, and can I have stealth loggers?

jonswar on 2007-09-09T04:17:20

> By the way, I don't agree that "For small modules that want to
> minimize dependencies, depending on Log4perl (for example) is a
> non-starter." The Log4perl core doesn't have dependencies other
> than perl's core modules.

Fair enough.

>
> One thing missing from your proposal is Log4perl's :easy mode
> [sourceforge.net]. If you think about it, getting a logger and
> calling its method is a lot of typing, given that you just want
> to log something. That's why in Log4perl you can use
>
> DEBUG "Log this!";
>
> and you're done. No getting a logger, no method calling, no
> category passing. It's all automatic. Behind the scenes, it gets
> a 'stealth logger' with the category set to the current package
> and calls the appropriate logging method on it. This feature is
> huge. I'm using it almost exclusively in everything I write.
>
> Any chance of adding that?

Did you see this in the posting?

        As a convenient shorthand, you can use

                        package Foo;
                        use Log::Abstract qw($log);

        to create the logger, which is equivalent to the first example except that $log is
        (necessarily) a package-scoped rather than lexical variable.

So this creates a logger for you with the category set to the current package, similar to easy mode. The syntax is pretty minimal. The problem I have with the DEBUG etc keywords is that they promote inefficient behavior. e.g.

        DEBUG "Current arguments: " . Dumper(\@_);

will take the performance hit for Dumper() even when debug logging isn't turned on. This may be fine for a particular application where performance is not an issue, but I would never want to encourage any CPAN module author to do this.

In fact, in this thread on perl.module-authors,

        http://groups.google.com/group/perl.module-authors/msg/cbd5a168d5d780a8?hl=en&

it was suggested that even the usual "fast" $log->is_debug() check is too much of a hit. So I proposed we export a live $log_is_debug variable that would change dynamically to reflect the current level.

I suppose one could combine the syntaxes:

        DEBUG "Current arguments: " . Dumper(\@_) if $log_is_debug;

but at this point you are only saving 6 characters or so.

>
> One thing to watch out for is performance. People don't want
> their applications to slow down just because a module has a
> logging capability.

Exactly my point above. :)

Thanks for your positive feedback! I will definitely need to have the support of major logging frameworks such as log4perl for this module to be a success.

Re:Great work, and can I have stealth loggers?

Perlmeister on 2007-09-09T06:16:06

> So this creates a logger for you with the category set to the current
> package, similar to easy mode. The syntax is pretty minimal.

Ah, gotcha. So you would call

        $log->debug(...)

instead of

        DEBUG "..."

which is probably acceptable in terms of additional key strokes
(although it might throw off novices).

> The problem I have with the DEBUG etc keywords is that they promote
>inefficient behavior. e.g.
>
> DEBUG "Current arguments: " . Dumper(\@_);
>
> will take the performance hit for Dumper() even when debug logging
> isn't turned on.

Sure, although I don't see how

        $log->debug("Current arguments: " . Dumper(\@_));

is any different. By the way, in Log4perl you can pass a subroutine ref
to the method, which eliminates the problem:

        DEBUG sub { "Current arguments: " . Dumper(\@_) };

Efficient, but ugly :).

While you're at it, here's my pipe dream: I want something like Dtrace,
where the logging framework zeroes out the opcodes that are currently
inactive, and the interpreter rushes through them at light speed with
practically no overhead.

Re:Great work, and can I have stealth loggers?

Aristotle on 2007-09-09T15:47:51

Assertions were going to make that possible in 5.10.

Unfortunately they were recently removed from bleadperl. :-(

Any updates on this?

Hercynium on 2008-01-24T20:22:14

I just stumbled across this thread, and I must admit - it's exactly what I was thinking about for the past few weeks... I need logging, but don't want to tie my users to un-wanted dependencies. Better yet, is that the users could *choose* their preferred logging deps.

Is there any code out in the wild yet? I may have a few CPAN-bound uses for it!