namespace::clean off to the CPAN

phaylon on 2007-02-19T23:00:27

Yesterday (at least I think it was yesterday, excuse the laziness) I released a pragma called namespace::clean. The short explanation for what this module does is this: when use'd, it builds a list of the functions currently in the package and installs a handler that is called after the requesting module's compile time (via Filter::EOF), in which it removes those entries from the symbol table.
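
The first step, building the list of functions that currently exist in a package, can be sketched in a few lines. This only illustrates walking a package's symbol table hash; it is not the code namespace::clean ships, and the helper name defined_functions is made up for the example.

```perl
use strict;
use warnings;

package Foo;
use Carp qw( croak );           # imports croak into Foo at compile time

sub double { return $_[0] * 2 }

package main;

# Hypothetical helper (not part of namespace::clean): return the names
# of all subroutines currently defined in the given package.
sub defined_functions {
    my ($package) = @_;
    no strict 'refs';           # symbolic access to the stash is needed here
    my @names = grep { defined &{"${package}::$_"} }
                grep { !/::\z/ }    # skip nested packages such as "Bar::"
                keys %{"${package}::"};
    return sort @names;
}

my @functions = defined_functions('Foo');
print "@functions\n";           # the list should include croak and double
```

A module working this way would take the snapshot at the point where it is use'd, so functions defined later in the file (such as new) would not be on the list.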

The long explanation is that you can import functions into your namespace, or even declare your own functions, that won't show up as methods when called on your classes and instances. Previously, it always felt ugly to me to import functions (read: Carp, aliased, Scalar::Util & Co, ...) into a package's namespace, so I called them by their full name, for example, Scalar::Util::weaken(). The same ugliness overwhelms me every time I put (properly documented) functions in OO packages, so a lot of them end up as methods, even if they don't need to be. That's where this module joins the game.

Here is a simple demonstration: Let's define a usual object oriented class:

package Foo;
use warnings;
use strict;

use Carp qw( croak );

sub double {
    my $number = shift;
    return $number * 2;
}

sub new {
    my ($class, $number) = @_;
    bless {number => double($number)}, $class;
}

1;

This package will contain three functions: croak, double and new. But we actually just want new available as a method. Not because of coworkers, since they should refer to the hopefully complete and well written documentation, but to prevent conflicts with sub- and future base-classes.

Anyway, after compile-time the calls to the functions are already bound in the code and the actual symbol table entries aren't needed anymore. So we put a line reading use namespace::clean; in our code at the point where all the functions to be removed have been defined:

package Foo;
use warnings;
use strict;

use Carp qw( croak );

sub double {
    my $number = shift;
    return $number * 2;
}

use namespace::clean;

sub new {
    my ($class, $number) = @_;
    bless {number => double($number)}, $class;
}

1;

After this, the following will work correctly:

my $foo = Foo->new(23);
print $foo->{number}; # prints 46

But these will both return undef:

my $can_double = $foo->can('double');
my $can_croak  = $foo->can('croak');
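
The same effect can be reproduced by hand, which may help when reasoning about what the module does. The sketch below assumes the mechanism is a plain delete on the package's stash at run time, after everything has been compiled; it is not namespace::clean's actual implementation.

```perl
use strict;
use warnings;

package Foo;
use Carp qw( croak );

sub double {
    my $number = shift;
    return $number * 2;
}

sub new {
    my ($class, $number) = @_;
    return bless { number => double($number) }, $class;
}

package main;

# This runs after the whole file has been compiled, so the call to
# double() inside new() has already been resolved.
delete $Foo::{$_} for qw( croak double );

my $foo        = Foo->new(23);
my $can_double = $foo->can('double');
my $can_croak  = $foo->can('croak');

print $foo->{number}, "\n";
print $can_double ? "can double\n" : "cannot double\n";
print $can_croak  ? "can croak\n"  : "cannot croak\n";
```

With the entries deleted, the constructor should still produce 46, while both can() calls come back false.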

So, that's it. I'd very much appreciate some comments. I'm currently thinking about no namespace::clean; and unimport magick for being able to build excludable parts.

Update: 0.02 has just entered the PAUSE and will be available soon.


Huh?

chromatic on 2007-02-20T00:37:45

Anyway, after compile-time the calls to the functions are already bound in the code and the actual symbol table entries aren't needed anymore.

I don't buy it:

$ perl -MO=Concise,new
sub double {
    my $number = shift;
    return $number * 2;
}

sub new {
    my ($class, $number) = @_;
    bless {number => double($number)}, $class;
}

main::new:
k  <1> leavesub[1 ref] K/REFC,1 ->(end)
-     <@> lineseq KP ->k
1        <;> nextstate(main 3 -:7) v ->2
8        <2> aassign[t5] vKS ->9
-           <1> ex-list lK ->5
2              <0> pushmark s ->3
4              <1> rv2av[t4] lK/1 ->5
3                 <#> gv[*_] s ->4
-           <1> ex-list lKPRM*/128 ->8
5              <0> pushmark sRM*/128 ->6
6              <0> padsv[$class:3,4] lRM*/LVINTRO ->7
7              <0> padsv[$number:3,4] lRM*/LVINTRO ->8
9        <;> nextstate(main 4 -:8) v ->a
j        <@> bless sK/2 ->k
-           <0> ex-pushmark s ->a
h           <1> srefgen sK/1 ->i
-              <1> ex-list lKRM ->h
g                 <@> anonhash sKRM/1 ->h
a                    <0> pushmark s ->b
b                    <$> const[PV "number"] s/BARE ->c
f                    <1> entersub[t7] lKS/TARG,1 ->g
-                       <1> ex-list lK ->f
c                          <0> pushmark s ->d
d                          <0> padsv[$number:3,4] lM ->e
-                          <1> ex-rv2cv sK/1 ->-
e                             <#> gv[*double] s ->f
i           <0> padsv[$class:3,4] s ->j
- syntax OK

How does that gv lookup find double() if not in the symbol table?

Re:Huh?

phaylon on 2007-02-20T01:05:26

Hrm, point for you. Bad wording on my side. What's the thing called that methods are looked up from, then? :)

I honestly thought a delete $Foo::{bar} will remove the symbol table entry.

Thanks again!

Re:Huh?

Aristotle on 2007-02-20T23:49:42

You misunderstood what chromatic said. He demonstrated that Perl does not have compile-time binding, contrary to what you are saying. Decompiling shows that the invocation of double in the new method is indirected via the package. If you remove the entry for double from the package, the function call will NOT work.

Sorry, but your approach won’t work.

Here’s a pattern for you to read carefully and chew on:

package Foo::Bar::Internals;

use Carp qw( croak );

sub double { ... }

sub Foo::Bar::new { ... }

1;
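
For readers who want to try the pattern, here is a filled-in sketch. The bodies of double and new are assumptions borrowed from the original post's Foo example, since the snippet above elides them.

```perl
use strict;
use warnings;

package Foo::Bar::Internals;

use Carp qw( croak );

# Helper functions live in the Internals package only.
sub double {
    my $number = shift;
    croak 'a number is required' unless defined $number;
    return $number * 2;
}

# The fully qualified name installs the method in Foo::Bar, while the
# body is compiled in Foo::Bar::Internals and so can call double() directly.
sub Foo::Bar::new {
    my ($class, $number) = @_;
    return bless { number => double($number) }, $class;
}

package main;

my $foo        = Foo::Bar->new(23);
my $can_double = Foo::Bar->can('double') ? 1 : 0;
my $can_croak  = Foo::Bar->can('croak')  ? 1 : 0;

print "$foo->{number} $can_double $can_croak\n";   # prints "46 0 0"
```

Because double and croak never enter the Foo::Bar package, nothing has to be removed afterwards.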

Re:Huh?

phaylon on 2007-02-21T11:31:45

So, you're saying what? I can understand that as either

  • Bad wording on my side, and it's not removed from the symbol table, but from some other table used for method lookups.
  • I'm doing voodoo, because the things I do in the test cases can't work.

And btw: You call that a pattern? I call that a work-around :) And, just FYI, you might want to stay away from patronising phrases like "Here’s a pattern for you to read carefully and chew on." Because it really decreases my motivation to answer.

Re:Huh?

Aristotle on 2007-02-21T12:44:36

You said:

So, you’re saying what?

It’s bad wording on your part to say the functions are bound in the code, because they’re not; they’re always looked up from the symbol table.

Interestingly, what you’re doing shouldn’t work – but it does! Apparently the %main::-type hashes aren’t actually an interface to the symbol table, they’re just a one-way mirror:

sub xx {
    my $in_stbl = *main::foo{CODE} ? 1 : 0;
    my $in_hash = main->can("foo") ? 1 : 0;
    print "$in_stbl $in_hash\n";
}

xx();

eval "sub foo {1}";
xx();

delete $main::{foo};
xx();

eval "sub foo {1}";
xx();

__END__
0 0
1 1
1 0
1 1

In fact you won’t even get the Subroutine %s redefined warning if you have deleted the entry from the hash and redefine the routine.

But this is clearly inconsistent with the documentation. perlmod says:

The symbol table for a package happens to be stored in the hash of that name with two colons appended. […] The value in each entry of the hash is what you are referring to when you use the *name typeglob notation.

Obviously not. Smells strongly like a bug to me (at the very least like a doc bug), not like behaviour that should be relied on. Someone alert the porters…

You said:

You call that a pattern? I call that a work-around :)

Errm, that’s what patterns are: formulaic workarounds for deficiencies in a language.

And hey:

  • It doesn’t add yet another dependency to my code.
  • It is obvious, self-documenting. You don’t need to read module docs to understand what’s going on.
  • That it works does not contradict any Perl 5 docs.

Besides, if you call that a workaround, then pretty nearly every way of doing OO in Perl 5 is a workaround of some sort. (More precisely, you need some form of pattern for even the simplest OOP approach in Perl 5.) We’ve just gotten so used to it by now that we don’t notice.

You said:

you might want to stay away from patronising phrases

Maybe I wrote that just because the example I gave is less than obvious in my opinion. I could say you might want to stay away from reading too much into others’ utterances, but that shall be your call.

Re:Huh?

phaylon on 2007-02-21T13:02:45

Interestingly, what you’re doing shouldn’t work – but it does!

Of course. Did you honestly think I sent some code off to CPAN without even _trying_ first? :)

Obviously not. Smells strongly like a bug to me (at the very least like a doc bug), not like behaviour that should be relied on. Someone alert the porters…

Maybe I will post it there if nobody else does.

Errm, that’s what patterns are […]

That might be what you (and MJD) think, but I clearly see a difference between using an iterator pattern and creating two packages for one class.

Besides, if you call that a workaround, then pretty nearly every way of doing OO in Perl 5 is a workaround of some sort.

Well then. To stop the rhetoric ride, I'll skip fast forward: Every program is a pattern and a sum of patterns.

Maybe I wrote that just because the example I gave is less than obvious in my opinion. I could say you might want to stay away from reading too much into others’ utterances, but that shall be your call.

Exactly, and the above was my response. It just sounded a bit too much like "Don't do that. Behold! Here's the right way" to me. Besides, if _that_ example is less than obvious, how can you state above that this approach is "obvious, self-documenting" and "You don’t need to read module docs to understand what’s going on."?

If I misunderstood something, I'm honestly sorry, I just say how I received it.

Re:Huh?

Aristotle on 2007-02-21T14:47:48

Every program is a pattern and a sum of patterns.

No, it’s not. A pattern is a common, complex arrangement of language primitives that has to be aligned just so in order to work; which arrangement addresses a particular problem commonly encountered when using the language.

Far from everything in a program meets this definition. Most notably, the solution to the problem addressed by the program in its essence is not a pattern, by definition, although if it’s non-trivial you will often employ patterns (ie. artifices) while solving the problem.

I clearly see a difference between using an iterator pattern and creating two packages for one class.

Sure, iterators are a profound construct and a workaround for poor Perl 5 semantics isn’t, but in terms of their “pattern-ness” they are equals. It’s easy to imagine a language which has direct syntactic and semantic support for declaring certain things methods vs functions and public vs private; likewise it’s easy to imagine a language which has direct syntactic and semantic support for iterators so that you don’t have to compose them from other primitives.

if that example is less than obvious, how can you state above that this approach is “obvious, self-documenting” and “You don’t need to read module docs to understand what’s going on”?

In the example I gave (which I wrote that way to stay close to the one you gave), there’s only a single sub which kinda gets lost in the noise of 5 other declarations and it might be easy to miss the package name mismatch. If there are 10 subs written with a fully qualified package name and they’re all grouped together away from the utility functions, then it’s much easier to notice that something special is going on. That’s why I claimed that it’s self-documenting.

I don’t know how to explain why I claim that you don’t need to read module docs to understand it, because, well… you don’t need to read docs where no module is involved and the semantics of the pattern are self-contained. If that’s not obvious to you, then I don’t know what I can say to help.

Re:Huh?

phaylon on 2007-02-21T14:56:26

Well, everything's said I guess and it would be useless to repeat it, especially since we're both drifting off in the ad-hominem area.

Re:Huh?

rafael on 2007-02-21T14:32:57

Replace delete by undef in your example, and it works. You can't delete subs from a stash in Perl 5.8, only undef them. If you use delete, the sub still exists (you can use "exists &foo" to test for it). That can be considered as a bug.

CPAN modules are not pragmas

perrin on 2007-02-20T20:01:45

I wish you had given this normal capitalization and not called it a pragma. Pragmas by definition are shipped with core perl. This may be a useful module, but calling it a pragma only muddies the waters.

Re:CPAN modules are not pragmas

phaylon on 2007-02-21T11:46:31

It hadn't occurred to me yet and, in fact, I had to look for a pretty long time before I found the term "default module" in perlglossary for pragma. I think you mean that sentence. That was a fault on my part, so sorry for that.

But there seems to be oil all over the ocean already: autorequire, version (will be ok then in 5.9 I guess), fake, all, …

Re:CPAN modules are not pragmas

perrin on 2007-02-21T16:09:05

I don't mean to single you out. You're certainly not the only one to do it. I think it's better to stick with the standard though, and avoid calling CPAN modules pragmas or naming them all lowercase.

Re:CPAN modules are not pragmas

phaylon on 2007-02-21T16:17:13

No worries, I'm not feeling singled out :) Though in my opinion it would have been more confusing to name it Namespace::Clean and have it affect the local package the way it does. I understand the term 'pragma' more as 'altering behaviour or environment' vs. the 'providing code' situation of library modules, and I have the impression that many others do too.

This is a bug

jjore on 2007-02-21T19:15:30

It is a bug that this works. It's only semi-unlikely to get fixed. I've things I'd like to do that need this bug to get fixed, so I may spend some time to ensure that this bug "works" appropriately.

There's two related bugs here. If you localize or delete the stash a function exists in, the GV pointed to by the pp_gv opcode has a pointer to the original stash. This makes it oblivious to wholesale replacement of the stash.

It appears the other bug is that the pp_gv opcode has a pointer to the GV and retains it even if the stash no longer contains that GV.

Both of these bugs are preventing two kinds of dynamisms from working. I've an interest in fixing one and it might make sense to fix both at the same time.

sub foo { 42 }
delete $main::{foo};
print foo(); # This had *better* die
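
A small script can contrast a call that was compiled before the deletion with lookups made by name at run time. It reflects the behaviour described in this thread; a perl that fixes the bugs above would give different results. The variable names are only for illustration.

```perl
use strict;
use warnings;

sub foo { 42 }

# Remove the stash entry at run time, after the call below was compiled.
delete $main::{foo};

# Compiled call: the op tree still holds on to the original sub.
my $static = eval { foo() };

# Run-time lookups by name go through the stash and find nothing.
# The name is kept in a variable so the lookup is not resolved at compile time.
my $name     = 'main::foo';
my $via_can  = main->can('foo') ? 1 : 0;
my $via_name = do { no strict 'refs'; defined &{$name} ? 1 : 0 };

print defined $static ? "static call returned $static\n"
                      : "static call died\n";
print "can: $via_can, symbolic: $via_name\n";
```

The compiled call keeps working while both name-based lookups fail, which is the asymmetry the module relies on.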