All about blessed stash objects: better living through evil

scrottie on 2007-12-26T01:34:47

Typically, when you write an object in Perl, you combine two disjoint things: data, as stored in a hash, and methods (code), stored in a package. bless makes this association. Any reference type can be blessed, but these references may only be blessed into a package. Here's the trick: packages are data types in Perl, one of the 15-odd types, and you can take a reference to them: \%{"foo::"}. This means that packages can be blessed into packages. This trick involves resurrecting some of the oldest data structures in Perl, globs and stashes, for modern, OO purposes. A whole lot of fun ensues.
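A minimal sketch of that trick, for the curious. The package name Foo and its hello method are made up for illustration. It takes a reference to a stash two ways (literal name and computed name) and then blesses the stash into its own package.

```perl
use strict;
use warnings;

package Foo;    # hypothetical package, just for this demo
sub hello { my $self = shift; return 'hello from ' . ref $self }

package main;

# A literal stash name is not a symbolic reference, so strict is happy.
my $literal = \%Foo::;

# A computed stash name is a symbolic reference and needs strict 'refs' relaxed.
my $name     = 'Foo';
my $computed = do { no strict 'refs'; \%{"${name}::"} };

# Both references point at the same symbol table hash.
print "same stash\n" if $literal == $computed;

# The stash is just another reference target, so it can be blessed into itself.
my $obj = bless \%Foo::, 'Foo';
print $obj->hello, "\n";    # hello from Foo
```

Nothing here is specific to Foo; any package name should behave the same way.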

Okay, I'm going to stop calling packages packages; they're created with the package statement, but they're better known as stashes, or symbol table hashes.

Like hashes, stashes contain data of various types, indexed by name. When you don't use my to declare your variables, variables are also stored in the package, as was the way in the olden days. Functions declared like sub foo { ... }, the common way, also get stored in the stash. Normal function calls and normal method calls (like foo() and $ob->foo, and unlike $hash{value}->()) all operate on stashes.
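To see this for yourself, here is a small sketch that peeks at a stash's keys. The package StashDemo and its variable names are invented for the example. Package variables and named subs show up as keys; the lexical does not.

```perl
use strict;
use warnings;

package StashDemo;          # hypothetical package, just for this demo
our $count  = 3;            # package variable: lives in the stash
our @items  = (1, 2);       # another stash entry under another name
my  $hidden = 'x';          # lexical: never touches the stash
sub greet { return 'hi' }   # named sub: also lives in the stash

package main;

# Pick out the names we care about from whatever else the stash holds.
my @names = sort grep { /^(?:count|items|hidden|greet)$/ } keys %StashDemo::;
print "@names\n";           # count greet items
```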

Newly constructed blessed stash objects are empty of methods. Code references get copied in, initializing them as a copy of another object's code. This is a "prototype based object system", as is JavaScript's. JavaScript objects are hashes, with the key being the method name and the value being code. Since each object has (indeed, is) its very own stash, we can define our methods in terms of closures.

Here's some code to create one of these puppies, and to create a method function that will neatly stick closures into the stash for you. This is old code I've posted before; sorry for the dup. I'm trying to turn this into a more accessible article.

sub new {

    # object setup (evil, run)
    my $type = shift;
    my %opts = @_;
    my $package = $type . sprintf '::X%09d', our $counter++;
    do { no strict 'refs'; push @{$package.'::ISA'}, $type; };
    my $self = bless \%{$package.'::'}, $package;
    sub method ($&);
    do { no warnings 'redefine'; *method = sub ($&) { my $name = shift; *{"$package\::$name"} = shift; }; };

    ...

}


Then inside there, you can write methods like so:



my $arg;

method foo => sub { my $self = shift; $arg++; $self->avast_ye(); };


$arg is a lexical variable that the method foo closes over. Each time new gets called, a new stash is created, and a new $arg gets created, and a new coderef attached to that new $arg gets created and rammed into that new stash. New everything, each go -- that's the trick.
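Here's a cut-down, self-contained sketch of the "new everything, each go" idea. The class name Counter and the methods bump and describe are made up for the demo, and it installs the closure with a plain glob assignment rather than the method helper. Each object gets its own stash, its own $arg, and its own closure over that $arg.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, just for this demo

our $counter = 0;   # serial number used to build unique package names

sub new {
    my $type    = shift;
    my $package = $type . sprintf '::X%09d', $counter++;

    no strict 'refs';                      # computed package and glob names below
    push @{ $package . '::ISA' }, $type;   # inherit from the base class
    my $self = bless \%{ $package . '::' }, $package;

    my $arg = 0;                           # private to this one object

    # Ram a fresh closure into this object's very own stash.
    *{"${package}::bump"} = sub { my $self = shift; return ++$arg };

    return $self;
}

# An ordinary inherited method that calls the per-object closure.
sub describe { my $self = shift; return 'count is ' . $self->bump }

package main;

my $first  = Counter->new;
my $second = Counter->new;
$first->bump for 1 .. 3;

print $first->describe,  "\n";   # count is 4
print $second->describe, "\n";   # count is 1
```

The two objects never see each other's $arg, because each closure was compiled against a different pad entry when new ran.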

When you write "package" in your code, you're defining a new stash. They also autovivify (spontaneously spring into existence by their mere mention). That looks like %{"foo::"}. Yes, that's similar to a computed hash name (and also requires no strict 'refs') but the name ends in a double colon.

my $package = $type . sprintf '::X%09d', our $counter++; -- this computes a new package name based on the existing one plus a serial number.

do { no strict 'refs'; push @{$package.'::ISA'}, $type; }; -- this forces the new package to inherit from the base one, so that it in turn inherits what it inherits.

my $self = bless \%{$package.'::'}, $package; -- this creates the object as this new stash blessed into itself.

sub method ($&); do { no warnings 'redefine'; *method = sub ($&) { my $name = shift; *{"$package\::$name"} = shift; }; }; -- prototype the method function to take as args a scalar and code, and define it as stuffing that code into the stash under the given name. That's the glob syntax. Stashes contain globs. Assigning a reference into a glob fills the slot for that reference's type, so by assigning in a code reference, a new method is created.
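The forward declaration plus runtime rebinding can be tried in isolation. This is only a sketch: the Widget package, the build function, and the Widget::One / Widget::Two names are invented for the demo. The forward declaration lets "method name => sub { ... }" parse at compile time; the glob assignment then swaps in a closure that knows which package to install into.

```perl
use strict;
use warnings;

package Widget;    # hypothetical package, just for this demo

sub method ($&);   # forward declaration so the calls below parse with the prototype

sub build {
    my ($package) = @_;
    no strict 'refs';
    no warnings 'redefine';

    # Rebind the helper so it closes over this particular package name.
    *method = sub ($&) {
        my ($name, $code) = @_;
        *{"${package}::$name"} = $code;
    };

    # Thanks to the prototype, this reads almost like a declaration.
    method hello => sub { return "hello from $package" };
    return;
}

package main;

Widget::build('Widget::One');
Widget::build('Widget::Two');
print Widget::One->hello, "\n";   # hello from Widget::One
print Widget::Two->hello, "\n";   # hello from Widget::Two
```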

Stashes are a derivative of hashes, but rather than containing arbitrary types, they only contain typeglobs, which may in turn contain any other type. This way, you can have both a $foo and a @foo, as a glob in a stash can in turn hold one of every other type.
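A quick sketch of one name holding several things at once, using the *glob{THING} syntax to pull out individual slots. The package Slots and the name foo are arbitrary picks for the example.

```perl
use strict;
use warnings;

package Slots;     # hypothetical package, just for this demo
our $foo = 'scalar value';
our @foo = (1, 2, 3);
sub foo { return 'code value' }

package main;

# One name, one typeglob, one slot per data type.
my $glob = \*Slots::foo;

my $scalar_ref = *{$glob}{SCALAR};
my $array_ref  = *{$glob}{ARRAY};
my $code_ref   = *{$glob}{CODE};

print "$$scalar_ref\n";             # scalar value
print scalar(@$array_ref), "\n";    # 3
print $code_ref->(), "\n";          # code value
```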

Instance data is then hidden away in lexical variables where subclasses can't see it. That's not always what's desired. In a blessed hash, you could write $self->{foo} to get at a data item. Since stashes only contain globs, you'd have to instead write ${ $self->{foo} }. To access an array stored in a normal blessed hash, you'd write @{ $self->{foo} }, which is the same for blessed stashes. Everything is stored by reference, including scalars, in blessed stashes. Data::Alias can make this a lot easier:

    use Data::Alias;
    method foo => sub {
        my $self = shift;
        alias my @foo = @{ $self->{foo} };
        push @foo, @_;
    };
alias gives you fully read-write variables that are aliases to data stored in the object.

Data stored in $self is actually stored the same way that local data is stored. Only the syntax is different.
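Data::Alias lives on CPAN, so here's a core-only sketch of that last point. The package Box and its label and items variables are made up for the demo, and the variables are declared with our so the globs already exist in the stash. Writing through $self and reading the plain package variables hit the same storage.

```perl
use strict;
use warnings;

package Box;    # hypothetical package, just for this demo
our $label;     # creates the glob *Box::label, so the stash entry exists
our @items;     # likewise for *Box::items

package main;

my $self = bless \%Box::, 'Box';

# Through the object: stash entry -> glob -> slot.
${ $self->{label} } = 'fragile';
push @{ $self->{items} }, 'cup', 'plate';

# The very same data under its ordinary package-variable names.
print $Box::label, ': ', "@Box::items", "\n";   # fragile: cup plate
```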

Before you get started, a few caveats: methods built out of closures (which access instance data simply as $foo rather than $self->{foo}) take a lot more memory than normal methods. Stashes don't get garbage collected at all; data stored in them is considered "global", and this data includes references to the closures and references to lexical variables. It probably contains circular references. You may think about writing a DESTROY routine to tear everything down.

-scott


Package::Generator

rjbs on 2007-12-26T18:58:10

I have from time to time used Package::Generator to facilitate similar or related forms of evil. It helps hide the scary ick behind a reasonable looking method call.

Minor erratum

Aristotle on 2007-12-29T14:03:04

That looks like %{"foo::"}. Yes, that’s similar to a computed hash name (and also requires no strict 'refs') but the name ends in a double colon.

Actually, you can write just %foo:: if you’re not using computed stash names. That will work just fine under strictures.