Object::ID - A unique object identifier for any object

schwern on 2010-05-02T17:17:01

Something Perl's OO has been missing is a reliable way to identify an object. Is $this the same as $that? Not asking if it contains the same information, but is it a reference to the same object? Have we seen it before? When I alter $this will I also be changing $that?

package Foo;

use Object::ID;

...write the class however you want...


Really, HOWEVER YOU WANT! Inside out, outside in, code refs, regexes, globs, Moose, Mouse... Call the constructor whatever you like, add in a DESTROY method. Doesn't matter, it'll work.

my $id   = $obj->object_id;
my $uuid = $obj->object_uuid;


object_id() is a cheap, process-specific identifier. object_uuid() is a bit more expensive on first call (it has to generate the UUID, about 30% slower) but it should be universally unique across machines and processes.

That's great for YOUR objects, but what about everyone else? You can either inject the Object::ID role one class at a time...

package DateTime;
use Object::ID;

my $date = DateTime->now;
say $date->object_id;


or you can load UNIVERSAL::Object::ID and every object has it. EVERY OBJECT! Even things you don't realize are objects.

use UNIVERSAL::Object::ID;

# Regexes are objects
say qr/foo/->object_id;

# Loading IO::Handle turns all filehandles into objects
use IO::Handle;
open my $fh, "foo/bar";
say $fh->object_id;


But OH GOD UNIVERSAL! Well, use at your own risk. It's handy to use in your own programs and private libraries. Or you can use Method::Lexical and apply it lexically.

Why not just use the object's reference address? Well, as people implementing inside-out objects discovered, they're not unique. They're not thread safe, and worse they're not even unique for the life of the process. Perl will reuse the reference of a destroyed object. Observe:

{
    package Foo;

    sub new {
        my $class = shift;
        return bless {}, $class;
    }
}

for (1..3) {
    my $obj = Foo->new;
    print "Object's reference is $obj\n";
}


Run that and you should get the same reference, three times, for three different objects.

And then there's the problem of string overloaded objects. You have to be careful to always use Scalar::Util::refaddr or overload::StrVal.
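To make the overloading trap concrete, here is a minimal sketch using only core modules. The Name class is hypothetical and exists only for this illustration; it overloads stringification so two distinct objects look identical as strings.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);
use overload ();

{
    package Name;    # hypothetical class, for illustration only
    use overload '""' => sub { $_[0]{name} }, fallback => 1;

    sub new {
        my ($class, $name) = @_;
        return bless { name => $name }, $class;
    }
}

my $x = Name->new("bob");
my $y = Name->new("bob");

# Both objects stringify to "bob", so comparing them as strings
# reports two different objects as the same.
print "string eq:  ", ($x eq $y ? "same" : "different"), "\n";

# refaddr() ignores the overload and compares the underlying addresses.
print "refaddr ==: ", (refaddr($x) == refaddr($y) ? "same" : "different"), "\n";

# overload::StrVal() returns the un-overloaded form, like Name=HASH(0x...).
print overload::StrVal($x), "\n";
```

Even with refaddr you still have the reuse problem shown earlier, since the address is only unique while the object is alive.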

It turns out inside-out objects have nearly the same problem, and 5.10.0 introduced field hashes to solve that. rjbs explains the pain of all this at slide 120 in his excellent 5.10 For People Who Aren't Totally Insane. You can read the gory details of field hashes but it comes down to this: in 5.10 you can A) get a process-unique, thread-safe identifier for an object and B) store it in a hash such that the entry gets destroyed when the object is destroyed. Perfect!

Because of this, if you look inside Object::ID you'll see there's not a lot to it. It makes a field hash to store the IDs in, a state variable to hold an ID counter, and then just accesses the field hash.

use Hash::Util::FieldHash qw(fieldhash);
fieldhash(my %IDs);

sub object_id {
    my $self = shift;

    state $last_id = "a";

    return $IDs{$self} //= ++$last_id;
}


No scary black magic (beyond what's inside fieldhash). It's so simple, which is why it works with everything.
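If you want to watch it work without installing anything, here is a self-contained sketch of the same idea as a plain function rather than an imported method. The class names Foo and Bar are placeholders; everything else is core Perl 5.10.

```perl
use strict;
use warnings;
use feature qw(say state);
use Hash::Util::FieldHash qw(fieldhash);

fieldhash(my %IDs);

# Same shape as the snippet above, written as a plain function.
sub object_id {
    my $self = shift;
    state $last_id = "a";
    return $IDs{$self} //= ++$last_id;
}

my $first  = bless {}, 'Foo';    # placeholder classes; any reference type works
my $second = bless [], 'Bar';

say object_id($first);     # "b": the counter starts at "a" and is pre-incremented
say object_id($first);     # "b" again: the ID is remembered in the field hash
say object_id($second);    # "c": a different object gets the next ID

say scalar keys %IDs;      # 2 entries while both objects are alive
undef $first;
say scalar keys %IDs;      # 1 entry: the field hash dropped the dead object's slot
```

Because the counter only ever moves forward, a new object never inherits the ID of a destroyed one, even when Perl hands it the same address.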

Now, I didn't come up with this implementation. I just laid out the requirements and Vincent Pit filled in the blanks. I was only vaguely aware of field hashes, Vincent made the connection. Thank you VPIT!

Practical applications? Honestly, I'm not sure. I needed it as a shortcut for expensive object equality checks in perl5i. Maybe some of the OO theorists out there can fill this part in. Let me know what you might use it for.

Possible extensions? Well... with some tweaking Object::ID can be used as a universal object registry. Not only can you ask "does the object associated with this ID still exist" but field hashes provide the ability to get the object associated with an ID. It would only work on objects that have had their ID asked of them, and thus registered with the field hash, but how else would you have the ID? Is this useful? Is this a security hole? I dunno, but it would be easy.
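A rough sketch of that registry idea, using the id, id_2obj and register functions that Hash::Util::FieldHash exports. Note the assumption: the ID here is the field hash's own identifier (the reference address), not the counter Object::ID hands out, so a real registry would need one more lookup table to go from object_id() to the object.

```perl
use strict;
use warnings;
use feature 'say';
use Hash::Util::FieldHash qw(id id_2obj register);

my $obj = bless {}, 'Foo';    # placeholder class name

register($obj);               # make the object known to id_2obj()
my $id = id($obj);            # the reference address, not Object::ID's counter

# While the object is alive, the ID leads back to it.
my $found = id_2obj($id);
say "found the same object" if defined $found && $found == $obj;

# Drop every reference so the object is destroyed, then ask again.
undef $found;
undef $obj;
say defined id_2obj($id) ? "still registered" : "gone";
```

Anything holding an ID can get the object back while it lives, which is exactly why the security question is worth asking.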


UNIVERSAL

rjbs on 2010-05-02T19:16:26

Perhaps you could add something like
Object::ID->object_id_for($object)

Or use Object::ID qw(obj_id);

These would both avoid UNIVERSAL and avoid needing to add *anything* to *any* class *ever*.

Every time someone puts something in UNIVERSAL, %%RND_BAD_THING%%.

Re:UNIVERSAL

schwern on 2010-05-02T19:55:48

All the objections associated with UNIVERSAL::isa($obj, $class) vs $obj->isa($class); come to mind. Why would you override object_id()? It's too early to say. Apparently we thought the same thing about isa() and can().

OTOH maybe someone might write their own object_id() method that does something different and you'll accidentally get that? Entirely possible, but it turns out to be highly improbable. A Google Code Search shows there's only a handful of object_id() methods out there (I cut out BioPerl, Moco and Pogo because they artificially inflate the count).

Anyhow, this sort of feather ruffling is why UNIVERSAL::Object::ID is in its own package.

FWIW you can already call object_id($obj).

Re:UNIVERSAL

rjbs on 2010-05-02T21:44:25

"FWIW you can already call object_id($obj)."

Oh, of course, because although it is advertised that you're importing it to be called via your package as a method, it can be imported in for use as a function. Hooray!

Re:UNIVERSAL

Aristotle on 2010-05-02T23:20:37

You can check in object_id_for whether the object provides the method and, if so, trampoline to it. Then you get to have it both ways.
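A minimal sketch of that trampoline, assuming a function-style object_id_for along the lines rjbs suggested. The function name, the Custom and Plain classes and the "custom-1" ID are all hypothetical; the fallback reuses the field hash counter from the post.

```perl
use strict;
use warnings;
use feature qw(say state);
use Scalar::Util qw(blessed);
use Hash::Util::FieldHash qw(fieldhash);

fieldhash(my %IDs);

# Hypothetical function-style interface, not Object::ID's actual API.
sub object_id_for {
    my $obj = shift;

    # Trampoline: a class that defines its own object_id() gets the final say.
    my $method = blessed($obj) ? $obj->can('object_id') : undef;
    return $obj->$method() if $method;

    # Otherwise fall back to the field hash counter.
    state $last_id = "a";
    return $IDs{$obj} //= ++$last_id;
}

{
    package Custom;    # hypothetical class with its own ID scheme
    sub new       { return bless {}, shift }
    sub object_id { return "custom-1" }
}

my $custom = Custom->new;
my $plain  = bless {}, 'Plain';    # hypothetical class with no object_id()

say object_id_for($custom);    # the class's own method answers
say object_id_for($plain);     # the fallback counter answers
```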

Re:UNIVERSAL

schwern on 2010-05-02T23:58:08

True, true.

Now that we've all had our OH GOD UNIVERSAL time, what do you think of the actual module?

Hash::FieldHash

Ron Savage on 2010-05-03T00:35:33

Hi

Why use the heavy-weight Hash::Util::FieldHash when the light-weight Hash::FieldHash is available?

Re:Hash::FieldHash

schwern on 2010-05-03T18:36:44

You're right, it is significantly faster. Over 2x faster. I'll do some more testing and switch it over.