A Growing Dislike for SQL-OOP Mappers

jk2addict on 2005-08-22T13:59:00

First, this isn't a statement about the quality of Class::DBI or DBIx::Class. They do what they're intended to do; well at that. I use Class::DBI and I'm starting to use DBIx::Class. If it weren't for sql->oop mappers, I never would have gotten this far this fast with Handel.

With that said, I'm really starting to dislike them. I think there's a dark side of them that no one talks about most of the time.

Yesterday I started a new quick and dirty photo gallery project under Catalyst. Before I even started anything web related, I wanted to get the core modules out of the way. For the point of this conversation, two of the main modules are Gallery and Photo; each tied to the galleries and photos tables in the database. One of the side goals was to be able to use Class::DBI or DBIx::Class as my sql abstraction layer. Through some use of base clases and some symbol table tom foolery, this actually worked without too much hassle.

But, here's where the dark side somes in: these mappers tend to have their own new/create subs and use them in their internals. That means that if you want you're own new/create sub to do things in addition to creating a new record, there's a good chance it just won't work right. Here's an example:

    package MyApp::DBI;
    use base 'Class::DBI';
    __PACKAGE__->connection(...);
    1;

    package MyApp::Gallery;
    use base 'MyApp::DBI';
    __PACKAGE__->table('galleries');
    __PACKAGE__->columns(...);

    sub create {
        my ($self, $ops) = @_;

        my $dir = $ops->{'directory'};

        if (-e $dir) {
            die 'directory exists';
        else {
            mkdir $directory;
            $self->find_or_create($data);
        };
       return;
    };
    1;

MyApp::Gallery->create will fail hard every time because the CDBI internals call create, but not the one you wish it would. In the case of Handel, I used new to create a new object (what a name for a constructor). A this point, It's just luck I didn't want a custom create() sub in my public API. More on public APIs later.

A similiar problem exists when using DBIx::Class and new:

    package MyApp::DBI;
    use base 'DBIx::Class';
    __PACKAGE__->connection(...);
    1;

    package MyApp::Gallery;
    use base 'MyApp::DBI';
    __PACKAGE__->table('galleries');
    __PACKAGE__->add_columns(...);

    sub new {
        my ($self, $ops) = @_;

        my $dir = $ops->{'directory'};

        if (-e $dir) {
            die 'directory exists';
        else {
            mkdir $directory;
            $self->find_or_create($data);
        };
       return;
    };
    1;

Since I'm trying to use either CDBI or DBIC, my problem is doubly compounded in terms of new/create. I would also have to tweak my new() to call insert() under DBIC, which breask the whole point of encapsulating the DBI layer.

So, here's my beef. These SQL->OOP mappers need to make their create/new subs private (_new, _create) in some way shape or form; yet still be subclassable. Breaking new() for the consumer is a sin in my book since it's the most common creation idiom out there. Using a mapper should NOT decrease my options in terms of what public subs I want in my API. The only thing I really need an sql->oop mapper to do is to provide methods to get/set fields, otherwise stay out of my way. There, I feel better.

Even if I weren't trying to use either/or dbi mapper, using one or the other with a custom new/create still could cause problems. There is a solution, but it means an extra layer of abstraction; sort of like the old n-tier approach under windows. MyApp::Gallery, MyApp::DBI::Gallery subclassing MyApp::DBI.

Let's look back to the CDBI example:

    package MyApp::DBI;
    use base 'Class::DBI';
    __PACKAGE__->connection(...);
    1;

    package MyApp::DBI::Gallery;
    use base 'MyApp::DBI';
    __PACKAGE__->table('galleries');
    __PACKAGE__->columns(...);
    1;

    package MyApp::Gallery;
    BEGIN {
        use 'MyApp::DBI::Gallery';

    };


    sub create {
        my ($self, $ops) = @_;

        my $dir = $ops->{'directory'};

        if (-e $dir) {
            die 'directory exists';
        else {
            mkdir $directory;
            return MyApp::DBI::Gallery->find_or_create($data);
        };
       return;
    };
    1;

Now I've seperate my Gallery object from it's DBI layer. No sub name collisions. No CDBI/DBIC internals calling my subs when they really want theirs. Yeah!

With some tweaking, you could rebless that new gallery dbi instance into the current package and map the column accessors into the current package. Either way, there's no longer a collision of ideas. Gallery->new and Gallery->create are always mine and mine alone. There are no internals anywhere accidentally calling my subs.

Of course, I could be wrong. :-)

Now for my second sql->oop mapper rant. They foster bad software design. Before you flame me, think about this:

    package MyApp::DBI;
    use base 'Class::DBI';
    __PACKAGE__->connection(...);
    1;

    package MyApp::SomeTable;
    use base 'MyApp::DBI';
    __PACKAGE__->table('sometable');
    __PACKAGE->has_many(tablelegs => 'MyApp::TableLegs');

    1;

Now, how does the consumer of your module add legs to a table? That's right, the CDBI specific method ->add_to_tablelegs({}). That's silly. What if I change the DB layer? You're stuck with a cruddy API, even if you map add_to_tablelegs to SomeOtherDBILayer::add_child_object(). MyApp::SomeTable should just have an add() method to abstract that out. How many people do that that use CDBI? Who knows. The same goes for find_or_create, searc_linke, and all the others. They're convience methods for the module writer, but should NEVER be exposed to the module consumer. The MyApp::Gallery, MyApp::DBI::Gallery ==> MyApp::DBI example above keeps the author and the consumers honest to a purposeful API.

more thoughts..

   package MyApp::Gallery;
   use MyApp::DBI::Gallery;

   sub new {
     my ($class, $opts) = @_;
     my $gallery = MyApp::DBI::Gallery->create($data);

     bless {gallery => $gallery), $class;
   }

   sub AUTOLOAD {
      my $self = shift;
      my $name = our $AUTOLOAD;

      if (UNIVERSAL::can($self->{gallery}, $name) {
          $self->{gallery}->$name(shift);
       };
   };

Has-a, not is-a

autarch on 2005-08-22T15:10:40

That's one of the main problems with those two modules, they force you into an is-a relationship. The right way to do it, IMO, is a has-a relationship. Your OO module, which provides your public API, should _have_ some sort of SQL-ish module to do the work.

Now that mapper can create whatever methods it wants, and you don't care. I don't think this has much to do with mappers in general as much as the ones you've chosen, which force you to subclass them.

Re:Has-a, not is-a

tmtm on 2005-08-22T20:34:08
I'm confused by this statement. Class::DBI doesn't force you into an is-a model at all. A Class::DBI class merely represents a database table. It can quite happily be the class that your nice pure OO model class 'has'. Class::DBI provides your "sort of SQL-ish module to do the work". It's not trying to be part of some nice OO system.

Tony

Re:Has-a, not is-a

autarch on 2005-08-22T21:59:53
What you say is true. It'd probably be good if the docs recommended this sort of usage, for exactly the reasons that the original journal entry mentioned.

But in that case most of your C::DBI classes would _just_ be the declarations, which isn't how I've ever seen anyone use it.

Use Method Dispatch!

chromatic on 2005-08-22T16:08:34

I agree with you and Dave about the fragile inheritance problem (and wonder if adding hook methods or migrating to a Class::Std-based object model would solve some of these problems.

However, one of your suggestions makes a similarly unwarranted assumption, specifically:

if (UNIVERSAL::can($self->{gallery}, $name) { $self->{gallery}->$name(shift); };

What if I want to overload can()? It's a method. Why shouldn't I be able to do that?

Re:Use Method Dispatch!

jk2addict on 2005-08-22T16:30:20
I'm not sure how to take or answer that question. So I should never use UNIVERSAL::can because you may want to override it? :-)

Of course, I didn't have to check the method I'm referenceing exists at all. It just seemed like a safety measure. But then again, if you override anything in UNIVERSAL, its effects are your problem, not the modules authors problem right?

Or, was this comment geared towards the UNIVERSAL::can vs. $object->can debate that flares up from time to time? :-)

Re:Use Method Dispatch!

chromatic on 2005-08-22T18:52:57

Yes, it's the great "Why in the world are you explicitly calling the parent implementation of a method on an instance of a known-derived class?" debate. Someday I'm going to write UNIVERSAL::new and write code that calls that directly instead of the actual constructors just to show how stupid a debate it is.

Re:Use Method Dispatch!

jk2addict on 2005-08-22T19:40:20
Easy killer. :-)

I'll admit, I'm a tard when it comes to that debate. I've heard mutterings and seen people use can both ways. In an effort to make me less stupid, care to elaborate on the difference between the two in the context of that debate?

Re:Use Method Dispatch!

chromatic on 2005-08-22T20:20:19

can() is an object method. There's a default implementation in the UNIVERSAL base class so that classes that don't need to define their own implementations can inherit it.

Classes for which the default implementation is unsuitable can override it with their own implementations -- and there are good reasons for doing so, such as if you have a proxying or delegation relationship you want to keep transparent, if you autogenerate methods you can't know at compile time and don't necessarily want to install them in the symbol table before other code calls them, and so on. As long as the overridden method provides the same interface as UNIVERSAL::can(), the location of the implementation and the mechanics of the implementation are transparent to the caller. That's one important point of OO.

By calling can() as a method on an invocant, everything is fine. This will dispatch the right method regardless of whether the class has inherited the method from UNIVERSAL, from a parent somewhere, or has implemented its own. A class that has no special needs for this method will always answer the question "Do you respond to a method of this name?" appropriately. So will a class that needs to do different work to answer the question correctly.

Calling UNIVERSAL::can() as a function on an invocant explicitly forbids the invocant's class from overriding can(), even if it makes sense for that class to do so -- even if the class has already done so! This will sometimes answer the question incorrectly and sometimes will break perfectly legitimate code written by other people.

The other position in the debate is basically "So what? I don't care if code breaks", which I completely fail to understand and consider myself unqualified to summarize in more detail.

Re:Use Method Dispatch!

jk2addict on 2005-08-22T20:28:38
Thanks for that. Now I get it.

Would it be safe to assume that I should always use ->can against other classes, but if I was overriding and making my own can() sub, that the guts of my new and improved can() should use UNIVERSAL::can; or should it use $self->SUPER::can?

Re:Use Method Dispatch!

chromatic on 2005-08-22T21:37:22

It's a normal method, so treat it as you would handle any inherited method. If you use SUPER, the implementation in the parent can change but the child can continue to work without change.

Re:Use Method Dispatch!

rjbs on 2005-08-23T12:02:52
Using UNIVERSAL::can() as a function is unfair to objects, which might implement their own can method. Using the can method, however, is a real pain, because one can't just say:

if ($scalar->can("explode")) {

because the program will die unless $scalar is an object or package name. If it's an unblessed reference or a simple scalar, calling a method throws an exception, so one ends up writing:

if (ref $scalar and blessed($scalar) and $scalar->can("explode")) {

which is just a lot of typing, and we know how Perl progammers feel about lots of typing. Maybe Scalar::Util needs to provide able($scalar, $method) that does the above.

Re:Use Method Dispatch!

Matts on 2005-08-23T12:36:52
The first ref($scalar) there is redundant, so it's a *bit* less typing ;-)

Re:Use Method Dispatch!

rjbs on 2005-08-23T13:03:55
You're right, this is my fault for waffling. I was going to say:

if (ref $scalar and (not grep { ref $scalar } @std_types) and $scalar->can('explode')) {

but then I decided that it was fine to assume, for the example, that Scalar::Util was available, and that I was being overly snarky.

By the way, note the use of UNIVERSAL::isa in perldoc -f ref.

Re:Use Method Dispatch!

chromatic on 2005-08-23T15:37:59

Don't worry; Rafael applied my patch to perlfunc a little while ago.

Re:Use Method Dispatch!

malte on 2005-08-30T11:19:40

able is a good idea. Plus we need an option for Scalar::Util:
use Scalar::Util export => "able", to_namespace => "Customer::ProjectName"

Re:Use Method Dispatch!

btilly on 2005-08-25T01:55:39
I think that you're unfairly summarizing the other position in the debate.

The other position as I see it can be summarized as, I think it more likely that some idiot who doesn't know about UNIVERSAL::can will write a can method that does the wrong thing, than that some smart guy will write a can method that works where UNIVERSAL::can would have failed.

Which is a vote of non-confidence in others, not a desire to see things break.

Re:Use Method Dispatch!

chromatic on 2005-08-25T17:39:31

I can only see that argument as "Because someone else might write broken code, I'll deliberately write broken code."

Re:Use Method Dispatch!

btilly on 2005-08-28T02:59:21
Just because the argument looks silly doesn't mean that it is wrong.

See this thread where multiple good programmers, having just been told what some of the problems are trying to mix AUTOLOAD and can, tried to write a can routine and got it wrong.

If good programmers who have just been told some of the big pitfalls can't do it, then why should I expect that anyone else will get it right?

The problem isn't with the maintainance programmer. It is with how a couple of independent features do not combine in any reasonable way.

Re:Use Method Dispatch!

chromatic on 2005-08-28T08:26:16

If someone needs to override can() and does it incorrectly, calling UNIVERSAL::can() will give the wrong answer. You can't fix that without fixing their code. That's their bug.

If someone needs to override can() and does it correctly, calling UNIVERSAL::can() will still give the wrong answer. How is that possibly good? What possible legitimate reason is there for recommending it? This technique has no chance of working correctly in the face of an overridden can()! That's your bug.

For the record, I'm not talking about a multiple inheritance situation here. I know I could not easily write code that behaved correctly with important AUTOLOAD() methods in multiple parents.

Re:Use Method Dispatch!

btilly on 2005-08-28T10:26:21
I think that I said it fairly well here. From my point of view, can is not to be considered fully trustworthy or reliable. Since I don't consider it reliable, I do not rely on it, and I don't waste much energy worrying about whether I'm breaking it.

By contrast you view it as something that is documented and supposed to work. If someone breaks it, you expect them to fix it. And since you expect people to fix it, you get upset if someone else bypasses that fix.

To me that seems like a lot of stress and effort to go through to try to reintroduce a level of certainty that I start with by assuming that can is unreliable.

I know that my position upsets you. But from my point of view, you're like the people who want to use return codes for errors who complain that people don't get the fact that if they are just careful enough, that this is as reliable as anything else. I know that I'm not that careful, nor do I think that I can reasonably expect others to be. So I'll pick a technique that I know works. (In the case of errors, exceptions. In the case of can, other than in toy code, I've never actually needed it.)

Sick

jk2addict on 2005-08-22T17:29:24

Here's the really sick part about the MyApp::Gallery, MyApp::DBI::Gallery ==> MyApp::DBI example above...

I could use Class::DBI, DBIx::Class, Rose::DB::Object, or my own raw SQL. I can change DBI layers at will and not have to fight my MyApp::Gallery class or whatever is-a relationship it would've had normally.

There's nothing new about this approach. True n-tier development does that forever. But in the lasd of "just use base 'Class::DBI' in your class and go", it's somewhat interesting.

Get out of the way

tmtm on 2005-08-22T20:39:26

The only thing I really need an sql->oop mapper to do is to provide methods to get/set fields, otherwise stay out of my way.

You really don't want to be using Class:DBI, then...

Tony