Catalystic conversion - part 5

autarch on 2007-09-01T04:15:07

So I've been hacking away happily with Catalyst on my VegGuide.Org 3.0 project for quite some time, and I'm still liking Catalyst, which is a good thing ;)

When I first started this project, I also looked at Jifty. It looked neat, but ...

Jifty, like Rails, is opinionated software. It has opinions about what ORM you should use, as well as what templating system you should use. It's ORM has opinions about your schema design, much like Rails (ugh, id is the worst column name evah!).

There's nothing wrong with opinionated software. I'm certainly opinionated myself. The problem with opinionated software is that it's really only appropriate for writing an app from scratch.

The main focus of my VegGuide 3.0 project is to completely revamp the UI of the site, and improve the user experience. I'm also aiming to make the system as RESTful as possible. Along the way, I'm slipping in some features like Google Maps integration, but mostly it's about UI & REST.

The existing VegGuide code uses Alzabo as its ORM. I like Alzabo, since it mostly does exactly what I want, though I'm not sure I'd recommend it to others for new projects these days. Nonetheless, I don't want to switch to another ORM as part of this project.

All this gets back to the opinionated software thing. I could've tried to integrate my Alzabo-using classes with Jifty, but it would've been a huge (perhaps monumental) amount of work. Catalyst, OTOH, has very few expectations. In fact, the core Catalyst code has no expectations of a model, you don't even have to have one!

This has made revamping the controller and view (UI) pieces of my app much easier, because I can reuse my existing model code without any changes to accomodate my framework. In a few cases, I've had to make some custom bridge code to make plugins (Session, Authz, etc) work with my model, but this has really been trivial to implement.

With Catalyst, you can convert an existing app to Catalyst without rewriting every aspect of the app in one fell swoop. That is a huge plus for Catalyst in my mind, since 100% rewrites are rarely possible. This may in fact be the number one selling point of Catalyst. Better code for old apps!

"id" fields

Ovid on 2007-09-01T09:44:16

Why do you say that "id" is the worst column name ever? While I know that some people have issues with surrogate keys, I find them to be very useful so long as people understand that the removal of surrogate keys should still leave sets of records instead of bags (of course, this also means that people should understand that "real" primary keys are not the same as surrogates, but the latter frequently makes working with our current crop of non-relational databases much easier).

That being said, I know you're savvy enough with databases that you are probably seeing something I'm not and I'm keen to hear your viewpoint.

Re:"id" fields

sigzero on 2007-09-01T13:17:55
I don't like "id" either but I understand why it is used.
Re:"id" fields

autarch on 2007-09-01T15:22:10
The reason I dislike name "id" for a primary key has nothing to do with being against surrogate keys. Surrogate keys are not ideal, but I use them often. Sometimes I do so because I simply cannot expect to collect any unique data for each entry, sometimes they just make my life simpler because SQL sucks.

The problem with naming your primary key "id" is twofold. One, when you want to refer to that key in another table, you have to use a different name. For example, you might have User.id in one table and Message.user_id in another. I just don't like this, as it's more confusing than necessary.

More importantly, having multiple columns with the same name across your schema implies that these columns are somehow the same type of thing (User.id, Message.id, Group.id), but they are different types! In most schemas, you cannot reasonably compare User.id and Message.id and make any conclusion based on that comparison. The name is basically part of a column's type, and things which are different should have different names.

As an aside, if modern DBMSs made defining custom types easier, I'd want to create a new type for every surrogate key, deriving from Integers, but which only allows the "==" operator (no math, no ""). These surrogate keys may look like integers, but from a type perspective they're not numbers. Having a different type for user_id and message_id would prevent you from comparing them, which could catch typos. Of course, since I name these columns clearly, I'm unlikely to make that sort of type, whereas if they were all named "id" that mistake becomes much more likely ;)

Re:"id" fields

Ovid on 2007-09-01T17:16:04

I completely agree that every id should be a different type and thus not comparable, but if you have SQL using two tables with id fields, you have to fully qualify them if you reference them or else the ambiguity is a compile-time error. As a result, you can at least see that comparing customers.id and orders.id doesn't make sense.

The type problem, of course, is more serious than this. Someone not being careful might not notice that customer.age should not be comparable to customer.weight, but in the case of an id field, it should be a more glaring error. What would be an appropriate way of handling with within the confines of SQL limitations? I certainly wouldn't name something customers.cust_id as that's redundant (it's even more problematic if you must rename the table as then you probably should rename the key and thus increase the maintenance nightmare). From my experience, it's only non-id fields where the comparison mistake is more likely to occur.

Re:"id" fields

autarch on 2007-09-01T21:30:55
I would definitely name the surrogate key in a Customer table customer_id. It's not really the case that you alyays have the table name in front of the column name, and I like the clarity. Similarly, if I have a customer object I'd rather type $customer->customer_id() than $customer->id(). Again, it distinguishes it from other types of ids that the table (or object) may have, and this way the column names are always the same, so the Order and Customer tables both have a customer_id column.

As far as types go, with Postgres, it's actually pretty easy to make your own types based off the built-in types. Unfortunately, you can only add constraints on top of the built-ins, you can't remove operators, so you could make a customer_id domain but it'd still compare to other domains based off integers.

To make a real type you have to code it in some other language, which is incredibly lame. On the cool side, with Postgres that other language could be Perl, Python, or anything else which plugs into Postgres, not just C. I've never done this but I'd like to experiment with it. I wonder what sort of speed hit there'd be though.

Re:`id` fields

Aristotle on 2007-09-02T11:14:11

it’s even more problematic if you must rename the table as then you probably should rename the key and thus increase the maintenance nightmare

That argument doesn’t really hold up, though. No matter how you turn it, renaming tables is a maintenance nightmare anyway: you will already have to rename all the FKs to the table that’s being renamed.

Re:`id` fields

Aristotle on 2007-09-02T11:09:59

For example, you might have User.id in one table and Message.user_id in another. I just don’t like this, as it’s more confusing than necessary.

Personally I tend to use id for the surrogate key and then $adjective$table for foreign key columns. So in your case I’d have user.id for the PK and message.authoruser for the FK. I pretty much stumbled onto this scheme by accident, but it has stuck since – I found that having a uniform way of expressing not only the existence of a relationship between tables but its nature as well makes my schemata more self-documenting.

Woops

Aristotle on 2007-09-02T11:17:02

Please disregard this comment, there was a markup error in it that I didn’t spot in the preview.

Re:`id` fields

Aristotle on 2007-09-02T11:11:24

For example, you might have User.id in one table and Message.user_id in another. I just don’t like this, as it’s more confusing than necessary.

Personally I tend to use id for the surrogate key and then $adjective_$table for foreign key columns. So in your case I’d have user.id for the PK and message.author_user for the FK. I pretty much stumbled onto this scheme by accident, but it has stuck since – I found that having a uniform way of expressing not only the existence of a relationship between tables but its nature as well makes my schemata more self-documenting.

Apple and Oranges

sigzero on 2007-09-01T14:39:06

I would point out the Jift and Catalyst have differing philosophies so it makes it hard to compare them. You can say well they both have ORM capabilities but they have those capabilities for different reasons and from different point of views.

Re:Apple and Oranges

autarch on 2007-09-01T15:23:00
Well, that was more or less my point. But I don't think they're apples and oranges from most people's perspectives. They're still both frameworks for making web apps.

Re:Apple and Oranges

sigzero on 2007-09-02T00:02:22
I don't agree. I know they are both "frameworks" but then I have to look deeper and see which one is closer to my philosophy to use.

Re:Apple and Oranges

autarch on 2007-09-02T03:27:34
Well, every framework is different, and you have to look more deeply at each one to figure out how they differ. Nonetheless, Jifty and Catalyst are more like each other than, say, GNOME, and if you're looking to build a webapp, or move an old one to a more modern technology, you'd be better off looking at Jifty and Catalyst (and Rails, Pylons, Django, etc) than GNOME.

Re:Apple and Oranges

sigzero on 2007-09-02T17:22:05
lol

I can't argue with that... : )