Are Object-Relational Mappers Worth It?

Ovid on 2008-06-03T09:28:22

I know this has been a long-standing area of debate and has been discussed to death, but I just have to get this out of my system since it's bugging the heck out of me.

We use DBIx::Class on our project. It's a great ORM. For my last job, I chose Rose::DB because I needed the performance. I've recently heard that they're ripping it out and replacing it with DBIx::Class -- for performance reasons. Allegedly all of the hacking I had to do to get classes to share database handles (it doesn't by default), support transactions and force all attributes to be private by default slowed it down to the point where it offered no speed advantage. I never saw the benchmarks, but I put a lot of work into ensuring that Rose fit our company standards and now other programmers are putting a lot of work into replacing it. So that's lots of work on things which are only indirectly productive.

I should add that while I have been generally positive about ORMs in the past, I keep seeing the difference between theory and reality. Most projects I've worked on which have heavy ORM usage keep fighting against the ORM, but few question whether or not it's the ORM itself which is the issue. When thinking about ORMs, the end goals we should focus on are:

  1. Improving maintainability.
  2. Improving performance.

For many projects, I think the ORM inhibits achieving either of those goals.

Problems:

  • The object-relational impedance mismatch is a well-known problem. Regardless of what hoops we jump through, this mismatch will continue to exist.
  • Objects, as agents of responsibility, should be correct, complete, but minimal (see this Dr. Dobbs article. It has some issues, but highlights the problem). However, ORMs typically try to do more than just provide an OO wrapper for the database, they also try to be a domain layer, thus being far from minimal and raising complexity. This also results in a lot of duplication of responsibilities (oftimes, the same business constraints are in the database, the ORM layer and in the calling code. Sheesh.)
  • On Web sites, how many times have you seen a page with relatively simple data needs still issue quite a number of queries to the database because of how ORMs work? We've a reporting system where one simple table generates over 300 separate database queries. In looking at the code (which I confess to writing), there are extremely complicated if/else bits of logic that are there because the domain knowledge is split between the application and the database. Other programmers looking at this code haven't figured a clean way out, but this not uncommon for ORMs.
  • Typically, for complex queries, I see developers write the SQL and laboriously translate it to the ORM code. Regrettably, this translation often involves breaking out multiple queries because while what we often want back is data for reporting, the ORM frequently wants us to assemble a bunch of objects and then extract the data we need -- a useless and expensive intermediate step.

The problem, of course, is how to deal with these issues. One common strategy is to push the domain logic into the database. This is very powerful because it leverages the database's strength of data management and makes it hard for a developer to forget about business rules. Unfortunately, MySQL severely limits our ability to do that. For example, we cannot define custom types, check constraints are not available and triggers are extremely limited.

Another strategy is to develop a solid domain layer which truly represents our business processes. Underneath can be a variety of strategies such as SQL dictionaries, limited ORMs with direct database calls or anything else we think of. The key is to have that domain layer which insulates us from the implementation and gives us the freedom to focus on business needs rather than picayune details of how to write a left outer join with a subselect in our ORM of choice.

The question, at its core, is whether or not ORMs offer the maximum return on investment. While the answer isn't the same for everyone, I get tired of watching projects scale upwards only to discover programmers constantly working around ORM issues.

A large part of the problem, however, may not directly be the ORMs. It may be the affordances they provide. For example, Mason is a great template system. It's very rich. It's very powerful. Every company I've worked with which uses it pushes far too much business logic into their templates. This is because it's a simple and natural thing to do with Mason. Template Toolkit, on the other hand, starts with Perl and passes data to the templates. While you can push too much logic to the template layer, it's less natural and, from what I've seen, less common.

ORMs may be the same. While extremely powerful, they may make it so easy for developers to operate in a certain way that they fail to consider alternatives.


Good

Matts on 2008-06-03T12:57:59

I'm glad to see smart people coming around to these realisations. The simple fact of the matter is that DBI makes ORMs in perl not as valuable as they are in other languages. It's a very feature rich database layer that is very capable of providing you with results in an ORM-like fashion.

Now if hiding SQL is your thing, then fair enough - but I would suggest you (readers of this blog, not Ovid) learn SQL and embrace it, much like you learned regexps and embraced them too. And use good practices (e.g. MVC) to make sure you don't have SQL in your presentation layer.

Re:Good

hex on 2008-06-08T20:50:19

If I ever got mod points here (and I haven't once since signing up) you'd definitely get an Insightful for that comment.

It's the shooter, not the gun

Mutant321 on 2008-06-03T13:08:22

I think you've summed up the issues perfectly. ORMs are extremely useful, powerful tools in the right hands. Unfortunately, not many (myself included) have worked out how to use them effectively.

A typical use of DBIx::Class is as the Model in a Catalyst app. I've begun to think - as you point out - that it's too low level to be the model. The model is business logic, not data access. But I'm not sure DBIx::Class is designed with this in mind, and I don't think I could write that layer myself (and do a good job).

I really like the concept of ORMs, I don't want to have to write SQL for simple stuff, which is often just cut and pasting. It's the right level of abstraction to my mind. But it feels like we need one more generation to arise before they become truly indespensible. (Not to take anything away from those working on DBIx::Class and others...)

Re:It's the shooter, not the gun

chromatic on 2008-06-03T17:58:17

A typical use of DBIx::Class is as the Model in a Catalyst app. I've begun to think - as you point out - that it's too low level to be the model.

This is a major problem with many implementations of MVC. It's not specific to DBIx::Class.

Re:It's the shooter, not the gun

Mutant321 on 2008-06-03T19:12:29

This is a major problem with many implementations of MVC. It's not specific to DBIx::Class.

I wasn't trying to imply there is anything wrong with DBIx::Class specifically. I use it frequently, and am deeply grateful to those who contribute to it.

And yes, the problem lies partly with the MVC frameworks, but also with the ORMs. And I don't pretend to have any idea how to solve it (although I'd be interested to know whether any other languages are doing a better job).

Re:It's the shooter, not the gun

Qiang on 2008-06-04T11:31:33

A typical use of DBIx::Class is as the Model in a Catalyst app. I've begun to think - as you point out - that it's too low level to be the model.
i usually have Controller that calls the model, then calls the data store which is the DBIC part. the business logic lies in the model and i can swap the data store with other ORMs without touching anywhere else. i am pretty happy with this approach. there is a MVC discussion on DBIC mailing list that sums this up. ways to do stuff and why

Right on.

markjugg on 2008-06-03T16:02:48

Thanks for the excellent post, Ovid.

As a professional Perl programmer of 10 years, I have evaluated several ORMs but not adopted one on the grounds that the complexity they add is not worth the benefit.

Instead, I have the learned the DBI API very well, and use it combination with a light-weight SQL generation tool, SQL::Interp. The combination has proven to be efficient to develop and debug, from small scripts to large applications.

Perhaps there are times when an ORM would save our team some moments, but I strongly suspect there are other times where they would cost us hours.

Re:Right on.

spacebat on 2008-06-04T12:03:24

I find DBI a bit painful - its good and all, but leaves me diving into the manual too often.

DBIx::Simple and SQL::Interp together hit the sweet spot for me. Dead simple query generation and result handling, but the DBI connection handle is just a function call away.

Re:Right on.

hex on 2008-06-08T20:54:22

Agreed. DBIx::Simple should always be the first thing shown to people who've never done anything with a database in Perl before. In my case, I've never needed anything else.

If I had a hammer…

siracusa on 2008-06-03T17:18:04

The amount of repetitive, error-prone code saved by an ORM is well worth the cognitive and performance overhead, IME. But like any abstraction, ORMs have their limits, in both directions. Developers have to know when to put down the hammer and pick up a different tool because not everything is a nail. In the case of ORMs, that fact that everything looks so regular and friendly lulls developers into forgetting the real (and usually ugly) database lurking under the covers.

Experience teaches you when "back-solving" from custom-written SQL into the ORM equivalent is not worthwhile, when stored procedures are a better choice than issuing client-side queries, when you must bite the bullet and use your db vendor's crazy proprietary mechanism for a certain task, when you shouldn't use a db at all, and so on. But none of these things is an argument against ORMs in general. The only cure for their over-use in inappropriate situations is education and experience.

Your example is a good one: most of us have learned the hard way the pitfalls of putting unrelated logic in templates. Today, Mason and other templating systems still allow this, but experienced developers have learned the limits of what's wise. The same could be said of Perl itself, where nearly anything is possible, but not everything is advisable.

Why object-relational mappers are appealing

Aristotle on 2008-06-03T21:15:15

Because writing code that generates code sucks. A good ORM will give you an API to compose queries. It’s the difference between eval and using closures, roughly.

I use DBIx::Simple and SQL::Interp to make SQL generation suck less, but it fundamentally can’t be made to not suck at all. And you still need to some sort of OOD for the layer that encapsulates your model, unless you’re content with a random collection of 300 SQL query wrapper methods in a single class.

There are two major wins I see in using a good ORM: being able to represent queries as first-class language entities that can be manipulated (and thus assembled piecemeal by different parts of the code, and munging the data returned from the database to turn columns retrieved via joins into nested data structures. These two concerns give rise to most of the monkey code I end up writing when I go the hand-written SQL generation route. DBIx::Class happens to get closer to the ideal than any other ORM I’ve seen.

the ORM frequently wants us to assemble a bunch of objects and then extract the data we need – a useless and expensive intermediate step.

You know about HashRefInflator, right?

Sweet spot

dennisdjensen on 2008-06-03T22:36:54

I and my colleagues have encountered all of the issues and some more with O/R-mapping and tools as you've described above.

We still haven't found the sweet spot.

Sometimes O/R-mapping is a big winner and a really productivity booster as well as a good abstraction. At other times it just blows up in your face, and a few well chosen SQL-statements with suitable interpolation does it all. Sometimes integrating the O/R-tool with the domain objects works better, at other times it works better when we treat it as a separate layer. The business logic can be anywhere! Sometimes it is most helpful, even necessary, to put all of the business logic (as much as possible) into the database. This requires good knowledge of not only SQL, but relational theory (C.J. Date and friends). Again, many possibilities. At other times it is hardly worth it to put too many constraints upon the data, where application logic and business logic melts together, and any constraint comes from the application itself (minimum databases, in-memory, filesystems, pure blocks of words).

I think Martin Fowler wrote a little about this at some time, but I cannot find it again. He mentioned the different styles a project would go through as it matured or just grew. From simple embedded SQL to full blown expression builders to meta and behind-the-scenes automagic stuff (think gemstone here).

Everything has its price. Like so many things, it is a trade-off.

In my personal, and some of my colleague's, experience (depending very much upon how good at SQL and data theory they are!), O/R-mapping are seldom good at anything but prototyping or a kicker to get started quickly on a new project.

Depends on what you're trying to do

btilly on 2008-06-04T06:36:30

Are you developing a fairly straightforward CRUD application? If so then an ORM does what you need it to do in a clean way and lets you get on with your life. What's not to like?

Are you trying to abstract away your database? Stop right there. I'd lay odds that you're making a mistake of some sort.

Are you trying to do complex reporting? If so then an ORM is the wrong tool. If you're doing complex reporting, you want the full power of SQL. More than that, you want to have access to all of those special features your database supports that may not be widespread. If you're using postgres, their implementation of temp tables is wonderful. If you're using Oracle you get useless temp tables, but you should definitely learn your analytic queries. And so on.

Personally I do a lot of reporting so I don't use an ORM for any of that. But when I wanted to add tracking of who called what report with what parameters and how long it took, I immediately reached for Rose::DB. Which is only used for recording that information.

see the 2006 discussion in the Catalyst list

dami on 2008-06-05T06:21:28

The subject was discussed in 2006 on the Catalyst list : you may find interesting arguments there :
http://lists.scsys.co.uk/pipermail/catalyst/2006-June/008059.html