Anyone who has worked with databases long enough knows that the relational model does not map well to the real world. Further, the necessity of database normalization is not something inherent to the data; it's mostly (not entirely) an artifact of the relational model. For example, how do you store customer addresses? Two database might store it completely different ways yet both be normalized. The common example is to do something like this pseudo-code:
create table customer { id ... first_name ... address ... city_id ... state_id ... etc ... )
Now let's say that you're having a sale and you want to send fliers to all of your customers saying "We're having a sale. We hate you and we don't want you to show up." Getting their addresses is easy.
Now business two is also having a sale but they want to send out two sets of fliers. The first set tells their customers that they hate them and don't want them to come and the second set is to all addresses for which they don't have a customer. Now they have problems if they're using the above table. They have to have a separate addresses table and the customer table looks like this:
create table customer { id ... first_name ... address_id ... etc ...
The problem here is that business knowledge is required to properly know how to normalize the database. For the second business, the address is not dependent on the customer, unlike the first business. In other words, the database reflects domain knowledge but you, the poor fool having to maintain it, can't tell from looking at a database schema if it's really reflecting business knowledge or if it's a mistake.
This is sort of silly. For the first business the addresses exist in the real world whether or not they have customers there. Further, their database is junk if a customer has a summer home and a winter home. Even then they might be facing performance problems and have gone with a deliberate decision to denormalize parts of the database. Despite what "purists" argue, sometimes this has to be done (my last company had an Oracle performance expert come in and finally admit that he couldn't speed up their billion row table which relied on aggregated custom data types).
Object databases and multivalue databases can get around many of these problems by allowing data to be arranged in a heirarchical fashion instead of a relational one (though they may use a relational database under the hood). By doing this, if properly designed, they trade some performance for better real world modeling. But there's still a problem. Consider a table where we store what a customer likes (more pseudo-code).
create table customers ( id ... customer_name ... likes (multivalue, fk likes.id) ... ) create table likes ( id ... thing ... )
With that sort of multivalue relationship, we can fetch everything John likes very easily:
SELECT @likes FROM customers WHERE customer_name = 'John'
We can also find out everyone who happens to like something John likes:
SELECT @likes FROM customers WHERE customer_name != 'John' AND @likes IN ( SELECT @likes FROM customers WHERE customer_name = 'John' )
So far, so good. We've gotten a decent amount of power from violating first normal form. However, there's still a problem. These databases (from what I've seen) usually constrain relations on type, not capabilities. How do we say that John "likes" another customer? There's a very good chance we can't do that. The likes table might have a list of customer ids or it might have a list of product ids, sport ids, etc. Because these technologies seem to restrict things tightly based upon types (and it's tough to introduce an FK constraint on multiple tables on the fly), we still can't model the real world terribly well.
I suppose one could actually get something like this to work by reaching into the system tables or something, but it would be difficult. Further, if we tried to do something like this, I suspect that the slow object or multivalue databases would be even slower. Moving one step closer to reality is one step further from an application that responds.
I am stuck with an Oracle database that I can't change, has spaces filling up every table regardless of what is actually in the table and everything is a CHAR.
Re:I wouldn't know...
sigzero on 2005-10-21T02:55:55
Oh, I forgot to meantion the good part. I am really good at using "to_date" and "rtrim".
I forgot to mention this though I don't know how I forgot.
All tables (all 1500 of them); every field is NOT NULL. Every field.
Anyone who has worked with software long enough knows that the relational model does not map well to the real world.
Then pick any of the following:
s/relational/procedural/
s/relational/object oriented/
s/relational/functional/
s/relation/logical/
s/relation/aspect oriented/
s/relational/hierarchical/
Continuing this list is left as an exercise for the reader
Models are not the real world. They're an abstraction chosen to make a particular problem domain easier to manipulate. Sometimes people pick the wrong model. Sometimes the problem domain changes.
There's nothing that's particularly relational about the issue about whether addresses are separate "things" from customers. I've seen exactly the same issue with OO code.
In other words, the database reflects domain knowledge but you, the poor fool having to maintain it, can't tell from looking at a database schema if it's really reflecting business knowledge or if it's a mistake.
And I can't tell just by looking at a object class hierarchy whether a particularly odd looking arrangement of classes is a reflection of business knowledge or if it's a mistake.
Being normalised doesn't say anything about the fitness of a relational model for a particular problem domain any more than the Law of Demeter says anything about the fitness of an OO model for a particular problem domain.
Relational models that are normalised have nice properties that can be useful. OO models that follow the Law of Demeter have nice properties that can be useful. They're talking about how good the model is. I don't think it's really fair to ask them to say things about how well a model fits a problem domain too!
Which of course would be a wonderful thing to know.... wish I knew how to figure it out...
Object databases and multivalue databases can get around many of these problems by allowing data to be arranged in a heirarchical fashion instead of a relational one
I'm curious why you think heirarchies and relation models are contradictory (nested set tree model and all that...) ?