SQL wisdom: don't mix datatypes in joins

jplindstrom on 2005-04-01T12:02:55

Sybase allows a foreign key between two tables to be of almost-the-same datatype. So table a's PK can be NUMERIC(10,0) and table b's column a_id can be INT.

While that may sound practical in certain circumstances, it's a potential performance killer when doing joins on those tables because of the implicit casting.

We just discovered that this was the cause of a major slowdown in a frequent query. Fixing it, the query runtime went from 1.7s to 13ms. Yay!

For comparison, Oracle requires exactly the same datatype on both sides of a FK relationship. At first I thought this was overly fascist, but now it seems like a good thing.

So since Sybase allows potentially stupid mistakes like this I'm inclined to implement a sanity test for this in our build tool so it doesn't happen again. That and code reviews, at least for code from external consultants.

I miss 'domains' in RDBMs

htoug on 2005-04-01T14:27:21

Back in the real good old days when I started to work with RDBMses ('86 to be precise) I cut my teeth on Digitals RDB database (later bought by Oracle as part of the great DEC sell-out).

It supported domains, ie named datatypes. As part of the database design you would create a domain for every distinct datatype. Your columns would not be of type 'int' or 'numeric(20,2)' but 'a_identifier' or 'monthly_payment'.

Joins and other comparisons were nearly always between columns with the same domain - if not someone would wonder why.

It also helped to keep things sane, by keeping column validation code common for all columns with the same domain - eg if 'a-id' is to be an even number then you are almost certain to forget it somewhere if you have to have the constraint on every column and not in a central place - especially when you later have to alter it.

We were so happy with the idea that when we later made a system using Ingres we implemented an active datadictionary on top of the Ingres metadata just to be able to use domains. All changes are made in the datadictionary and the 'rolled onto' the database(s) using a perl-script.