Weird Normalization Issues

Ovid on 2003-01-23T23:59:30

Creating a PostGresSQL Database and winding up with some unusual issues and I'm curious who else might have experience with this.

I'm building an inventory management system and we have three tables that I'm trying to figure out how to handle qw(customer country state). A country might have states, it might not. For a given customer's address, if their country does not have states/provinces (Monaco, perhaps?), what do I do? I can't insert a NULL value on a field with a foreign key constraint. I suppose I can create a blank field in the state table and reference that, but then I'm treating this field as "special" and I'm concerned that I'll need to hardcode that ID into my application. How is this usually handled where either one needs to have an FK constraint or NULL?

I've also realized that for a given customer, there's no way to stop the database from having a state of "California" and a country of "France." The interface doesn't allow this to occur, but that doesn't guarantee that future bugs won't occur. I suspect a mistake in my database design, but for the life of me, I can't figure out what I might have done wrong. Or it's possible that this is simply a situation that has to be handled on the application level :(


Foreign Keys

runrig on 2003-01-24T01:07:45

Depending on how the constraint is set up in the create table clause, foreign keys can be null. They need to exist in the other table if they are non-null. See the create table statement in the PostgreSQL docs[postgresql.org].

Re:Foreign Keys

Ovid on 2003-01-24T01:13:34

From what I can figure out, I apparently had NOT NULL on my column definition. That threw me off.

What I'm probably going to do is pull the reference to the country out of the customer table and use the state to look it up. However, this means we can't have NULL states, so I'll have to create "blank" states that refer back to the country. It's ugly, but this seems to be the cleanest way.

Re:Foreign Keys

runrig on 2003-01-24T01:22:44

Looking again at the docs, 'MATCH PARTIAL' is not implemented yet, so you still have your California/France problem. Otherwise you would be able to have one constraint on country and another on a country/state combination.

So broken but ...

autarch on 2003-01-24T01:21:42

Postgres, like all SQL DBMS's, is very broken, but here's the "perfect world" version (where Postgres is actually a real relational DB. haha). You can probably extrapolate to an approximate real world version from here ;)
Customer
  -----------
  customer_id   integer,
  name          text,
  country_id    integer,
  state_id      integer,
Actually, those integers should be separate data types, but Postgres doesn't really support that. Moving on:
Country
  -----------
  country_id    integer,
  name          text
 
  State
  -----------
  state_id      integer,
  name          text
 
  ValidCountryState
  -----------
  country_id    integer,
  state_id      integer
Now let's assume that you have a state entry named "" (0-length string). This will be the valid state for all countries without states.

Then you need a constraint along the lines of
EXISTS
( SELECT country_id
  FROM ValidCountryState AS VCS, Customer AS C
  WHERE VCS.country_id = C.country_id
    AND VCS.state_id = C.state_id )
Of course, Postgres doesn't actually support database-wide constraints. I think you may be able to fake this with a trigger or something, or you could (ick) check it programatically before inserts and updates.

This isn't too terrible a solution, and it doesn't even require NULLs (which are at best avoided, and at worst a huge big disaster).

BTW, This doesn't have anything to do with normalization. "Normalization" is not the same thing as "integrity", though a normalized database is a good first step towards ensuring data integrity.