Undef, NULL, and Other False Friends

Ovid on 2008-02-22T10:31:50

So you have pulled an employee's record from the database and you try to print their salary:

Use of uninitialized value in print

Hmm, what does that mean? A naïve conclusion would be that you don't know their salary. There are, however, multiple possibilities:

  1. Employee is on an hourly wage, not a salary.
  2. Employee's salary is unknown.
  3. Employee is actually an unpaid intern or volunteer.
  4. Employee is actually a contractor paid by someone else.

Some of those might be data modeling issues, but a simple undef value can hide a lot of information. In the database, there's a good chance that a NULL will be stored in the 'salary' field and you'd find an 'hourly' field elsewhere in the same table, but that can lead to data integrity problems: what happens if both are mistakenly filled in? Is it legal for both to be NULL? If it is, you still don't know which of conditions 2 through 4 above applies, though that distinction might not matter in your business rules/data model.
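One way to keep that information from hiding behind a single undef is to record why there is no salary. What follows is only a minimal sketch, written in Python with None standing in for undef/NULL. The names PayKind, Compensation and describe are invented for illustration and do not come from any real schema; the rule that only salaried and hourly records carry an amount is an assumption about the business rules.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PayKind(Enum):
    """Hypothetical states, one per possibility in the list above."""
    SALARIED = "salaried"
    HOURLY = "hourly"
    UNPAID = "unpaid"        # intern or volunteer
    EXTERNAL = "external"    # contractor paid by someone else
    UNKNOWN = "unknown"      # we genuinely do not know


@dataclass(frozen=True)
class Compensation:
    kind: PayKind
    amount: Optional[float] = None  # annual salary or hourly rate

    def __post_init__(self):
        # Assumed rule: only salaried and hourly records carry an amount,
        # so "both filled in" or "both missing" cannot be stored silently.
        needs_amount = self.kind in (PayKind.SALARIED, PayKind.HOURLY)
        if needs_amount and self.amount is None:
            raise ValueError(f"{self.kind.value} compensation requires an amount")
        if not needs_amount and self.amount is not None:
            raise ValueError(f"{self.kind.value} compensation must not carry an amount")


def describe(comp: Compensation) -> str:
    """Say what we know instead of printing an uninitialized value."""
    if comp.kind is PayKind.SALARIED:
        return f"salary: {comp.amount:.2f} per year"
    if comp.kind is PayKind.HOURLY:
        return f"wage: {comp.amount:.2f} per hour"
    if comp.kind is PayKind.UNPAID:
        return "unpaid (intern or volunteer)"
    if comp.kind is PayKind.EXTERNAL:
        return "paid by a third party"
    return "compensation unknown"
```

With this shape, describe(Compensation(PayKind.UNKNOWN)) reports that the compensation is unknown, while Compensation(PayKind.SALARIED) with no amount is rejected when it is created rather than surfacing later as a warning in print.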

The underlying problem is that the absence of information can be problematic. For example, we take in XML from an external supplier which updates our data. If a particular element is missing, we presume that the corresponding data is to be deleted. We now need to store data which that XML has no way to represent. When the supplier sends an update, they can't include that data, but by our rules, its absence means the data is to be deleted. This absence of data means one thing in one context and something else entirely in another, but it's something we rely upon all too often. Absence of evidence is not blah, blah, blah ...
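To make the XML example concrete, here is a rough sketch of one way out: state explicitly which fields the supplier's format is able to express, and treat a missing element as a deletion only for those. This is not our actual import code. The field names, the SUPPLIER_FIELDS set and the apply_update function are all made up for illustration, and it is written in Python using the standard xml.etree.ElementTree parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the only fields the supplier's XML is able to express.
SUPPLIER_FIELDS = {"title", "synopsis"}


def apply_update(record: dict, xml_text: str) -> dict:
    """Return a copy of record with the supplier's update applied.

    A missing element means "delete" only for supplier-owned fields.
    Fields the supplier cannot represent are left untouched, and
    elements outside SUPPLIER_FIELDS are ignored.
    """
    root = ET.fromstring(xml_text)
    sent = {child.tag: (child.text or "") for child in root}

    updated = dict(record)
    for field in SUPPLIER_FIELDS:
        if field in sent:
            updated[field] = sent[field]
        else:
            # Absent and supplier-owned: the old rule still applies.
            updated.pop(field, None)
    return updated


record = {"title": "Old title", "synopsis": "Old synopsis", "internal_note": "ours"}
result = apply_update(record, "<item><title>New title</title></item>")
# synopsis is deleted because the supplier owns it and omitted it;
# internal_note survives because the supplier never could have sent it.
```

The point of the sketch is that the absence now has a stated context: the same missing element is read as "delete" for one set of fields and as "not the supplier's business" for everything else.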