Primary Keys

Ovid on 2003-04-29T18:15:18

I've just requested that this be the first line of every new specification we prepare:

Internal database IDs should never be externally visible.

If a customer wants an "ID", I'll be happy to make them one.


interesting

gav on 2003-04-29T19:13:17

How do you work around that? For example, if you were showing a list of things to a user and wanted them to select something, I'd have internal IDs visible. They'd see:
/script/select?id=1
I'm not quite sure what the problem is. They could always edit the URL to change the ID, but there should be no chance that they can work around permissions this way.

Re:interesting

Ovid on 2003-04-29T19:49:54

Sorry, I was unclear. The ID is typically in the URL or in a hidden field and that's fine, but it shouldn't be showing up in a table. It's not information that the user needs or can do anything with, and it gets tiring telling the user that it really doesn't mean anything and "no, you can't change it".

Re:interesting

gav on 2003-04-29T19:58:27

That makes more sense now :)

I worked on a system where the client was paranoid that people were able to see database IDs in hidden fields and change them. I wrote an extra layer that used 8-character random strings as IDs and it was a huge PITA.
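
For illustration only, here is a minimal Python sketch of what such an extra layer might look like. It is not the implementation described above: the class name ExternalIdMap, its methods, and the in-memory dictionaries are all hypothetical, and a real system would presumably persist the mapping in the database.

```python
import secrets
import string

# Characters used for the opaque tokens (an assumption; any alphabet works).
ALPHABET = string.ascii_letters + string.digits


class ExternalIdMap:
    """Hypothetical in-memory map between internal IDs and random tokens."""

    def __init__(self, length=8):
        self._length = length
        self._to_internal = {}  # token -> internal database ID
        self._to_external = {}  # internal database ID -> token

    def external_for(self, internal_id):
        """Return the token for an internal ID, creating one if needed."""
        token = self._to_external.get(internal_id)
        if token is None:
            # Draw random tokens until one is unused.
            while True:
                token = "".join(
                    secrets.choice(ALPHABET) for _ in range(self._length)
                )
                if token not in self._to_internal:
                    break
            self._to_internal[token] = internal_id
            self._to_external[internal_id] = token
        return token

    def internal_for(self, token):
        """Return the internal ID for a token, or None if it is unknown."""
        return self._to_internal.get(token)
```

Every URL and hidden field then has to pass through external_for() on the way out and internal_for() on the way in, which gives some idea of where the pain comes from. The permission check on the server is still needed either way.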

GUIDs

Theory on 2003-04-30T01:02:02

I think that we might be using GUIDs for objects in Bricolage 2.0, probably using Data::UUID. It will facilitate syncing independent Bricolage servers. Of course, database sequence IDs will still be used for primary keys.

--David
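
A rough sketch of that arrangement, with Python's standard uuid and sqlite3 modules standing in for Data::UUID and the real database. The story table, its columns, and the add_story helper are made up for the example and are not Bricolage's schema.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE story (
        id    INTEGER PRIMARY KEY,   -- local sequence ID, used for joins
        guid  TEXT NOT NULL UNIQUE,  -- globally unique, used when syncing
        title TEXT NOT NULL
    )
""")


def add_story(conn, title, guid=None):
    """Insert a row; reuse the GUID if the row came from another server."""
    guid = guid or str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO story (guid, title) VALUES (?, ?)", (guid, title)
    )
    return cur.lastrowid, guid
```

Two servers can assign different local id values to the same object and still recognise it by its guid when they sync.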

Different Generalization

VSarkiss on 2003-04-30T14:49:01

I would generalize that in a different way: no system-generated unique identifier (SUID) should ever be interpreted as to its content. In other words, those columns should be used for joins to other tables and nothing else. For example, if you ever find yourself writing "order by" on that column, you're setting yourself up for trouble. As a test, if you're using any kind of SUID, I should be able to substitute 1, -47, or 240981 for any of the values consistently, and no program or user should be any the wiser.
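
The substitution test can be sketched in a few lines of Python. The customers and orders data and the join and remap helpers are invented for the example; the point is only that a consistent renumbering of the keys leaves the joined result unchanged.

```python
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [
    {"customer_id": 1, "item": "book"},
    {"customer_id": 2, "item": "pen"},
    {"customer_id": 1, "item": "ink"},
]


def join(customers, orders):
    """Join orders to customers on the key; the key itself is not returned."""
    by_id = {c["id"]: c["name"] for c in customers}
    return sorted((by_id[o["customer_id"]], o["item"]) for o in orders)


def remap(customers, orders, mapping):
    """Replace every key value consistently in both tables."""
    new_customers = [{**c, "id": mapping[c["id"]]} for c in customers]
    new_orders = [
        {**o, "customer_id": mapping[o["customer_id"]]} for o in orders
    ]
    return new_customers, new_orders


# Arbitrary replacement values: nothing downstream should notice.
mapping = {1: -47, 2: 240981}
assert join(customers, orders) == join(*remap(customers, orders, mapping))
```

Code that sorts on the key, or displays it, would fail this kind of check as soon as the replacement values stopped following the original order.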

Now, I will admit to breaking that rule in one particular situation (you knew that was coming ;-). In most data warehousing applications, you want to capture date values or time values (rarely both) in a dimension table. I've found it's very convenient to have meaningful SUIDs for primary keys on those tables; it makes debugging much simpler when you can look at date_id 20030430 and know it refers to today.
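
As a small illustration of that kind of meaningful key, here is one way a YYYYMMDD date_id could be derived and decoded in Python. The function names are hypothetical, and a warehouse would normally populate the date dimension once, up front.

```python
from datetime import date, datetime


def date_to_id(d):
    """Encode a date as an integer of the form YYYYMMDD."""
    return d.year * 10000 + d.month * 100 + d.day


def id_to_date(date_id):
    """Decode a YYYYMMDD integer back into a date."""
    return datetime.strptime(str(date_id), "%Y%m%d").date()


print(date_to_id(date(2003, 4, 30)))  # 20030430
```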