One thing which struck me as odd when I started working here is that we used moderately long strings, called 'pids' as primary keys on many of our tables. These keys are char(8) and encoded as utf8. This makes the key length longer than it needs to be and longer primary keys can cause performance issues (which is why you rarely see UUIDs being used as primary keys).
When I asked about this, I was informed about a "fun" issue with MySQL. Apparently, when using InnoDB (we have MySQL version 5.0.45), AUTO_INCREMENT keys lock the entire table and we were getting serious deadlock issues. Allegedly this is fixed in 5.1. The rationale listed?
[Table level locks are] necessary with the current logical binlogging of MySQL, since the auto-inc values assigned in a single SQL statement must be consecutive.
I don't understand this. From an application standpoint, auto-inc keys should be non-identifying and frankly, it shouldn't matter if they're consecutive, ordered, random, whatever, so long as they're unique. I suspect that this is done for internal architectural reasons because I can't imagine that even the MySQL developers think that it's logically necessary that these keys be consecutive.
The counter-argument I envision is someone telling me that you want to know what order the records were inserted in and gaps will give you an idea of what you might be deleting. If that's the counter-argument, I would suggest that you explicitly plan for this and not rely allow an internal implementation detail. That's encapsulation violation.
::get_auto_increment()
for each row) not something they think is logically required.$i++
to increment my value to some non-consecutive numberRe:Yes it could be better... but a bug?
Ovid on 2008-02-28T14:38:10
Non-identifying: since you aren't sure what these values are, you shouldn't use them as consistently identifying a record. They get used internally to join tables and externally when identifying a particular result in a result set when you need to refer to it again, but not identifying the exact data. Delete a record (on an auto-inc table) and later re-insert it. You should get a different auto-inc primary key, but it's the same data, right? The auto-inc PK is non-identifying (perhaps we're just using the term in different ways).
Consecutive: Start with auto-increment = 1 and insert three rows. Then delete where id = 2. The ids, though ordered, are no longer consecutive. There's nothing wrong with the data and there's no logical requirement for these ids to be ordered.
Bug: OK, you got me on this one. It's not a bug, but it's a severe implementation limitation which is frustrating as hell
:) Re:Yes it could be better... but a bug?
chromatic on 2008-02-28T19:09:59
I wonder if it's for write clustering. Each slave gets a step value by which to increment all auto-increment rows. I don't know if the clustering system guarantees an order of operation of writes though.
There's some interesting stats on performance of char columns as keys here.