MySQL Pain of the Day

Ovid on 2008-02-25T17:18:30

The first line of the MySQL 5.1 documentation for subquery optimization:

Development is ongoing, so no optimization tip is reliable for the long term.

They're not kidding, either. Subquery optimization in MySQL is terribly broken, as we've discovered today. We have Matt Trout in for some consulting and even ridiculously stupid subqueries have terrible query plans, forcing us to to consider nasty workarounds, one of which is called "PostgreSQL".


postgres is better

btilly on 2008-02-25T17:53:24

But still not perfect. In the cases where it doesn't do what I want I've found temp tables to be invaluable in getting the query plan that I want. I have even pulled such tricks as creating temp tables, creating indexes on those temp tables, then doing my join.

In any case the bigger underlying problem with MySQL is a fundamental design decision. The problem is that the time to compile a query is part of its run time. MySQL went the route of trying to make compilation fast. That means that it does a bad job of optimization for complex queries. But if you switch the decision and make compilation smart, then lots of existing MySQL code will suddenly slow down, because so few existing MySQL applications take the strategy of compiling the query then re-using it. Making a query engine that does the right thing on complex queries but still compiles simple ones quickly is a fairly difficult problem. (I don't know why nobody has taken the route of letting the query provide a hint that this one is to be compiled very carefully. So the database knows when to optimize aggressively.)

Optimizing subqueries

perrin on 2008-02-26T05:53:47

The MySQL docs state pretty plainly that subqueries have not been optimized at all and won't be until the 6.0 release. In the meantime, you have to pay attention to how you use them. The xaprb.com stuff is a good resource.

In most cases, it works out best to use derived tables (subqueries in the FROM clause), which are implemented with a temporary table and force the evaluation of the subquery before the JOIN order is determined. The only situation where this can be a negative is when you use LIMIT, since those queries are often handled without needing to evaluate the entire subquery.

In situations where you can use a JOIN or LEFT JOIN instead of a correlated subquery, you should. Replacing NOT EXISTS queries with a LEFT JOIN is usually a big improvement.

If you have specific queries you need help with, post them on the MySQL list. I read it and sometimes help people with queries, but usually someone else beats me to it.