It seems tough to find a well-written SQL formatter out there which is both accurate and open-source. I looked at SQL::Tidy as an inspiration, but it's tokenizer is bad and there are plenty of issues with the code. A quick bit of hacking has gotten it to the point where it will take this:
SELECT me.import_id, me.timestamp, DATE_FORMAT ( timestamp, '%Y-%m-%d' ) as day FROM import me LEFT JOIN import_error import_errors ON ( me.import_id = import_errors.import_id AND import_errors.type IN ( 'X::IMPORTER::VALIDATION::BRANDTITLEMISSING')) WHERE ( ( ( timestamp >= '2008-05-01T00:00:00' ) AND ( timestamp <= '2008-05-06T13:12:30' )) ) GROUP BY me.import_id
And turn it into this:
SELECT me.import_id, me.timestamp, DATE_FORMAT ( timestamp, '%Y-%m-%d' ) as day FROM import me LEFT JOIN import_error import_errors ON ( me.import_id = import_errors.import_id AND import_errors.type IN ( 'X::IMPORTER::VALIDATION::BRANDTITLEMISSING' ) ) WHERE ( ( ( timestamp >= '2008-05-01T00:00:00' ) AND ( timestamp <= '2008-05-06T13:12:30' ) ) ) GROUP BY me.import_id
That's pretty ugly, but at least it makes it a lot easier to read auto-generated SQL that's all on one line. Still, it's only for quick cleanup. The tokenizer needs a lot of work before we can even think of a reasonable reformatter.
Amusingly, though, I accidentally ran Perl::Tidy over it.
SELECT me . import_id, me . timestamp, DATE_FORMAT(timestamp, '%Y-%m-%d') as day FROM import me LEFT JOIN import_error import_errors ON(me . import_id = import_errors . import_id AND import_errors . type IN('X::IMPORTER::VALIDATION::BRANDTITLEMISSING')) WHERE(( (timestamp >= '2008-05-01T00:00:00') AND(timestamp <= '2008-05-06T13:12:30')) ) GROUP BY me . import_id
OK, not great, but not significantly worse than the SQL::Tidy.