SQL Tidy

Ovid on 2008-05-07T08:46:06

It seems tough to find a well-written SQL formatter out there which is both accurate and open-source. I looked at SQL::Tidy as an inspiration, but it's tokenizer is bad and there are plenty of issues with the code. A quick bit of hacking has gotten it to the point where it will take this:

SELECT me.import_id, me.timestamp, DATE_FORMAT ( timestamp, '%Y-%m-%d' ) as day
FROM import me LEFT JOIN import_error import_errors ON ( me.import_id =
import_errors.import_id AND import_errors.type IN (
'X::IMPORTER::VALIDATION::BRANDTITLEMISSING')) WHERE ( ( ( timestamp >=
'2008-05-01T00:00:00' ) AND ( timestamp <= '2008-05-06T13:12:30' ))
) GROUP BY me.import_id

And turn it into this:

 SELECT me.import_id, me.timestamp, DATE_FORMAT ( timestamp, '%Y-%m-%d' )
           as day
      FROM import me LEFT
      JOIN import_error import_errors
      ON ( me.import_id = import_errors.import_id
               AND import_errors.type
               IN (
                        'X::IMPORTER::VALIDATION::BRANDTITLEMISSING'
               )
      )
 WHERE ( ( ( timestamp >= '2008-05-01T00:00:00' )
               AND ( timestamp <= '2008-05-06T13:12:30' )
             )
           ) GROUP BY me.import_id

That's pretty ugly, but at least it makes it a lot easier to read auto-generated SQL that's all on one line. Still, it's only for quick cleanup. The tokenizer needs a lot of work before we can even think of a reasonable reformatter.

Amusingly, though, I accidentally ran Perl::Tidy over it.

SELECT me . import_id, me . timestamp,
    DATE_FORMAT(timestamp, '%Y-%m-%d')
    as day FROM import me LEFT JOIN import_error import_errors ON(me
        . import_id
        = import_errors
        . import_id AND import_errors
        . type IN('X::IMPORTER::VALIDATION::BRANDTITLEMISSING')) WHERE((
    (timestamp >= '2008-05-01T00:00:00')
    AND(timestamp <= '2008-05-06T13:12:30'))
        ) GROUP BY me
    . import_id

OK, not great, but not significantly worse than the SQL::Tidy.


param binding

niceperl on 2008-05-07T20:07:40

Perhaps you get a better result avoiding param inyection in you SQL, and use binding (a best practice).... just a joke ;)