For a few years now I've been a huge fan of Postgresql. It always worked the way I wanted it to, and when they added MVCC it just got even better.
However this week I've had some rather extreme performance requirements from Pg. So I've been doing as much as I possibly can in plpgsql (Pg's variant on Oracle's Pl/sql). In fact I've got it to the point now where I can just pass in a chunk of text to Pg, and get back the result I need, all without doing any prepare/execute/loop stuff whatsoever.
However, the stuff I've been doing involves temporary tables. Each time in I need to stuff the data I've got into this temporary table, do some manipulation on it, aggregate some results, and return a single floating point number. Easy enough, and pretty standard fodder for anyone who's got a strong background in RDBMS' like I do.
Well no, it's not quite that simple. You see I was struggling with a very strange error message for quite some time - the symptoms being that everything worked perfectly the first time through, but failed horribly the second time. Eventually I turned to google, which came up with the goods: in plpgsql when you create and use a temporary table, it compiles the table name in all your sql into a direct tableid reference for speed, but since next time around that temporary table is different, it's pointing at a non-existant table. Hence the error. This is a bug in Postgresql. Argh!
So what are my options? I've tried a few things now - Interbase/Firebird doesn't yet have temporary tables. MySQL doesn't have stored procedures. SAPDB was just a bitch to install and I couldn't figure it out.
What's left? It seems that Oracle or one of the other commercial db's is all there is :-(((
Get the bug fixed. This is the advantage of using open source stuff: once developers know about a bug and have a good clue about what causes it, you often don't have to wait long for a fix.This is a bug in Postgresql. Argh!
So what are my options?
Re:Couldn't you...
Matts on 2002-08-25T20:22:22
That's the work around that I've done for now. Unfortunately it's a fair bit slower than the temporary table system, because it has to have an extra column (pid), which has to be permanently indexed, thus inserts are slower, and selects and updates are slower because they have to use the index. Nuts.Re:Couldn't you...
jordan on 2002-08-26T01:12:42
It's been a few years since I did that app where I replaced a temporary table with a permanent one and I recall that there were other advantages to making the data (semi-)permanent.
I recall now that the real speedup was that the data was time-tagged. I had been performing big selects against the database for data, inserting data from a time range in the temporary table and performing reports off of this. I was able to take advantage of the fact that subsequent runs would need data mostly in the same time range and I could perform a smaller select off of the big database on subsequent reports if I kept data from the last pass around.
It was a multi-processor system with lots of disks and I found I could delete the data I didn't need from the "working" tables at the same time I was performing the select off of the big databases and inserting this data into the "working" tables quite a bit faster than I could do those two operations (delete from "working" table, insert into "working" table from big databases) serially. This overlapping reduced wall-clock time of the operation significantly.
PlPGSQL isn't the only embedded language. There's PL/TCL, PL/Python and PL/Perl as well. These may not have the same limitations as PL/PGSQL.
-Dom
Re:Alternative Language
Dom2 on 2002-08-25T19:44:59
Hmmm, looking at the docs, you'd have to use the CPAN module DBD::PgSPI to do stuff. Might be worth examining that module...
-Dom