As is usual for me, slow on the uptake, I completely missed the discussion and developments on this site about SQLite.. Having made the discovery yesterday, I downloaded, took home and tried to see how it stacked up against MySQL..
The data volume is moderately large, 13 million lines worth of inserts (thats roughly 4-5 million records) distributed among 3 tables. I have it in a 1GB text file.
I was originally going to use MySQL to store it (Oracle installations on my home machine, I dont think I need).. Looked around for a PostgreSQL Win32 install that I could use without Cygwin pain (didnt find it, sadly), so its a straight toss between SQLite (curiosity) and MySQL. (dont get me started on Access). The downers first: Its not a particularly high spec machine. In particular, I only have 256mb memory. The second is that the hard disk, although a large beast, is also slow.. (why o why did I turn down an offer for a 7.2k RPM disk ?)
I needed to massage the data a bit (pulled it off an Oracle db, so it had a to_date conversion which SQLite barfs on), so did that, created the tables, and left the SQL to run..
An hour later, I became impatient, broke off the import and queried, only to find that one table had a measly 300k records. The other tables were empty. Turns out (reading the speed test) that things are much much faster if you enclose all the inserts in a BEGIN transaction.
Now, what I need to do is figure out how to edit a 1GB file and embed BEGIN blocks before commits (there are commits every few thousand records).
What am I hoping to do with this ? err. just see how SQLite handles large volumes.. and then I can get away with not installing MySQL :) *wonders aloud if DBD::SQLite will work on Win32 ? dont think so*
perl -pi.bak -e 's/^(COMMIT.*)/$1\nBEGIN;/' file.sql
-Dom
Re:Perl Is Your Editor
tinman on 2003-05-10T14:54:08
Wheee!
:) exactly that, but I did it inside a script. Small corrections, though.. its a case insensitive match, and if I dont match for a semicolon immediately after commit, I get matches for data ("committee") as well.. I need to read perlopt again to decipher this one..
Re:Perl Is Your Editor
Dom2 on 2003-05-10T19:31:29
Those of us brought up on sed and awk have an advantage here. Generally, -p and -n are the most useful flags, as well as -i and (of course) -e, which is the actual code.If you've a perl journal subscription, there's a good article about command line perl in this months issue.
-Dom
Re:SQLite Win32 PPM
tinman on 2003-05-10T14:55:26
Ooh, neat, thanks.. should I have read the ActiveState documents on alternative repositories ? I had this in my bookmarks, didnt think to look, though..