Last guy off the blocks...

tinman on 2003-05-09T07:30:22

As is usual for me, slow on the uptake, I completely missed the discussion and developments on this site about SQLite.. Having made the discovery yesterday, I downloaded, took home and tried to see how it stacked up against MySQL..

The data volume is moderately large, 13 million lines worth of inserts (thats roughly 4-5 million records) distributed among 3 tables. I have it in a 1GB text file.

I was originally going to use MySQL to store it (Oracle installations on my home machine, I dont think I need).. Looked around for a PostgreSQL Win32 install that I could use without Cygwin pain (didnt find it, sadly), so its a straight toss between SQLite (curiosity) and MySQL. (dont get me started on Access). The downers first: Its not a particularly high spec machine. In particular, I only have 256mb memory. The second is that the hard disk, although a large beast, is also slow.. (why o why did I turn down an offer for a 7.2k RPM disk ?)

I needed to massage the data a bit (pulled it off an Oracle db, so it had a to_date conversion which SQLite barfs on), so did that, created the tables, and left the SQL to run..

An hour later, I became impatient, broke off the import and queried, only to find that one table had a measly 300k records. The other tables were empty. Turns out (reading the speed test) that things are much much faster if you enclose all the inserts in a BEGIN transaction.

Now, what I need to do is figure out how to edit a 1GB file and embed BEGIN blocks before commits (there are commits every few thousand records).

What am I hoping to do with this ? err. just see how SQLite handles large volumes.. and then I can get away with not installing MySQL :) *wonders aloud if DBD::SQLite will work on Win32 ? dont think so*


Perl Is Your Editor

Dom2 on 2003-05-09T08:02:35

Off the top of my head, how about something like:
perl -pi.bak -e 's/^(COMMIT.*)/$1\nBEGIN;/' file.sql

-Dom

Re:Perl Is Your Editor

tinman on 2003-05-10T14:54:08

Wheee! :) exactly that, but I did it inside a script. Small corrections, though.. its a case insensitive match, and if I dont match for a semicolon immediately after commit, I get matches for data ("committee") as well..

I need to read perlopt again to decipher this one..

Re:Perl Is Your Editor

Dom2 on 2003-05-10T19:31:29

Those of us brought up on sed and awk have an advantage here. Generally, -p and -n are the most useful flags, as well as -i and (of course) -e, which is the actual code.

If you've a perl journal subscription, there's a good article about command line perl in this months issue.

-Dom

SQLite Win32 PPM

lachoy on 2003-05-09T11:17:15

See this repository.

Re:SQLite Win32 PPM

tinman on 2003-05-10T14:55:26

Ooh, neat, thanks.. should I have read the ActiveState documents on alternative repositories ? I had this in my bookmarks, didnt think to look, though..