Last night I played with three databases. O'Reilly subscribes to Bookscan, and we get a spreadsheet every week showing the last week's sell-through (actual sales in stores) and the year-to-date sell-through for umpty-zillion technical books.
Yes, a spreadsheet. Feh and bah to spreadsheets. I saved as CSV and used Text::CSV_XS to write a quick search routine (give me all the books with "XML", "Schema", or "XSLT" in the title; give me all the O'Reilly books; etc.). That was database 1.
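A minimal sketch of that kind of search routine, assuming made-up column layout and sample rows (the real Bookscan spreadsheet's columns are not shown here):

```perl
#!/usr/bin/perl
# Quick full-text-ish search over CSV rows with Text::CSV_XS.
# Column order and sample data are invented for illustration.
use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new({ binary => 1 });

# Return every row whose title column matches any of the patterns.
sub search_titles {
    my ($fh, $title_col, @patterns) = @_;
    my $pat = join '|', @patterns;
    my @hits;
    while (my $row = $csv->getline($fh)) {
        push @hits, $row if $row->[$title_col] =~ /$pat/i;
    }
    return @hits;
}

# Sample "Bookscan-style" rows: title, publisher, weekly sell-through.
open my $fh, '<', \<<'CSV' or die $!;
Learning XML,O'Reilly,412
AOL for Dummies,Wiley,1893
XSLT Cookbook,O'Reilly,207
CSV

my @hits = search_titles($fh, 0, 'XML', 'Schema', 'XSLT');
print "$_->[0]\n" for @hits;
```

Same idea works for the "all the O'Reilly books" query: just point the pattern at the publisher column instead.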
Then I wrote a quick screenscraper for the O'Reilly intranet sales page, to fetch the year-to-date sales from the web interface to our data. That was database 2.
Then I broke out my command-line client for the XML-RPC interface to our Amazon ranks database. Database 3.
I found out that Amazon is far more geeky than the Barnes and Noble-type stores that Bookscan reports on. AOL for Dummies sells really big through real stores, phenomenally poorly on Amazon. (Not surprising, really--once you're on the internet, you probably don't have much use for AOL for Dummies).
Then I compared our sales to Bookscan sales to see whether it was possible to make any sense of the two. It wasn't. In some cases, sell-through was more than 100% of the orders our warehouse received. All that means is that some of the chains had stock, sold it, and haven't reordered from us. Probably because they had too much stock to begin with.
All this from Perl. I'd have been boned trying to do it in Excel or any (spit) application. Proof that every company's financial department should have at least one Perl programmer.
Interestingly, the HTML tables returned by the intranet database interface were so awful that HTML::TableContentParser horked. I had to resort to regular expressions. Funnily enough, the code became shorter when I did that, and easier to read. How odd ...
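Here's a sketch of why the regex version stays short. I'm assuming rows shaped like <tr><td>title</td><td>ytd</td></tr>; the real intranet markup surely differed:

```perl
#!/usr/bin/perl
# Regex-based table scraping: pull (title, YTD sales) pairs out of
# ugly HTML without a parser. The row shape here is an assumption.
use strict;
use warnings;

my $html = <<'HTML';
<table>
<tr><td>Learning XML</td><td>5012</td></tr>
<tr><td>XSLT Cookbook</td><td>2214</td></tr>
</table>
HTML

my %ytd;
while ($html =~ m{<tr><td>(.*?)</td><td>(\d+)</td></tr>}g) {
    $ytd{$1} = $2;
}
print "$_: $ytd{$_}\n" for sort keys %ytd;
```

When the markup is too mangled for a parser but regular in shape, one /g match in a while loop is hard to beat.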
Another dead end was DBD::CSV. Because I wanted to full-text search each record regardless of field, constructing the SQL queries was painful and slow, and I couldn't get RLIKE to behave. So Text::CSV_XS was perfect.
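To see why the SQL route fights you here, compare what each approach needs. The column names below are hypothetical:

```perl
#!/usr/bin/perl
# Full-text search across all fields: the SQL way needs one LIKE
# clause per column, ORed together; plain Perl just matches the
# joined record. Column names are invented for illustration.
use strict;
use warnings;

my @fields = qw(title publisher isbn weekly ytd);

# The DBD::CSV way: build a WHERE clause touching every column.
my $where = join ' OR ', map { "$_ LIKE ?" } @fields;
my $sql   = "SELECT * FROM sales WHERE $where";

# The Text::CSV_XS way: one regex over the whole row.
my @row     = ('Learning XML', q{O'Reilly}, '0596004206', 412, 5012);
my $matches = (join("\0", @row) =~ /XML/i) ? 1 : 0;

print "$sql\n";
print "matches: $matches\n";
```

And the SQL version still isn't a real full-text search: every new column means another clause and another bind value.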
--Nat
(all this took about two hours, half of which was spent debugging the HTML scraping and bumbling around with DBD::CSV)
Re:Right-O
Fletch on 2003-05-27T17:39:05
Yes, but then you risk putting the guy who gets the specifications from the customer and brings them to the programmers out of work.</Office Space>
It'd be an interesting freelance service, but then 90% of the demand would be on Wintendo and probably more tedium than fun.
You might also find Spreadsheet::ParseExcel and Spreadsheet::ParseExcel::Simple useful.
John.
--
SQLite never fails to amaze me: a rich feature set and solid performance, coupled with remarkably easy use. Few "throw-away" projects would have me dump data into a "real" db, even one I already have running on the system, but I do it with SQLite regularly.
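For a throw-away project, the whole setup is this (assuming DBD::SQLite is installed; the table and data here are invented):

```perl
#!/usr/bin/perl
# A "real" database for a throw-away script: in-memory SQLite,
# no server, no setup. Table and sample rows are made up.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1 });

$dbh->do('CREATE TABLE sales (title TEXT, ytd INTEGER)');
my $ins = $dbh->prepare('INSERT INTO sales VALUES (?, ?)');
$ins->execute(@$_) for (['Learning XML', 5012], ['XSLT Cookbook', 2214]);

my ($total) = $dbh->selectrow_array('SELECT SUM(ytd) FROM sales');
print "total ytd: $total\n";
```

Swap `:memory:` for a filename and you get a persistent single-file database with zero administration.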