Major CPAN::Mini::Extract upgrade

Alias on 2008-05-05T05:21:08

One of the things I noticed at the Oslo Hackfest was the surprising number of people using CPAN::Mini::Extract to selectively suck out bits of the CPAN (all the Changes files, all the Makefile.PL files, all the tests, etc.) and scan them for stuff.

Of course given that it was a meeting of most of the CPAN QA people, it was pretty much the entire userbase of CPAN::Mini::Extract as well :)

And with offline support now finally working properly in CPAN::Mini, I can make offline support for CPAN::Mini::Extract work as well, which (unlike offline support in CPAN::Mini) is actually useful.

Since I find that most of my random inspirational hacking happens when I'm offline, this will be hugely useful for me, if nothing else.

At the same time I took another look at the speed of the extraction, since I noticed memory usage was all over the place when extracting. From what I can tell, the extraction in IO::Zlib happens half a meg at a time, but it continuously allocates and frees 50 meg of memory for each block it expands. Weird.

By switching to a one-shot expansion of each tarball into a temp file, and then opening the tarball once from that temp file, I managed to get a two to three times speed up for file extraction.
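
For anyone curious what that approach looks like in outline, here is a minimal sketch of the idea. It is written in Python rather than Perl and is not the module's actual code; the function name read_matching and the wanted predicate are made up for illustration.

```python
import gzip
import shutil
import tarfile
import tempfile


def read_matching(tarball_path, wanted):
    """Return {member name: file contents} for members that `wanted` accepts.

    `wanted` is a hypothetical predicate: it takes a member name and returns
    True for files worth keeping (Changes files, tests, and so on).
    """
    found = {}
    with tempfile.TemporaryFile() as tmp:
        # One-shot expansion: stream the whole gzip payload into a temp file
        # instead of decompressing piecemeal while walking the archive.
        with gzip.open(tarball_path, "rb") as gz:
            shutil.copyfileobj(gz, tmp)
        tmp.seek(0)

        # Open the now-uncompressed tar exactly once, from the temp file.
        with tarfile.open(fileobj=tmp, mode="r:") as tar:
            for member in tar:
                if member.isfile() and wanted(member.name):
                    found[member.name] = tar.extractfile(member).read()
    return found
```

Calling read_matching(path, lambda name: name.endswith("/Changes")) would pull just the Changes file out of a distribution tarball. The temp file costs some disk I/O, but the archive reader then works on plain, seekable tar data.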

Combined with CPAN::Mini pipelining, this makes CPAN::Mini::Extract massively faster (a 200%-300% overall speed up).


And while you're at it...

Ron Savage on 2008-05-05T06:05:56

Here's an incomplete frequency analysis:
tar.gz => 14871.
tgz => 240.
zip => 109.

So, I'd like the hard-coded *.tar.gz (in 2 places) to include *.tgz, at least :-).

TIA.

Re:And while you're at it...

Alias on 2008-05-05T06:17:09

Done, 1.19 uploaded

Re:And while you're at it...

Ron Savage on 2008-05-05T07:40:19

$many x $thanx;

Re:And while you're at it...

Alias on 2008-05-06T03:51:33

You know you have that backwards, right?

Re:And while you're at it...

Ron Savage on 2008-05-07T02:07:06

Hmm. You're right. But wait! My target audience speaks English, not Perl, so - phew - that's ok...