What is the longest Perl test suite you've ever seen?

Alias on 2008-10-14T00:41:35

ok( Perl::Dist::Strawberry->default_machine->run );

With this release of Strawberry, I've managed to find a reasonably safe way to write tests that generate distributions without wiping out your current installation.

One of these tests the default Perl::Dist::Machine for Strawberry.

A "distribution machine" takes the options for building a distribution and generates several variants of the same thing. In this case, the machine is handling the creation of the three 5.8.8, 5.10.0 and Portable variations of Strawberry.

For completeness, this process now has a test script of its own.

The test takes around 4-5 hours to complete, since each distribution takes about an hour and a half to build. And that's on top of the 4-5 hours for the three tests that do the single builds on their own.

It's so long that it's now pretty much pointless to run under $ENV{AUTOMATED_TESTING} because it's going to hammer the CPAN Testers hosts (it also generates around half a gig or so of temporary data).

Currently, these distribution-building tests only run under $ENV{RELEASE_TESTING}.
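For anyone curious, a minimal sketch of the kind of guard I mean (the skip message and test count here are illustrative, not copied from the real test script):

    use strict;
    use Test::More;

    # Don't start a multi-hour build unless a release tester asks for it
    plan skip_all => 'Distribution build tests only run under RELEASE_TESTING'
        unless $ENV{RELEASE_TESTING};

    plan tests => 1;

    require Perl::Dist::Strawberry;
    ok( Perl::Dist::Strawberry->default_machine->run, 'machine built all variants' );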

So my question to you...

Am I right in thinking this is the longest test on the CPAN?

And secondly, what's the longest (Perl) test suite you've ever seen?

For me, it's my current test suite at work, which is starting to head past an hour now.


Longest Suite

Ovid on 2008-10-14T07:05:21

The longest suite I've worked with took about an hour and a half to complete. Regrettably, it was written with a bespoke (and broken) test harness.

Why does your current test suite take so long to run? There are numerous strategies which can generally be applied to mitigate long test suite run times (or are you not bothered by this?).

Re:Longest Suite

Alias on 2008-10-14T12:23:57

In Perl::Dist(::Strawberry) I build Perl from scratch, about 7 times.

Across all of them, I also install 200+ CPAN modules...

And unfortunately, none of it knows how to test in parallel on its own, so much of the power of my quad-core machine is wasted.
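That said, TAP::Harness (the engine behind Test::Harness 3.x) has recently grown support for running test scripts in parallel. A sketch, assuming the scripts are actually safe to run concurrently:

    use strict;
    use TAP::Harness;

    # Run up to four test scripts at once; only safe if they don't
    # fight over shared state like databases, ports or temp files
    my $harness = TAP::Harness->new( { jobs => 4 } );
    $harness->runtests( glob 't/*.t' );

The same thing is available from the command line as prove -j 4. It doesn't help with the tests each CPAN module runs during its own installation, though, and with 200+ modules that's where a lot of the build time goes.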

At work, much of the extra work is necessary, because the application is ENORMOUS (250k source lines of code) with lots and lots of potential variation and state.

Before every single test script, we kill and rebuild users from scratch. For some test directories, this also means uploading PDF files, generating all sorts of stuff, running back-office sweeps, etc.

If we aggregated absolutely everything like you do, we'd probably only save 5 minutes out of that hour, because much of the work is in the web client-to-server round tripping, in the database queries, and in the massive amount of setup tasks we need to redo for each test script.

It gets worse next year, when I start to borg some external nightly and weekly maintenance code into the test suite as well.

Updating the Endeca instance (a specialised search engine product from an external vendor) with a new baseline is a process that takes almost an hour on its own, much of that due to dumping out search data from the 100 gig Oracle instance.

A full resync from our upstream ERP takes another three quarters of an hour, potentially.

I think everyone here recognises that there's absolutely no way we can ever have developers running the whole test suite regularly once coverage reaches a level we're happy with. Or even run it all pre/post commit.

And so that takes us over into the range of "nightly test run" scenarios, and running the test suite when building the installation RPMs to feed into the UAT and deployment workflows.

And so as long as we can contain the entire test run within about 5-10 hours max, we can continue to get nightly runs quite comfortably.

At the moment, we're a lot more concerned about issues like rollback strategies, and horrors like load testing, which is insanely hard to do (right) when you hit this complexity. Reproducing the caching behaviours of production load alone is bloody hard.

Re:Longest Suite

jplindstrom on 2008-10-14T12:43:50

For the long setup time, have you considered creating a set of general test data that can be used for many of the tests?

If you can do that, then maybe you can cache the setup of it and avoid long processing time just to get to the correct state.

So the app code is used to move the app to the correct state only once per test suite run, or even outside of the test run. Once that is done, make a snapshot of it that can be restored quicker for each test.

This is something we haven't done yet, but I'm thinking of ways to do it. In our case, all of the state would be captured by a mysqldump.
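As a sketch of what I mean (the database name, dump path and setup routine are all made up for illustration):

    use strict;
    use warnings;

    my $db   = 'myapp_test';              # hypothetical database name
    my $dump = '/tmp/myapp_baseline.sql'; # hypothetical snapshot path

    sub run_expensive_setup {
        # placeholder for the slow part: create users, upload PDFs,
        # run back-office sweeps, and so on
    }

    # Pay the expensive setup cost once, then snapshot the result
    unless ( -e $dump ) {
        run_expensive_setup();
        system("mysqldump $db > $dump") == 0
            or die "mysqldump failed: $?";
    }

    # Each test script restores the snapshot instead of rebuilding
    system("mysql $db < $dump") == 0
        or die "mysql restore failed: $?";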

And growing...

mpeters on 2008-10-14T14:05:10

Smolder tells me that the test suite for our biggest project at $work took 1:28:55 the last time it ran. Granted our smoke box isn't as powerful as our production or even our development machines. But it's still in active development with lots of new features planned, so I expect that to get worse...

CPAN::Reporter!

srezic on 2008-10-14T19:11:54

I had the impression that CPAN::Reporter is usually the slowest distribution, and really, if I run the following one-liner in a directory with my saved reports:
grep -r 'wallclock secs' . | perl -nle '/^(\S+).*=\s(\d+\.\d+)\s+CPU/ and push @d, [$1,$2]; END { print join("\n", map { "$_->[1]\t$_->[0]" } sort { $b->[1] <=> $a->[1] } @d) }'
then these show up at the top, sorted by CPU seconds:
1132.65 ctr1/done/pass.CPAN-Reporter-1.1702.i386-freebsd.6.1-release-p23.1223500100.952.rpt:Files=36,
932.16 ctr2/done/pass.CPAN-Reporter-1.1702.i386-freebsd.6.1-release-p23.1223506142.7634.rpt:Files=36,
512.38 ctr2/done/pass.v6-0.032.i386-freebsd.6.1-release-p23.1223899878.68986.rpt:Files=136,
498.25 ctr1/pass.v6-0.032.i386-freebsd.6.1-release-p23.1223936252.51988.rpt:Files=136,
425.56 ctr1/done/pass.PAR-Packer-0.982.i386-freebsd.6.1-release-p23.1223636690.68184.rpt:Files=4,
314.49 ctr2/done/pass.PAR-Packer-0.982.i386-freebsd.6.1-release-p23.1223648139.52455.rpt:Files=4,

Whether CPAN::Reporter is slow or not seems to depend on the number of entries in PERL5LIB. In both of the cases above I had many (100?) entries there. If no PERL5LIB is set, then CPAN::Reporter takes only some 130 CPU seconds on my machine.
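That would make sense: every PERL5LIB entry is prepended to @INC, and each use or require has to stat its way down that list until the module is found, so a test run that loads hundreds of modules multiplies out to a huge number of extra filesystem checks. A quick way to see how many directories are in play:

perl -le 'print scalar @INC, " directories in \@INC"'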

One can also use the ctgetreports script from the CPAN::Testers::ParseReport distribution to see all the testers' timings, using this one-liner:
ctgetreports CPAN-Reporter --q 'qr: (\d+\.\d+) CPU' --q 'conf:osname' --q 'conf:archname' | perl -nle '/CPU.(\d+)/ and push @d, [$1, $_]; END { print join "\n", map { $_->[1] } sort { $b->[0] <=> $a->[0] } @d }'
The slowest one is this:
PASS 2421051 qr: (\d+\.\d+) CPU[5322.27] conf:osname[linux] conf:archname[mips-linux-gnu-thread-multi]
and the fastest this one:
PASS 2385291 qr: (\d+\.\d+) CPU[86.70] conf:osname[linux] conf:archname[i686-linux]

Unfortunately it seems that there are no meaningful CPU timings on Windows systems.