Test suites need to be treated carefully. There are three golden rules for them:
Items #1 and #2 are generally taken as a given. Item #3, however, is equally important and frequently ignored.
When I first started at the BBC, our team had over 9,000 tests, coverage in the mid 80s (percentile) and all tests passed. Rarely have I had the luxury of stepping into such an environment. However, the test suite was very noisy. Lots of diagnostic output, Perl warnings, and a user-generated warnings about some database tables being changed when they shouldn't be. These are code smells. Maybe nothing is really wrong, but how can we know without looking into things?
The diagnostic output was easy to fix. Much of this was in the form of "explanations" to programmers running tests. By reading these comments in verbose test output, you can get a better sense of what the tests are doing. I argued that these comments should be pushed into tests, but jplindstrom pointed out that many of these comments weren't really appropriate for tests as they explained the intent of subsequent tests. As a result, when we switched from Test::More to Test::Most, we started using the explain function. This is a special form of &diag which skips the call to &diag unless the tests are being run in verbose mode.
Some people would argue that extra diagnostics aren't a code smell, but when you have too many diagnostics, you can get used to seeing non-test information and it's easier to overlook real problems, such as warnings.
I went through and fixed a lot of the warnings (most were harmless), but never had a chance to get all of them. As a result, the "Use of uninitialized value in string" warnings thrown by our database migration tests were ignored.
Recently I wrote some database migration code and ran those tests directly. All tests passed, but I decided to fix the warning. The basic structure of the code looked like this:
my ($dbname, $host, $port) = ($1, $2, $3); my $dump_cmd; { no warnings 'uninitialized'; $dump_cmd = "mysqldump -h $host -P $port -u $user -p$pass --no-data $dbname" . ' | grep -v Database | grep -v Server'; } my $migration = DBIx::Migration->new({ dsn => $dsn, dir => $dir, username => $user, password => $pass, connect_options => $db_conf->connect_options, }); $dsn =~ m/dbi:mysql:dbname=(.*);host=(.*);port=(.*)/;
How the hell did the regex for those back references wind up being after assignments from $1, $2 and $3? As it turns out, this happened a long time ago. Our tests were passing because we were using mysqldump to generate diffs, but because the arguments were undef, mysqldump was returning an error message which the tests always successfully diffed against itself. Fixing this bug revealed that the tests should have failed many months ago. We now have some cruft in our database as a result. It's not too bad and doesn't affect the integrity, but it does mean that if we ever had to migrate the database back down, our code would have broken.
Today I'm going to examine the warnings about the database tables being changed when they shouldn't be. I have less confidence in this test suite than I did. There probably isn't a problem here, but when a test suite is noisy, that's a serious smell which should be examined.
Re:stdout v stderr
Ovid on 2008-01-29T15:05:44
I tend to use "classic" Test... Do you mean the ancient Test.pm? If so, you're missing out on all of the improvements made in the testing world over the years. Test.pm should only be there for legacy support.
If you must use that interface, at least consider using Schwern's Test::Legacy module. This will allow you to keep that interface but still use other testing modules. Of course, that changes your diagnostic output and breaks what you've described. However, diagnostics are a small, small price to pay for using the rest of Perl's test modules
:) Re:stdout v stderr
jmason on 2008-01-29T15:31:54
yeah, I do mean that ancient module. and I know there are many goodies available if I was to catch up;)
In SpamAssassin, we have a project philosophy of attempting to keep our CPAN dependencies at a minimum, and we support perl 5.6.1 as a platform. even Test::More didn't arrive until 5.6.2, unfortunately...
I should really use Test::More for my other modules, though.
'Of course, that changes your diagnostic output and breaks what you've described.'
really? are you saying the newer test modules remove this (very useful) behaviour? that seems a pity!Re: Test::More and backcompat
markjugg on 2008-01-29T18:55:42
In SpamAssassin, we have a project philosophy of attempting to keep our CPAN dependencies at a minimum, and we support perl 5.6.1 as a platform. even Test::More didn't arrive until 5.6.2, unfortunately...Assuming you have a compatible license, you can do what the darcs project does: Ship a private copy of Test::More with the test suite suite. This eliminates problems between your test suite and different versions of Test::More as well.
Re:Adding to the output though...
Ovid on 2008-01-30T09:47:32
You know, you might want to chat with Andy Armstrong. He's been toying with the idea of creating a reversible debugger for Perl. As I understand it, it's similar in concept to how video works: occasionally save the state (a "key frame") and save then a bunch of diffs until you need to save the state again. With that, you could just note the "time" of the test failure and persist its state. You could then reverse to a previously known state, change parameters, run things forward again, etc.