So I'm testing some prototype code at $WORK to reimplement sections of an existing dynamic site. There are a lot of pages, and there's no way I could accurately test the prototype without some automatically generated test suite.
So I wrote a couple of scripts to use mech to examine a baseline web server, and generate test scripts. Currently, I have upwards of 2000 scripts generated, making ~250K tests on the baseline and test sites. (Lots of URLs to test, plus meaningful variations in GET arguments, plus variations in cookie authentication...not the kind of test suite you want to code by hand.)
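For concreteness, here's roughly the shape of what I mean, assuming mech is WWW::Mechanize (that's what the $mech->success() below comes from); the hostnames, paths, and file names are made up for illustration, and the real scripts also permute GET arguments and cookie authentication. Fetch each URL from the baseline server, remember a digest of the response, and emit a .t file that replays the request against the server under test:

#!/usr/bin/perl
# Rough sketch of a mech-based test generator; hostnames, paths, and
# output file names are hypothetical. Fetch each baseline URL, record a
# digest of the response, and emit a .t file that replays the request
# against the server under test.
use strict;
use warnings;
use WWW::Mechanize;
use Digest::MD5 qw(md5_hex);

my $baseline = 'http://baseline.example.com';          # assumed baseline host
my $target   = 'http://prototype.example.com';         # assumed test host
my @paths    = ( '/index.cgi', '/report.cgi?id=42' );  # hypothetical URLs

my $mech = WWW::Mechanize->new( autocheck => 0 );

my $n = 0;
for my $path (@paths) {
    $mech->get( $baseline . $path );
    my $digest = md5_hex( $mech->content );

    my $file = sprintf 't/gen_%04d.t', ++$n;   # assumes a t/ directory exists
    open my $fh, '>', $file or die "can't write $file: $!";
    print {$fh} <<"END_T";
use strict;
use warnings;
use Test::More tests => 2;
use WWW::Mechanize;
use Digest::MD5 qw(md5_hex);

my \$mech = WWW::Mechanize->new( autocheck => 0 );
\$mech->get('$target$path');
ok( \$mech->success, 'fetched $path' );
is( md5_hex( \$mech->content ), '$digest', 'content matches baseline for $path' );
END_T
    close $fh;
}

Each generated file stands on its own, so the harness can chew through the whole pile in one run.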
I started out with some test generators, following the principle of DTSTTCPW. There were some bugs, but nothing that was causing me grief. The code isn't pretty, but it works. And it's reasonably well factored, so it's easy to add new kinds of tests to a test script. So I ignored the bugs for the moment and focused on writing my prototype -- adding features and testing bug-for-bug conformance.
Here's the problem. One nasty bug had to do with handling a request that does not return a page. Testing $mech->success() was one of my tests. But really, failing that test should cause the entire test script to fail prematurely. Also, when the test generator requests a page and does not get a successful response, it should note that and decline to produce a test, instead of producing a test script that expects the same failure message to be returned every time.
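To make that concrete, here's a sketch of what a generated .t file should look like with that behavior; the hostname, path, and expected digest are placeholders. The point is the "or die": if the fetch fails outright, the script bails immediately and the harness reports one failed script instead of hundreds of misleading comparison failures.

#!/usr/bin/perl
# Sketch of a generated .t file with the early-abort behavior described
# above. Hostname, path, and the expected digest are placeholders.
use strict;
use warnings;
use Test::More tests => 3;
use WWW::Mechanize;
use Digest::MD5 qw(md5_hex);

my $mech = WWW::Mechanize->new( autocheck => 0 );
$mech->get('http://prototype.example.com/report.cgi?id=42');

# If we didn't even get a page back, fail and stop this script right here;
# the remaining comparisons would be meaningless noise.
ok( $mech->success, 'fetched /report.cgi?id=42' )
    or die 'got HTTP status ' . $mech->status . ', aborting this script';

is( $mech->status, 200, 'status is 200' );
is( md5_hex( $mech->content ), 'PLACEHOLDER_BASELINE_DIGEST',
    'content matches baseline' );

The matching generator-side fix is just as small: check $mech->success after fetching the baseline URL, and warn and skip instead of emitting a .t file when it fails.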
Today, this bit me but good. In the process of rebuilding 675 tests, there were 10 tests that caused the web server to crash, on both the baseline and test servers. My tests weren't giving me good feedback, just catastrophic failures. One test would crash the server, and the subsequent tests would have hundreds or thousands of failures. The test generator also generated a whole mess of bogus tests, because it wasn't smart enough to notice that it didn't get a page back.
So these were two glaring bugs in my test generator. Two bugs I knew I would have to fix eventually. Two bugs that were not on the critical path. And having these two bugs lurking in my code meant that I wasted a whole afternoon figuring out why my servers were crashing and isolating the problematic URLs, which I could only do after fixing these bugs anyway.
Here's my question for the peanut gallery: should I have fixed these bugs as soon as I noted them (even though there was no critical reason to fix them)? Or should I have fixed these broken windows when they started causing problems (even though the time to fix would be much greater)?
While I would argue that you should fix the tests right away, sometimes one winds up in a situation where fixing a test might not be the best choice. Perhaps it's unclear what the appropriate fix is. Perhaps the code needs to go in right now and the fix can't get done in time. While I have not actually faced such a situation, perhaps the appropriate response is to temporarily disable those tests to ensure that, at the very least, you are not conditioned to ignore test failures? It's not a pleasant solution, but it's better than ignoring broken tests (though it's kind of like taping cardboard over the broken window, so I'm hesitant to even mention it).
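In Test::More terms, that kind of cardboard-over-the-window might look like a TODO block (the URL below is made up): the failure still shows up in the test report, it just doesn't sink the run, and it gets flagged as an unexpected pass once someone actually fixes the underlying bug.

use strict;
use warnings;
use Test::More tests => 1;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 0 );

# Park the known-broken check as TODO: it still runs and is reported,
# but a failure here doesn't count against the suite.
our $TODO;
TODO: {
    local $TODO = 'this request is known to crash the server; fix pending';
    $mech->get('http://prototype.example.com/crashy.cgi');   # hypothetical URL
    ok( $mech->success, 'fetched /crashy.cgi' );
}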
Re:A middle ground?
lachoy on 2004-11-24T00:20:03
There's nothing wrong with cardboard over a broken window if it's freezing out and the glass repairman can't fashion a new pane for you until next week. Ignoring the tests is more like closing the door to the room with the broken window and not caring if it gets cold in there. (I realize I inverted the metaphor here by talking about the inside of the house vs. the outside, but I couldn't resist.)
Re:A middle ground?
ziggy on 2004-11-24T03:23:01
There are a couple of issues here. You're talking about ignoring a test that's currently known to fail. That's pretty clear, and I'm doing that (now that I know which tests cause the server to fall down). The real problem I want to solve is what you do when you have a known bug that needs to be fixed eventually. Specifically, the bugs in my test generator (and the bugs in the tests it produces) that cause lots of grief if/when they get exercised (like they did today, when I started to ping URLs that would crash the server, yet got no feedback that retrieving those URLs was failing).
Depends. Will you keep using the test generator? How much? How pressing is it to work out immediate issues instead of doing some useful yak shaving? How much more time would fixing these bugs and regenerating the tests take as opposed to fixing the tests themselves?
I would say that unless fixing the tests is much easier and time is pressing, you should opt to fix the generator. Remember: makeshifts last the longest. It is not a wise move to make a small upfront investment that carries a high interest rate when a larger lump sum up front would result in lower interest, unless you cannot afford the lump sum.
Re:
ziggy on 2004-11-24T14:32:27
Yes.

"Depends. Will you keep using the test generator?"

Constantly. The tests model a dynamic website, where the content changes over time. The tests focus on rendering the dynamic content, not the content itself (which I cannot control). So the test generator is a requirement in that respect. Periodically, the tests need to be re-generated, just to factor out meaningless changes in content. Occasionally, the site needs to be crawled to find tests that need to be generated.

"How much?"

That's not the issue. There are 2200+ auto-generated .t files that exhibit the same problem.

"How much more time would fixing these bugs and regenerating the tests take as opposed to fixing the tests themselves?"

The only reasonable option here is to fix the generator and regenerate them, especially since the test generator is a going concern. The real question is, when to fix the test generator? When you first find the bugs (and it's not a pressing issue), or when the bugs manifest themselves (and the symptoms are masked by secondary and tertiary effects, forcing you to waste an afternoon sleuthing out the true problem)?
Re:
Aristotle on 2004-11-24T15:01:03
Ah, I misunderstood. Well, in that case, I have a hard time coming up with a scenario in which deferring the fix of a central part of the program's operation is justified. Otherwise, how can you trust its output?
At best, if the bug seems minor but it's easy to notify the user when it's being exercised (preferably by marking the output, in your case, I guess), I might opt to do it that way. But even that would only be during development, as long as getting other parts of the code working has higher priority.
Fix sooner rather than later
cbrandtbuffalo on 2004-11-24T17:02:52
I'd agree that it would be best to fix them when you become aware of them. In my experience, I've become so dependent on tests during development that small bugs in the testing components can really mess me up. I've had a similar experience where I knew something was broken and convinced myself I knew exactly how it would manifest itself. But then I burned myself with an unforeseen side effect of the known bug/failure and spent a bunch of time tracking something down based on shady test output.
So I think the lesson I've learned is to fix the testing bugs as soon as possible, if it is at all possible to do so in the context of deadlines. I've even argued for more time to fix testing configurations, knowing that more change requests would come after the current code was put into production.