Walking into a new codebase: tips

scrottie on 2010-06-02T07:14:31

Find out where the cache is. Rig tests or a test script to delete it before each run.
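For example, a little setup helper along these lines keeps stale cache entries from masking or causing failures. The cache location and environment variable here are hypothetical; find out where yours actually lives:

    # t/lib/TestSetup.pm (hypothetical) -- call nuke_cache() before each run
    use File::Path qw(remove_tree);

    sub nuke_cache {
        my $cache_dir = $ENV{MYAPP_CACHE_DIR} || '/tmp/myapp-cache';  # assumed location
        remove_tree($cache_dir, { keep_root => 1, error => \my $errors });
        warn "problem clearing $cache_dir\n" if @$errors;
    }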

Make a file with the commands necessary to reinitialize the database to a known good state. Make that an optional step in running the tests.
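A sketch of what that optional step might look like, assuming a MySQL test database and a SQL dump; the database name, user, and file are placeholders for whatever you actually have:

    # near the top of the test, or in a shared test library
    if ($ENV{REINIT_TEST_DB}) {
        system("mysql -u testuser testdb < sql/known-good-state.sql") == 0
            or die "couldn't reinitialize the test database: $?";
    }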

Use the test suite as docs to find out which APIs are important and how they're used.

Use the debugger to figure out which cross section of the code runs for a given task. Step through with a mixture of 'n' (next statement) and 's' (single step including stepping into function/method calls). As you trace through execution, you'll learn which utility subroutines can safely be taken for granted and skipped over. Note which tables data gets stuffed into and pulled from.
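Start it with perl -d t/checkout.t (the test file and package names here are made up). Beyond 'n' and 's', a few more debugger commands earn their keep:

    r                      run until the current sub returns
    x $some_structure      dump a data structure you're curious about
    b MyApp::Order::save   set a breakpoint on a sub of interest
    c                      continue until the next breakpoint
    T                      print a stack trace of how you got here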

Strategically pop over to another window and inspect database tables while in the middle of single stepping.

Write SQL queries to inspect database state. Even if you like DBIx::Class, it's handy to be able to simply write "select state, count(*) from foo group by state order by state desc" and other things.

If tests don't already inspect the tables for the correct state left behind, add tests that do. The utility queries will start life in a notebook or scratch file, get refined, then maybe wind up in a stub .pl, but don't stop there. Add them to the tests. Yes, tests should only test function, not implementation, but, in one sense, the API is probably just a database diddler with side effects, and its correct operation could be specified as not mucking up the database.
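Promoted to a test, one of those scratch queries might end up looking like this; the table, column, and database file names are invented for the example:

    use Test::More;
    use DBI;

    my $dbh = DBI->connect('dbi:SQLite:dbname=t/test.db', '', '', { RaiseError => 1 });

    my ($stuck) = $dbh->selectrow_array(
        q{select count(*) from orders where state = 'processing'}
    );
    is($stuck, 0, 'no orders left stuck in processing after the run');

    done_testing();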

Get the code running on your local machine -- that should go without saying. Mock services, APIs, commands, and whatever else is necessary to get the core code to run. Mock stuff until the code passes tests again, and only then start modifying it. From one project, I have a mock implementation of a Vegas slot machine. My cow-orker and I referred to it affectionately as "ASCII Slots". It did the handshake, spoke XML over SSL, had a credit meter, tilted on protocol violations, the whole nine yards. Furthermore, it could be told to abuse the client with a string of simulated network errors, covering every scenario in which the connection goes away after a packet is sent but before it is received, packet acknowledgments included.
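You don't need anything as elaborate as ASCII Slots to get started; a quick stub with Test::MockModule is often enough to get the core code running under the tests again. The gateway class and return values here are invented:

    use Test::MockModule;

    # Pretend the remote payment service always succeeds so the rest of
    # the code path can be exercised locally.
    my $gateway = Test::MockModule->new('MyApp::PaymentGateway');
    $gateway->mock(charge => sub {
        my ($self, %args) = @_;
        return { ok => 1, txn_id => 'FAKE-123', amount => $args{amount} };
    });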

Before you start work, run the test harness piped into a file. After work, pipe it into a different file and diff it, or add the first one to git and let git show you what tests have started passing/failing when you do 'git diff'.
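Concretely, assuming prove as the harness, something like:

    prove -v t/ > test-output-before.txt 2>&1
    # ... make your changes ...
    prove -v t/ > test-output-after.txt 2>&1
    diff -u test-output-before.txt test-output-after.txt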

Comment the source code you're working on with questions, speculation, and so on. This will help you find the stuff you were looking at by way of 'git diff'. You can always check out HEAD on that file to get rid of them, or else just delete the comments, but you may find that the comments you write to yourself as notes while you're exploring the code have lasting value.

As with saving test harness output, save full program traces created with perl -d:Trace t/Whatever.t. Trace again later and diff the two if you find that an innocent-seeming change causes later tests to break. This can dig up the turning point where a changed value causes a while/if to take a different route.
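In command form, capturing both output streams so the trace and the test output end up in the same file:

    perl -d:Trace t/Whatever.t > trace-before.txt 2>&1
    # ... make the suspect change ...
    perl -d:Trace t/Whatever.t > trace-after.txt 2>&1
    diff -u trace-before.txt trace-after.txt | less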

If execution takes a route that it shouldn't have and meanders for a while before actually blowing up, add a sanity check earlier on.
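Something as simple as a confess() guard near where the bad data first enters does the trick; the field names here are hypothetical:

    use Carp qw(confess);

    # Fail loudly at the point things first go wrong, not three subsystems later.
    confess("order $order->{id} has no line items; refusing to invoice it")
        unless @{ $order->{line_items} || [] };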

-scott