I posted to the mod_perl list at the start of the week that our failover system had never been tested in anger, so of course one of our servers died today (should have known it would!).
We use wackamole on the front end (HTML, images and proxying to the apps), so if one front end machine goes down the other automatically takes on its IP address. We have two back end machines, one for the DB and one for apps. All the front end machines proxy to the apps box, which then connects to the DB machine.
Anyway, it was our apps server (a Sun box, up for over a year) that died. I'll take a trip to the hosting company next week to see if I can fix it. So: install all our modules (already on the box because of rsync) on the DB machine (also a Sun), restart mod_perl, alter the front end machines' /etc/hosts files so that mp.XXX.com now points to the DB machine, and voilà, it all works.
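The /etc/hosts step could be scripted along these lines. This is only a minimal Python sketch, not what was actually run on the day: the function name `repoint_host` and the 192.0.2.x addresses are made up for illustration, and mp.XXX.com is left elided as above. It only rewrites text, so getting the result onto each front end machine (and keeping a backup of the old file) is left out.

```python
def repoint_host(hosts_text, hostname, new_ip):
    """Return hosts-file text with `hostname` mapped to `new_ip`.

    Any other names that shared the old line stay on the old address.
    Comments on a rewritten line are dropped; all other lines pass
    through unchanged.
    """
    kept = []
    for line in hosts_text.splitlines():
        # Ignore anything after '#' when looking for the hostname.
        body, _, _comment = line.partition("#")
        fields = body.split()
        if len(fields) >= 2 and hostname in fields[1:]:
            # Keep the remaining aliases pointing at the old address.
            others = [name for name in fields[1:] if name != hostname]
            if others:
                kept.append(fields[0] + "\t" + " ".join(others))
            continue
        kept.append(line)
    # Add the new mapping at the end of the file.
    kept.append(f"{new_ip}\t{hostname}")
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical addresses: .10 is the dead apps box, .20 the DB machine.
    before = "127.0.0.1\tlocalhost\n192.0.2.10\tmp.XXX.com\n"
    print(repoint_host(before, "mp.XXX.com", "192.0.2.20"), end="")
```

Because the proxy config on the front ends refers to the name rather than an address, moving the name is the only change they need.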
You get a really good feeling when the shit hits the fan and it only takes ten mins to clean it off!
I'm considering just swapping out the Sun box, as it doesn't have a support contract and we don't have a test box (everything else is Debian on Compaq).
Oh, one last note: we've had 5 Compaq servers over the last year, three of which (after about a year's continuous use) have died and needed a new power supply. On the plus side, Compaq have always had an engineer there within 4 hours, as their support contract dictates (though we've never _needed_ them there in 4 hours).