No More Crashes

vek on 2002-04-24T15:41:46

So the two Sun engineers came out yesterday to watch the Oracle installer. Here's the weird thing. Even though almost every possible piece of hardware had been replaced in that box over the course of the last two weeks, the Sun 'experts' decided to replace everything again, only this time all at once.

Seemed strange, but they obviously knew what they were doing, as the Oracle installer ran like a charm - no more crashes. I heard a really loud 'Yeeees' from our DBA's cubie - there were grins all round :-)

Hardware problems

ziggy on 2002-04-24T16:16:50

Sounds like the Voodoo controller(tm).

Years ago, some friends of mine had a couple of PDP-11s (mostly PDP-11/34s, but there were enough assorted parts to run a PDP-11/44 or two). The main RK07 disk controller worked fine, but it would flake out every few weeks.

How to fix it? Replace the mostly-working RK07 controller with the RK07 Voodoo Controller(tm). Boot once, and watch the Voodoo Controller fail. Replace the Voodoo Controller with the mostly-working controller and reboot. Then stand back and watch the RK07 work for a few weeks or months without any errors whatsoever.

No one really understood what was going on here, just that the process Just Worked(tm). Since these zombie machines were resurrected from the dump heap, DEC engineers, replacement parts and service contracts weren't an issue -- they effectively didn't exist. :-)

probably a marginal CPU

hfb on 2002-04-25T15:39:47

I had a SPARC 20 watchdog every night for a month on a particular SAS job... finally replaced both CPUs, and that did the trick. Next time, ask them for a t-patch for the kernel to trap the exceptions, or just have them replace the CPUs up front: this sort of problem is rare, but not so rare that it's worth the time spent hunting it down. Also, if you have multiple CPUs, you could either bind the installer to one CPU or use psradm to pull them offline one at a time to see if you can get one to barf... Of the 3 times I've had this happen, though, watching them while stressing them didn't work :) Like watching a pot...
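
For what it's worth, the "one CPU at a time" idea could be sketched roughly like this. It only builds the command lists (a dry run) and never executes anything. The CPU ids and the `./runInstaller` job name are made up for illustration; `psradm -f` / `psradm -n` (take a processor offline / bring it back online) and `pbind -b` (bind a process to a processor) are the standard Solaris commands, but check the man pages on your release before trusting any of this.

```python
def offline_rotation(cpu_ids, stress_cmd):
    """Build a dry-run plan: for each CPU, take it offline, rerun the
    failing job, then bring the CPU back online."""
    plan = []
    for cpu in cpu_ids:
        plan.append(["psradm", "-f", str(cpu)])  # take this processor offline
        plan.append(list(stress_cmd))            # rerun the job without that CPU
        plan.append(["psradm", "-n", str(cpu)])  # bring the processor back online
    return plan


def bind_command(pid, cpu):
    """Build the command that binds an already-running process to one CPU."""
    return ["pbind", "-b", str(cpu), str(pid)]


if __name__ == "__main__":
    # Hypothetical CPU ids and job name; on a real box, psrinfo lists the ids.
    for step in offline_rotation([0, 1], ["./runInstaller"]):
        print(" ".join(step))
    print(" ".join(bind_command(1234, 0)))
```

If the job survives with one particular CPU offline and dies with it online, that CPU is the prime suspect. As noted above, though, these faults often refuse to show up while you're watching.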