The computer that cried "Wolf!"

petdance on 2003-09-09T14:43:00

Signal-to-noise ratios are awful in the world around us. Consider the following:



  • A car alarm goes off in a parking lot. You don't look to see where it's coming from, much less call the police.

  • Your office has an overhead paging system. A couple of times an hour, a voice breaks in: "Bob Smith, dial 847. Bob Smith, dial 847." Sometimes, if it's actually important, the voice begins, "May I have your attention, please."

  • You call a customer support line. "Please listen carefully, as our options have changed." Since when? Since the last time you called six months ago and it had the exact same message?



In each case, something has tried to get our attention, but we've learned through repetition that we don't need to pay any mind to it. It's become mental clutter.



Now consider some computer-side cases where we've let the same thing happen in how we do our work:



  • You're grepping through log files and need to dump results to a temporary file, like ~/foo. But before you execute the command, you notice there's already a file called foo. It's got some list of filenames in it. What are they? Are they important? OK, you'll dump to ~/foo2, but there's foos 2 through 7 already. Well, time to make foo8.

  • You're doing an emergency rebuild of a server, and you need the CD-ROMs for some RPMs. You open the desk drawer to find at least a dozen discs, three of them labeled "RH 7". Which is the one you want? Does it matter?

  • A process crashes, and tells you that it's dumped a core file in /tmp. There are 100+ files in /tmp.

  • You've been getting joe-jobbed with fake spam bounces recently. You set up a rule to throw away anything that comes from MAILER-DAEMON@.

  • You've set up an hourly smokebot to run automated tests on your project and report failures. One problem is a pain to deal with, so you ignore it for now. Each morning, you delete the dozen failures from the night before, not knowing that another bug has been introduced, too, since you're ignoring the reports.




Our lives are filled with helpful warnings, and we ignore many of them. Worse, our lives are filled with clutter that we let accumulate so that we don't notice when something is going wrong.



So what to do?



Deal with every problem. Don't brush the problem aside by making a mental note to deal with it later. That method doesn't scale. In fact, NOTHING that relies on a single person scales. Make the computer do its work.



If you have to deal with it later, then put it somewhere, like a ticket in your RT ticketing system. The point is to reset the automatic idiot light so that you never think "Oh, I know about that problem." If your computer is telling you something is wrong, then it better mean something.
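One way to picture "resetting the idiot light" is a report filter that stays quiet about failures that already have a ticket and speaks up only about the rest. This is a minimal sketch, not anything from a real smokebot or from RT: the function name, the test names, and the ticket mapping are all made up for illustration.

```python
def untracked_failures(failures, ticketed):
    """Return the failures that nobody has filed a ticket for yet.

    failures: iterable of failing test names from the latest run.
    ticketed: mapping (or set) of test names that already have a ticket.
    Both are hypothetical inputs; a real setup would pull them from the
    test runner and the ticketing system.
    """
    known = set(ticketed)
    # Keep the original order so the report reads like the test run.
    return [name for name in failures if name not in known]


if __name__ == "__main__":
    last_night = ["t/parse.t", "t/render.t"]      # made-up failing tests
    ticketed = {"t/parse.t": "ticket on file"}    # made-up ticket record
    for name in untracked_failures(last_night, ticketed):
        print("NEW FAILURE:", name)
```

The point of the design is that anything this prints is something nobody has written down yet, so it can't be waved off with "Oh, I know about that problem."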



The job of the computer is to do the repetitive, mindless work. Your job is to think. If you have to waste brain cycles on whether a given warning is actually a problem, then you're not using your computer to its fullest.



The job of de-cluttering falls on you, and only you. If you don't clean up your crap, who will? Do you expect the crap-cleaning fairy to come and take care of it? If you want to set up your own automatic crap-cleaning fairy that doesn't gloss over problems, that's great, but set up something.
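As a rough idea of what a small crap-cleaning fairy might look like, here is a sketch that lists files nobody has touched in a while, so the foo2-through-foo7 pile gets noticed. It only reports and deletes nothing, which is my assumption about what "doesn't gloss over problems" should mean here. The function name and the age threshold are hypothetical.

```python
import os
import time


def stale_files(directory, max_age_days, now=None):
    """Return sorted paths of regular files not modified in max_age_days.

    Only looks at the top level of `directory` and never deletes anything;
    deciding what to do with the list is left to a human.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    stale = []
    for entry in os.scandir(directory):
        # Skip directories and symlinks; check the file's own mtime.
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            stale.append(entry.path)
    return sorted(stale)
```

Run from cron with something like `stale_files(os.path.expanduser("~"), 30)`, it turns a vague sense of clutter into a short list you can actually deal with.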



If your system is constantly crying "Wolf!", then it's doing you a disservice. Whip it into shape and make sure you only get alerted to problems that are real.

(This article is also at my oreillynet blog)


Bogus error logs

Ovid on 2003-09-09T15:57:50

Wonderful stuff. Thanks Andy. I once had a large system where tons of warnings were being written to the error logs. Some were about uninitialized values, others were stupid debugging messages that hadn't been removed, and still others were in a similarly silly vein. I knew what all of those messages were, but I ignored them. In the next rollout of the software, my goal was to eliminate all of those warnings. If it appeared in the log file, I wanted it to be an actual bug. I was very happy with the results. It worked wonders for me.

Re:Bogus error logs

chromatic on 2003-09-09T17:20:23

Hmm, I seem to recall eliminating lots of those warnings, unless you're talking about the OTHER really noisy system.

Re:Bogus error logs

Ovid on 2003-09-09T17:53:02

I was referring to the ICAP project. I know you did some work on that, but I suspect you're referring to some of the e-commerce work? I don't know.

Thanks

jordan on 2003-09-09T18:01:15

I think this is my biggest failing. I manage systems that generate lots of noise and I ignore it.

I know this is a huge problem. Thanks for the reminder. I won't treat it as noise.

The strict disciplinarian approach to build noise

dws on 2003-09-09T18:22:44

On a current project (in J2EE land), I won't let a build succeed (to the point of generating an installable artifact) if any of the unit tests fail. There was some minor grumbling about this originally, but now nobody gives it a second thought.
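The same gate can be sketched in a few lines. This is not the J2EE build described above, just an illustration in Python using the standard `unittest` module; `build_if_green` and the `package` callback are hypothetical names standing in for whatever step produces the installable artifact.

```python
import io
import unittest


def build_if_green(suite, package):
    """Run `suite` and call `package()` only when every test passed.

    `package` is a hypothetical callable that produces the artifact.
    Raises RuntimeError instead of packaging if anything failed.
    """
    # Capture the runner's output so the sketch stays quiet; a real build
    # would more likely let it go to the console or a log.
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    result = runner.run(suite)
    if not result.wasSuccessful():
        problems = len(result.failures) + len(result.errors)
        raise RuntimeError(f"build refused: {problems} failing test(s)")
    return package()
```

Because the artifact step is never reached on a red run, there is no "known failure" state for anyone to get used to.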