Red Flags talk

xsawyerx on 2009-03-02T11:44:01

Yesterday I gave a talk on Red flags in programming, inspired by the excellent talk (lecture?) Mark Jason Dominus did (which I think I mentioned before) - which is available on Google Videos and a few other articles he wrote on the subject. Basically I took his lecture and tried to add some stuff and order it a bit. Definitely wasn't as good as the original, but was suitable for 10 coworkers.

A friend asked me for the talk notes, so here they are:

Red Flags in Programming

Intro: Red flags are signs that you should check something again. Some things don't necessarily need to be changed, but they should be examined and verified that this is the correct way to do it. Try and catch aren't necessarily evil, they're used for a specific something, but when you see one, you should ask yourself if that's really where you need it. Sometimes the "wrong" solution is the exact solution.

Repeated Code:

Intro: Coders think that their product is code, but it's not. It's not even programs. Our code (and compiled code - programs) are just the means to an end. When taking the company's interest in hiring us, they hire us to write the code to do the bank wire. If they could do it without a program with no problems, they would fire us, and do it. That's the main point here. We aren't here to provide code, but functionality that we obtain through code. So, we count SLOCs (Source Lines of Code), we count the number of lines each person put into a code repository, we count it here and there, but that's not really important. It's not about amount of code that we write, it's the quality.

Problems:

  • We recognize patterns, and so when there's duplication, our minds try to automatically filter it out.
  • Twice (or N fold actually) as long to read, to understand, to maintain and to test
  • Does not make it go any faster
  • Updating multiple places with the same code is boring and makes us distracted and we lose focus and make mistakes
  • Twice (or N fold) as many lines to have bugs in

Solutions: The trick is to understand how to abstract your repeated code. That could be a class (package), an abstract class (role), a function, a loop, a switch or dispatch table.

Reinventing the Wheel

Example: Date calculations.

Problems:

  • You probably won't do it better than whoever writers a complete module/library for it. You might, but probably won't.
  • Repeated code, really...
  • Development time and maintenance time grows because you have to take care of that chunk of code as well.

Exceptions: Brad Fitzpatrick (of The Schwartz, LiveJournal, MogileFS, Perlbal and more fame) says that you reinvent wheels when the wheels you have suck. Sometimes the design of something is so substantially different from what you need or does not provide the functionality you need by far or you wouldn't be able to use it that you might have to reimplement something. A good example would be git.

Solutions: use modules, libraries, try to remiplement only a little. Write patches for others to add the functionality you need.

No Default in Switch

Example:

   if ( is_alpha  ($string) ) {}
elsif ( is_numeric($string) ) {}

Whenever you write an if() {} elsif() {} and you do not provide an else {} you should stop and ask yourself if that's correct. Most times you would need an else {} at the end, like a default clause in a switch statement.

Problems:

  • Unexpected behavior, which can lead to...
  • Possible crashes in the program
Long Switches

It should be easier to read a dispatch table than a long if() elsif() elsif() else. At least that's how it is for me.

Try and Catch (for workflow)

Some people like to write with try and catch. There's a lot of argument about that. This is what I think, and you're welcome to disagree. Basically try and catch should be used (IMHO) whenever you cannot control what will happen and when basically ANYTHING CAN HAPPEN. Specifically the situation expressed here is when people use try and catch in the code as workflow, which is... horrible.

Problems:

  • Have to travel up the stack to check where it bailed
  • Harder to maintain
  • Really unclean code
  • That's why we have conditionals and loops for

Solutions: should be used when running an action that might crash or cause some unrecoverable error and completely unexpected behavior

String Booleans

Example:

if ( $bool eq 'true' ) {
  DO SOMETHING
} elsif ( $bool eq 'false' ) {
  DO SOMETHING ELSE
}

Problems:

  • Most programming languages consider "false" (as a string) to be TRUE. Confusing...
  • Unnecessary value check. Easier and more idiomatic to write if($bool) {}

Exceptions: whenever you work on input you can't control. For example, openssh sometimes uses "yes" and "no" and checks specifically against them.

Solutions: use if($bool) for TRUE, use 0 (as a number), zero-length strings and undefined (undef) for FALSE.

Using External Binaries

Example: system 'ls', my @files = `ls`.

Problems:

  • Incompatible
  • Not portable
  • Unexpected results (`ps` for example is different on BSD and has different switches)
  • Insecure!

Exceptions: sometimes you get to work with closed binaries, or cannot do it in a different way.

Solutions: shared libraries, bindings, APIs, modules that control binaries more easily.

Creating Intermediate Programs

Creating intermediate programs is when - in the code - you write code into a file and then execute that file.

Problems:

  • Insecure: you don't always know your environment and it would take a lot of code to check everything to prevent a man in the middle attack on your environment
  • Hard to maintain in unrelated code

Solutions: if it's in the same language, write it in a subroutine and execute it. If it's in a different language, try an Inline:: module. If you can't do either, put it in a different file and just execute that file. Still relatively insecure, but much better. You can also create a template and create the script from that

Empty if() Clauses (DWIM/SWYD)

Example:

if ( $something ) {
} else {
   CODE
}

Problems: This is just an empty if() for the sake of the else{}. Just.. horrible.

Solutions: SWYD (Say What You Do) - meaning if you want to do something in case something is not true, you do if (!$something) {}. You can do unless ($something) {} as well, but I wouldn't.

Array Counters

Example:

foreach my $i ( 0 .. $n ) {
  $array[$i] = "something;
  $array_count++;
}

Solutions: Some older languages don't have a simple way to check the number of elements in arrays. Very high level languages don't have that problem. print $#array + 1, print scalar @array.

Compound Data Structures

Example:

my ( $self, $age1, $age2, $age3, $age4 ) = @_;

Solution: use compound data structures. That is, data types that are supposed to help with this. Basically, use the correct data types that are available to you.

my ( $self, @ages ) = @_;
C Style for() Loops

Example: for (my $i = 0; $i <= $max; $i++ ) {}

Problems: this isn't really a problem, but we have much clearer loops to use instead of it. Easier to read and understand and need less code written.

Solution: foreach ( 0 .. $max ) {}



DRY, RINSE, REPEAT.

Some concepts to help out

  • DRY: Don't Repeat Yourself! - don't duplicate code, abstract it!
  • KISS: Keep It Simple, Stupid! - don't write overtly complex things. Clean design yields clean code, and more understandable and maintainable one. And yes, I do know some things are complex.
  • YAGNI: You Aren't Gonna Need It! - this is a relatively new one (at least to me) and it basically says you should start from what you need and work your way up from there. You shouldn't try to write all the parts you aren't gonna need yet, because you might have to redesign them again, reimplement them, or worse - just end up not using them
hope that's useful for anyone


One tiny nit

Ovid on 2009-03-04T07:43:36

Nice writeup. However, you have a tiny log error in your for loops. They're not equivalent.

$ perl -le '$m = 3;for ($i=0; $i<$m;$i++) {print $i}'
0
1
2
$ perl -le '$m = 3;for (0..$m) {print}'
0
1
2
3

Change that first $i < $max to $i <= $max and you'll be fine.

Re:One tiny nit

xsawyerx on 2009-03-04T09:08:13

Whoops...

Thanks, and thanks!

Empty if clauses can serve to document

Ed Avis on 2009-03-04T10:46:51

An empty if-clause with a comment can serve to document your thought processes.

if (is_mungo_sorted @items) {
    # The list is already in the order we need.
}
else {
    # Need to sort them in bogological order.
    @items = rearrange @items;
}

There's often a more elegant way to express things but an explicit do-nothing is sometimes the way I think about the problem, and it's good to have a way to comment that.

BTW, why doesn't the comment box understand <pre>?

Re:Empty if clauses can serve to document

xsawyerx on 2009-03-04T10:57:21

That's true. Shlomi Fish also expressed this point.

I personally try not to write things like that because I get irritated seeing an empty if(). It feels like there should be something there and I (or whoever wrote it) has forgotten or hasn't implemented it. So, to me it adds to the confusion but I can see how other people find it useful and expressive. Perhaps there should be a comment that says # already ordered. do nothing.

And yeah, I tried using h2, h3 and other stuff. These boxes don't get much...

One exception to creating intermediate programs...

DiamondInTheRough on 2009-03-13T18:49:40

In Perl::Dist(::WiX), the goal is to create and package a new installation of perl, with added modules as required. (upgrading core modules, providing SQLite for CPAN use, etc.)

The last part right now is implemented by writing out scripts from the old perl that are executed in the NEW perl to install modules there, and to query which modules require upgraded.

What would be your solution to that? Note that we are using the current version of perl (I use 5.10.0) to create 5.8.8, 5.8.9, and 5.10.0 installations.

Using a template, obviously

xsawyerx on 2009-03-17T06:41:37

The problem with writing intermediate programs by wrapping them in a q(), qq() or endless concatenations is that it's unclear and hard to maintain. There are a few options that help make it easier and more understandable and thus more maintainable but you can't always use them:

Using a template such as Text::Template or Template Toolkit is the best option IMHO. That is, if you're able to read files from your environment. Considering you're writing to it, you can probably read.

Manually creating a template and putting it in some scalar or hash with indication of where to change and then in the code just search and replace on that template. Example:

$templates{'startup_check'} = q(
    BEGIN {
        print "hello ##WHAT##";
    }
);
# ... later in the program
search_replace( $templates{'startup_check'}, '##WHAT##', 'world' );

You can put it in the __DATA__ section in the file. A bit trickier but will definitely be easier to maintain.

When all else fails, at least you know you noticed it and tried to do something about it. There's a lot of times where red flags don't necessarily mean "I've made an error" but instead mean "this isn't the best practice but it's the correct one in this case".

Thanks for the comment and sorry for the long delay in reply.