Buggy locking code.

Abigail on 2006-07-04T19:18:14

I was using the following code to prevent two instances of a program from running at the same time:

  use Fcntl qw [:flock :DEFAULT];
  my $lock_file = "...";
 
  sysopen my $fh => $lock_file, O_RDWR | O_CREAT
      or die "sysopen: $!";
  exit unless flock $fh => LOCK_EX | LOCK_NB;
 
  ... Code ...
 
  END {! -f $lock_file || unlink $lock_file || warn "unlink: $!"}

It turned out to be incorrect. Can you spot the mistake?


Easy

Aristotle on 2006-07-04T20:02:10

Operator precedence.

Re:Easy

Abigail on 2006-07-04T20:11:11

No, operator precedence isn't the problem. The problem has nothing to do with syntax.

Re:Easy

Aristotle on 2006-07-04T22:35:52

Then I guess I just found a bonus bug for you.

$ perl -MO=Deparse,-p -e'! -f $lock_file || unlink $lock_file || warn "unlink: $!"'
((not -f($lock_file)) or unlink(($lock_file || warn("unlink: $!"))));
-e syntax OK

Admittedly, it’s not a fatal issue.
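
For completeness, one way to get the grouping that END block presumably intended is to parenthesize unlink's argument, so the low-precedence list operator doesn't swallow the rest of the expression. This is only a sketch of the precedence fix; it does nothing about the locking bug the puzzle is actually about:

  END {! -f $lock_file || unlink($lock_file) || warn "unlink: $!"}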

flock(2)

knew on 2006-07-04T22:23:43

"On success, zero is returned. On error, -1 is returned, and errno is set appropriately."

whoops.

Re:flock(2)

Aristotle on 2006-07-04T22:39:15

No. Second sentence in perldoc -f flock:

Returns true for success, false on failure.

Re:flock(2)

knew on 2006-07-04T22:45:20

Bah, so it is indeed. Back to looking at it then :)

Ah, got it.

Aristotle on 2006-07-04T22:48:43

A third process will erroneously be allowed to launch after the first of two concurrently launched processes has already exited, even though the second is still hanging around.

But how did you fix it? I can’t see a correct solution other than just not cleaning up the lockfile.

Re:Ah, got it.

Abigail on 2006-07-05T05:21:47

Indeed.

It's easily fixed - set a variable right after getting the lock, and don't delete the file unless the variable is set.
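
Something along those lines, presumably; a sketch of the flag-variable fix, keeping the shape of the original snippet ($locked is a made-up name):

  my $locked;

  sysopen my $fh => $lock_file, O_RDWR | O_CREAT
      or die "sysopen: $!";
  exit unless flock $fh => LOCK_EX | LOCK_NB;
  $locked = 1;   # only the instance that actually holds the lock sets this

  ... Code ...

  END {$locked and (unlink($lock_file) or warn "unlink: $!")}

The -f test becomes unnecessary, since only the lock holder ever reaches the unlink, and parenthesizing unlink's argument sidesteps the precedence wart noted above.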

END{}

knew on 2006-07-04T22:50:02

oh hehe.

END {! -f $lock_file || unlink $lock_file || warn "unlink: $!"}

or in other words...

"Let's delete the lock file if it exists when we're ENDing, even if we're doing so because we've detected a previous instance of the program"

Lock DATA?

nik on 2006-07-05T07:16:53

If the processes are running on the same host, why not lock the DATA filehandle?

  flock(DATA, LOCK_EX|LOCK_NB) or die "Already running.";

That should do it. No need for messy temp files to remember to clean up.
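
A self-contained sketch of that idea. It assumes the script carries a __DATA__ (or __END__) section, and that flock can take an exclusive lock on a read-only handle, which is true with a real flock(2) but may not hold where Perl emulates flock via fcntl:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use Fcntl qw(:flock);

  # DATA is already open on the script file itself, so there is no
  # separate lock file to create or clean up.
  flock DATA, LOCK_EX | LOCK_NB
      or die "Already running.\n";

  # ... real work here; the lock disappears when the process exits ...

  __DATA__
  This section exists only so the DATA filehandle is available to lock.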

Re:Lock DATA?

Dom2 on 2006-07-05T07:49:47

Heh, locking the DATA filehandle is a neat idea.

But more to the point, who would consider using flock() over NFS? That's scary bad stuff. You're now relying on a correct implementation of lockd everywhere, which is not guaranteed. Better to use some form of lockfile, the way that the MTAs & procmail do...

-Dom
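
The simplest single-host flavor of the lockfile approach is an O_CREAT|O_EXCL create. A sketch only, with a made-up path; it is deliberately not the create-unique-file-then-link() dance procmail and the MTAs use to stay safe over NFS, and a lock file like this goes stale if the process is killed before END can run:

  use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

  my $lock_file = "/var/tmp/myapp.lock";   # hypothetical path
  my $locked;

  sysopen my $fh, $lock_file, O_WRONLY | O_CREAT | O_EXCL
      or die "Already running (or a stale $lock_file is lying around): $!\n";
  $locked = 1;
  print {$fh} "$$\n";   # record the pid; handy when hunting stale locks
  close $fh or warn "close: $!";

  # Same lesson as above: only the instance that created the file removes it.
  END {$locked and (unlink($lock_file) or warn "unlink: $!")}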

Re:Lock DATA?

jhi on 2006-07-05T09:37:26

Slightly more reliable than creating a lock FILE is creating a lock DIRECTORY. I have no doubt some NFS implementation gets this wrong too, but it's quite hard to create a directory non-atomically.
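
A sketch of the lock-directory variant, leaning on mkdir being atomic. The path and the $locked flag are made up, and cleanup is gated on the flag for the same reason as in the thread above:

  my $lock_dir = "/var/tmp/myapp.lock.d";   # hypothetical path
  my $locked;

  # mkdir either creates the directory or fails; two processes cannot
  # both believe they created it.
  mkdir $lock_dir
      or exit;          # somebody else holds the lock (or mkdir failed)
  $locked = 1;

  ... Code ...

  END {$locked and (rmdir $lock_dir or warn "rmdir: $!")}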