[MarkovBlogger] AOL Fighting Spam berjon.com (re)design .

jjohn on 2003-11-21T08:33:42

I was at 18th & H, across Pennsylvania Ave.)

 

[*] Perl6 RFC #1138: Replace the shift builtin with something clever this early in the lecture area tomorrow, as part of it is $50 for the last frontier back in the Dock and System Events. It's super sweet! Note how it works on Mac OS 9.2.2, apparently over the last number on it, but it still didn't get to have a list of links to it again after a couple of late night talking, particularly with Quinn and Danny from Need to Know. Rael appeared at most of life.

Since my access to the tops of both those modules and articles that I sound like crap, too. Progress!

Just before 2pm, Tom Brokaw is presenting some images coming off the ground. "Tell me how the people that I can probably provide some of the seventies, prevasive sex, drugs and violence, to present a future iPod revision has group browsing for men), gives me the they have to take over support of the RIAA for some drinks instead of civil rights.

I think work here if I haven't done an amazing experience.

(Brought to you by jjohn's MarkovBlogger (™): If it hits kernel panic about 50% of the house is much better model for teenage girls, you'll know it fairly well when I've not touched them in March and not even supported until ID3v2.4.0, which most software will eventually have an installer license. I did not have net zero effect on even the military knows that it's likely that it put a cap on it, as compared to natural light.

I considered the dollar's recent pludge against the acquisition cost. This is the direct object of the same reason I don't.

(Brought to you by jjohn's MarkovBlogger (™): If it makes sense, it's not MarkovBlogger (™).)


Feedback loop

rafael on 2003-11-21T11:54:19

As I see, MarkovBlogger reads its own entries. Bug or feature ?

Re:Feedback loop

jjohn on 2003-11-21T12:59:19

It should not be reading its own entries. I do filter MB entries. However, if other blogs quote MB, there's little I can do about that. It is very, very difficult to debug a particular entry to see what went wrong. I will have a look.

The problem and solution

jjohn on 2003-11-21T13:19:45

I think this snake is indeed eating its own tail.

To debug this, I first grepped through all of the source material, which is organized into subdirectories and files.

A typical source file looks like this:

«subject: OSCON - /$3(.*)$/

I write this on Saturday. My last entry was Wednesday. Thursday is a happy blur now. I know I went to some good talks, but I can't remember them now, even on pain of death. What I do have a clear recollection of is going to the Mummy Cafe with Quinn, Danny, David Blank-Edelman, Gnat and Jenine. Like the queen's tomb in the Khufu's (neé Cheop's) Pyramid, this restaurant was small, subterranean and devoid of treasure. However, the food was vaguely greek/mediterranean and so was easy to consume. Although no alcohol was imbibed there, for some reason conversation at the table was preempted as we all watched the ice cube that was impaled by David and hung across his glass slowly melt away to the point of failure.

At the time, this seemed very, very important. Maybe the food included some kind of "pharaoh's surprise."»

I looked through all of these entries with this command:

find . -exec 'grep' '-l' '\[MarkovBlogger\]' '{}' ';'

Only my own MB entries were found. This argued strongly that markov.pl isn't filtering the entries correctly. Here's the relevant section of code:

for my $file (@ARGV) {

  while (<>) {
    if (/^subject:/) {
      s/^subject://;
      next if /^\s*\[MarkovBlogger\]/;
      fill_table(table   => $subject,
                 state   => $s_in,
                 line    => $_,
                 'keys'  => \@last_subject_keys
                );
    } else {
      fill_table(table   => $body,
                 state   => $s_in,
                 line    => $_,
                 'keys'  => \@last_body_keys,
                );
    }
  }
}

As you can see, the code appears to be looking for subject lines that contain the distinctive sentinel phrase. However, a closer look reveals the ugly truth: only that subject line is skipped; the rest of the file is still processed!
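To make the leak concrete, here is a minimal sketch of the same loop run against a single in-memory entry. The sample text and the lines_kept helper are made up for illustration, and lines are pushed onto an array where markov.pl would call fill_table.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one source file: a MarkovBlogger subject
# line followed by two body lines.
my $entry = "subject: [MarkovBlogger] example\nbody line one\nbody line two\n";

# Return the lines that would have reached fill_table().
# (lines_kept is an illustrative helper, not part of markov.pl.)
sub lines_kept {
    my ($text) = @_;
    my @kept;
    open my $in, '<', \$text or die "can't open in-memory file: $!";
    while (<$in>) {
        if (/^subject:/) {
            s/^subject://;
            next if /^\s*\[MarkovBlogger\]/;   # skips this one line only
            push @kept, "subject:$_";
        } else {
            push @kept, "body:$_";             # body lines still get through
        }
    }
    close $in;
    return @kept;
}

print for lines_kept($entry);
```

The subject line is dropped, but both body lines are printed, which is exactly how generated text ends up back in the tables.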

With the careful application of labels, the fix appears to be the following:

FILE:
for my $file (@ARGV) {

  while (<>) {
    if (/^subject:/) {
      s/^subject://;
      next FILE if /^\s*\[MarkovBlogger\]/;
      fill_table(table   => $subject,
                 state   => $s_in,
                 line    => $_,
                 'keys'  => \@last_subject_keys
                );
    } else {
      fill_table(table   => $body,
                 state   => $s_in,
                 line    => $_,
                 'keys'  => \@last_body_keys,
                );
    }
  }
}

This sort of bug would never have happened in Java, because I would never have tried to write this in Java. :-)

A better (working) solution

jjohn on 2003-11-21T13:31:20

FILE:
for my $file (@ARGV) {

  if (open my $in, '<', $file) {
    while (<$in>) {
      if (/^subject:/) {
        s/^subject://;

        if (/^\s*\[MarkovBlogger\]/) {
          close $in;
          next FILE;
        }

        fill_table(table   => $subject,
                   state   => $s_in,
                   line    => $_,
                   'keys'  => \@last_subject_keys
                  );
      } else {
        fill_table(table   => $body,
                   state   => $s_in,
                   line    => $_,
                   'keys'  => \@last_body_keys,
                  );
      }
    }
    close $in;
  } else {
    warn "can't open $file: $!. skipping";
  }
}
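
A note on why the explicit open matters, as far as I can tell: while (<>) reads from the ARGV handle, which walks @ARGV on its own and pays no attention to $file. A next FILE leaves that handle open on the current file, so the following <> just picks up its remaining lines. The sketch below compares the two approaches on two temporary files. The helper names and sample entries are hypothetical, and the diamond version loops over a copy of the file list so that <> shifting @ARGV doesn't disturb the for loop.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a hypothetical source entry to a temp file and return its path.
sub write_entry {
    my ($text) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "can't close $path: $!";
    return $path;
}

# Labelled next combined with <>: the ARGV handle stays open on the
# current file, so its remaining lines are read on the next pass.
sub kept_with_diamond {
    my @files = @_;
    local @ARGV = @files;    # <> consumes @ARGV; the loop uses @files
    my @kept;
    FILE:
    for my $file (@files) {
        while (<>) {
            next FILE if /^subject:\s*\[MarkovBlogger\]/;
            push @kept, $_;
        }
    }
    return @kept;
}

# Labelled next combined with an explicit per-file handle: closing the
# handle really does abandon the rest of that file.
sub kept_with_open {
    my @files = @_;
    my @kept;
    FILE:
    for my $file (@files) {
        open my $in, '<', $file
            or do { warn "can't open $file: $!. skipping\n"; next FILE };
        while (<$in>) {
            if (/^subject:\s*\[MarkovBlogger\]/) {
                close $in;
                next FILE;
            }
            push @kept, $_;
        }
        close $in;
    }
    return @kept;
}

my $mb    = write_entry("subject: [MarkovBlogger] generated\ngenerated body\n");
my $human = write_entry("subject: OSCON\nhuman body\n");

print "diamond: ", scalar(kept_with_diamond($mb, $human)), " lines kept\n";
print "open:    ", scalar(kept_with_open($mb, $human)), " lines kept\n";
```

With these two files the diamond version keeps three lines (the generated body sneaks in) while the explicit-open version keeps only the two lines of the human-written entry.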