Overcoming Misbehaviours in Code I Did Not Write

Shlomi Fish on 2008-07-23T18:36:14

Yesterday, when I tried to validate the RSS feed of my fortune cookies, I ran into a Unicode encoding problem. It seems that a Unicode character ("→") was converted into its individual bytes, and each byte was then encoded as a separate SGML entity. After a long debugging session, I found out that the problem was in XML-Atom, and that it was easily fixed using:

$XML::Atom::ForceUnicode = 1;

This behaviour is documented in the XML::Atom::Feed documentation, which I didn't bother to read because I believed XML::Feed would do the right thing. I ran into the problem because I generated an Atom-based XML::Feed (so that I could also have an Atom feed) and then converted it to RSS.
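To illustrate the symptom, here is a minimal sketch of how a single character can turn into several entities. It assumes the bytes in question were the UTF-8 encoding of the character, and the to_entities() helper is a hypothetical stand-in written for this example, not the code XML::Atom actually uses:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: replace every non-ASCII element of the string
# with a hexadecimal numeric entity.
sub to_entities {
    my ($s) = @_;
    $s =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord($1))/ge;
    return $s;
}

my $chars = "\x{2192}";               # one character: U+2192, the arrow
my $bytes = encode('UTF-8', $chars);  # three bytes: E2 86 92

my $good = to_entities($chars);       # character-wise: one entity
my $bad  = to_entities($bytes);       # byte-wise: one entity per byte

print "$good\n";   # prints "&#x2192;"
print "$bad\n";    # prints "&#xE2;&#x86;&#x92;"
```

The second line of output is the kind of thing a feed validator would flag: three entities that no longer add up to the original arrow.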

Today I encountered a similar problem, this time with CPANPLUS. I wanted to write a script to syndicate the list of CPAN modules for the Perl-Israel Israeli Perl Projects page. One thing I noticed was that calling $author->distributions() took a long time. After some debugging, I found that the culprit was CPANPLUS::Module's dslip() method, which went over the entire list of modules in O(N) time, looking for modules whose prefix is the current module. I fixed this using:

{
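    # Silence only the warnings that overriding a subroutine can trigger.
    no warnings qw(redefine once);
    # This is an optimisation because dslip() is incredibly slow,
    # and it badly affects $module->clone(). The replacement skips
    # the scan and returns five spaces instead of the real value.
    *CPANPLUS::Module::dslip = sub { return ' ' x 5; };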
}

This made my script run much faster. Not instantaneously, but definitely faster.
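The same trick works for any method you want to stub out. Here is a self-contained sketch of it; My::Module and its summary() method are hypothetical names made up for the example, standing in for a third-party class with an expensive method:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a third-party class.
package My::Module;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# Pretend this walks a long list on every call.
sub summary {
    my ($self) = @_;
    my $count = 0;
    $count++ for 1 .. 100_000;
    return "scanned $count entries";
}

package main;

my $obj    = My::Module->new(name => 'Foo');
my $before = $obj->summary;

{
    # Silence only the "Subroutine redefined" warning, only in this block.
    no warnings 'redefine';
    # Assigning a code reference to the glob replaces the method.
    *My::Module::summary = sub { return 'n/a' };
}

# Objects created before the override see the new method too.
my $after = $obj->summary;

print "$before\n";   # prints "scanned 100000 entries"
print "$after\n";    # prints "n/a"
```

Note that an override like this stays in effect for the rest of the process. If you only need it for part of the program, assigning with local (local *My::Module::summary = sub { ... };) inside a block restores the original method when the block exits.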

So this way, I was able to fix the two problems that I had. Now I'm happy and can go on with the rest of my life.