This weekend I did some work on my RSS module, uploaded a new site for XML::RSS, and looked at the article I promised to write for brian. I got my SSH problems fixed, and after starting to read my SSH book I felt much more comfortable.
I then played a few games, listened to music, made some bread (well, loaded the machine), then decided to do some RSS feed tests. I've had a few feeds messed up for a while, so I thought I could get to the bottom of the problem. Most of the feeds with problems were simply dead; it was just a case of working out where they had gone, and fixing the link or deleting them. One feed, however, perplexed me: The Guardian had a live feed, but the XML parser in XML::RSS was croaking. Upon investigation I discovered that their Vignette V/5 CMS thinks that valid XML can have comments before the XML declaration at the top. It can't: if the declaration is present, it has to be the very first thing in the document.
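For anyone who wants to see the failure for themselves, here is a minimal sketch. It is in Python rather than Perl, using the standard `xml.parsers.expat` module, which wraps the same Expat library that Perl's XML::Parser is built on. The sample feed, the comment text and the two function names are made up for illustration, and stripping the junk before parsing is just one possible workaround, not something XML::RSS does for you.

```python
import xml.parsers.expat


def is_well_formed(data: bytes) -> bool:
    """Return True if Expat can parse the whole document without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


def strip_before_declaration(data: bytes) -> bytes:
    """Drop anything that precedes the XML declaration (hypothetical workaround)."""
    idx = data.find(b"<?xml")
    # idx == -1: no declaration at all; idx == 0: already at the start.
    return data[idx:] if idx > 0 else data


# Made-up feed imitating the problem: a comment sits above the declaration.
BAD_FEED = (
    b"<!-- generated by some CMS -->\n"
    b'<?xml version="1.0"?>\n'
    b'<rss version="2.0"><channel><title>Example</title></channel></rss>'
)

if __name__ == "__main__":
    print("as served:", is_well_formed(BAD_FEED))
    print("after stripping:", is_well_formed(strip_before_declaration(BAD_FEED)))
```

The cleaned bytes could then be handed to whatever RSS parser you use. Note that a comment placed after the declaration is perfectly legal, so the stripping only needs to look at what comes before it.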
I know Vignette well: the company I used to work for competed with them head on, and lost. By all accounts it was/is a pretty evil product, costing $millions to install and use, with a terrible reputation. Mind you, I worked for a competitor, so we had a biased view....
It's interesting, though, that this top-price, so-called flagship Content Management System can't produce valid HTML (see the W3 Validator results), but then most big companies can't manage that either. The SGML/XML comments at the very top of the page, while not fatal to most browsers, cause two problems: any attempt to parse the pages with an XML parser will fail because the XML is not well-formed, and IE will not be able to read the XHTML doctype if one is actually set!
Companies the size of Vignette, and there are quite a few, really shouldn't be allowed to be in bodies like the W3C if they can't even make their own products compatible with the standards they helped to create. I don't recall the link, but someone pointed out that almost none of the W3C corporate members had a corporate home page that was valid HTML!
Find that and much more on the fount of all knowledge that is scribot