HTML::Tidy, my follow-up to HTML::Lint, has just been released at version 1.00. It does NOT include the Test::HTML::Tidy wrapper, but it DOES include a handy guide on how to build libtidy, and a transition guide for HTML::Lint users.
What would you do with HTML::Tidy? Something like this:
use HTML::Tidy;

my $tidy = HTML::Tidy->new;
$tidy->ignore( type => TIDY_WARNING );    # skip warnings, report errors only
$tidy->parse( "foo.html", $contents_of_foo );
for my $message ( $tidy->messages ) {
    print $message->as_string, "\n";
}
or some other level of automated HTML checking. With Test::HTML::Tidy (which I hope to get out tonight), you'll be able to do
Re:W3C Validator
LTjake on 2004-02-26T13:55:02
Hrmm, I don't think so.
Check the source page. No mention of Tidy. Also, it says "OpenSP is the SGML and XML parser used by the service". So I assume it parses the output from that.
Re:W3C Validator
sumdeus on 2004-02-26T13:59:45
I believe the W3C Validator does not use tidylib. It uses
use File::Spec qw();
use HTML::Parser 3.25 qw(); # Need 3.25 for $p->ignore_elements.
and some other good stuff. The source is available. I would be very interested to know the similarities and differences between tidylib and this validation. [I could go look it up!]
Re:W3C Validator
Dom2 on 2004-02-26T15:18:39
The W3C validator requires an installation of OpenSP, which is a fairly heavyweight requirement. I'm not sure quite what tidylib does, but I'm going to give it a play and see what it does. If it's faster than onsgmls, then I'm all for it!
Your other option for validation is to get libxml2 (in its Perl form, XML::LibXML) set up. The disadvantage (which it shares with OpenSP) is that it requires you to have all the catalogs for HTML/XHTML set up correctly. I'm assuming that tidylib has all that sort of stuff hard-coded.
-Dom
Re:W3C Validator
petdance on 2004-02-26T16:40:05
There are CGI tidy interfaces out there, so you can see what tidy reports on. tidy also does cleanup on the HTML, and prettifies it for you, although HTML::Lint doesn't support that.