HTML::Tidy 1.00 is finally out

petdance on 2004-02-26T04:30:04

My follow-up to HTML::Lint, HTML::Tidy, has just been released at version 1.00. It does NOT include the Test::HTML::Tidy wrapper, but it DOES include a handy guide on how to build libtidy, and a transition guide for HTML::Lint users.

What would you do with HTML::Tidy? Something like this:

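use HTML::Tidy;

# $contents_of_foo holds the HTML text already read from foo.html
my $tidy = HTML::Tidy->new;
$tidy->ignore( type => TIDY_WARNING );    # skip warnings, keep errors
$tidy->parse( "foo.html", $contents_of_foo );

for my $message ( $tidy->messages ) {
    print $message->as_string, "\n";
}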


or some other level of automated HTML checking. With Test::HTML::Tidy (which I hope to get out tonight), you'll be able to do

html_tidy_ok( $html, "HTML is properly tidy" );

in your *.t scripts. Whee!


W3C Validator

cbrandtbuffalo on 2004-02-26T13:18:06

So does this have the same functionality as the on-line W3C validator service (without having to go to the web)? I believe their validator is just a Perl script that calls tidy somehow.

http://validator.w3.org/

Re:W3C Validator

LTjake on 2004-02-26T13:55:02

Hrmm, I don't think so.

Check the source page. No mention of Tidy. Also, it says "OpenSP is the SGML and XML parser used by the service". So I assume it parses the output from that.

Re:W3C Validator

sumdeus on 2004-02-26T13:59:45

I believe the W3C Validator does not use tidylib. It uses
use File::Spec          qw();
use HTML::Parser   3.25 qw(); # Need 3.25 for $p->ignore_elements.
... and some other good stuff. The source is available. I would be very interested to know the similarities and differences between tidylib and this validation. [I could go look it up!]

Re:W3C Validator

Dom2 on 2004-02-26T15:18:39

The W3C validator requires an installation of OpenSP, which is a fairly heavyweight requirement.

I'm not quite sure what tidylib does, but I'm going to have a play with it and find out. If it's faster than onsgmls, then I'm all for it!

Your other option for validation is to get libxml2 (in its Perl form, XML::LibXML) set up. The disadvantage (which it shares with OpenSP) is that it requires you to have all the catalogs for HTML/XHTML set up correctly. I'm assuming that tidylib has all that sort of stuff hard-coded.

-Dom

Re:W3C Validator

petdance on 2004-02-26T16:40:05

There are CGI tidy interfaces out there, so you can see what tidy reports on. tidy also does cleanup on the HTML, and prettifies it for you, although HTML::Lint doesn't support that.