Using XML::LibXML, I'm trying to determine if two nodes are different. I have the following humiliating code, but it works.
sub _node_differs {
    my ( $self, $left, $right ) = @_;

    # toStringC14N ensures "canonization" and allows '' to match
    # ' '
    my ( $left_string, $right_string ) =
      map { defined($_) ? ( eval { $_->toStringC14N } || $_->toString ) : '' }
      ( $left, $right );
    if ( $left_string ne $right_string ) {
        return [ $left, $right ];
    }
    return;
}
What's a "proper" way of doing that? I know I'm missing something obvious.
Apart from the weird 'eval' (toStringC14N is defined in XML::LibXML::Node, so anything that's part of an XML::LibXML document should have it), it doesn't look "humiliating".
You can create a quotient of all XML (text) representations by putting in the same class those that produce the same DOM (internal) representation; you want to see if two nodes fall into the same equivalence class. Canonicalization is a way to produce a representative of the class, so comparing the canonicalized representations of two DOM subtrees seems right enough to me.
In other words: it's a way to define equivalence, and probably saner than most.
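To make the equivalence-class idea concrete, here is a minimal sketch. The element name and attributes are made up for illustration; the only XML::LibXML calls used are load_xml, documentElement, toString and toStringC14N. The two inputs differ in attribute order and empty-element syntax, so their plain serializations should differ while their canonical forms should agree.

```perl
use strict;
use warnings;
use XML::LibXML;

# Two spellings of the same element: attribute order and the
# empty-element syntax differ, the content does not.
# (The <item> element and its attributes are hypothetical.)
my $left = XML::LibXML->load_xml( string => '<item b="2" a="1"></item>' )
  ->documentElement;
my $right = XML::LibXML->load_xml( string => '<item a="1" b="2"/>' )
  ->documentElement;

# Plain serialization keeps the attribute order as parsed.
my $same_raw = $left->toString eq $right->toString;

# Canonical XML sorts attributes and normalizes empty elements.
my $same_canonical = $left->toStringC14N eq $right->toStringC14N;

print "raw strings equal:       ", ( $same_raw       ? "yes" : "no" ), "\n";
print "canonical strings equal: ", ( $same_canonical ? "yes" : "no" ), "\n";
```

Comparing canonical strings treats the two nodes as members of the same class, which is what _node_differs relies on.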
Re:Looks proper enough…
Ovid on 2008-08-19T15:50:08
The eval is there because I kept hitting nodes that have that method defined but still fail with this error:
Failed to convert doc to string in doc->toStringC14N at /opt/csw/lib/perl/site_perl/XML/LibXML.pm line 955
I don't know what's causing that, but the eval protects against it.
Re:Looks proper enough...
dakkar on 2008-08-19T16:00:13
Looking through the source code of libxml2, it looks like it's caused by nodes that have no textual content: the xmlC14NDocDumpMemory function (used by toStringC14N) does not set its "output buffer" parameter if the buffer would be empty.
Strange behaviour; I'd call it a bug, or at least a documentation deficiency.
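If that diagnosis is right, one way to make the guard more explicit is sketched below. The helper name canonical_or_plain is hypothetical, and whether a given libxml2 version actually dies on such nodes is an assumption taken from this thread. Unlike the `||` in the original, this version falls back only when toStringC14N dies or returns undef, not when it returns an empty (or otherwise false) string.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: canonical string for a node, with a fallback
# to plain serialization when canonicalization does not succeed.
sub canonical_or_plain {
    my ($node) = @_;
    return '' unless defined $node;

    # eval traps the "Failed to convert doc to string" exception.
    my $canonical = eval { $node->toStringC14N };
    return $canonical if defined $canonical;

    # toStringC14N died or returned undef; use the plain form instead.
    return $node->toString;
}

# Sample document, made up for illustration.
my $doc = XML::LibXML->load_xml( string => '<root><a/><b>text</b></root>' );
for my $node ( $doc->documentElement->childNodes ) {
    printf "%s => %s\n", $node->nodeName, canonical_or_plain($node);
}
```

The fallback string is not canonical, so two equivalent nodes that both hit the fallback could still compare as different; that limitation is shared with the original code.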
Re:Looks proper enough...
Ovid on 2008-08-22T09:18:05
Thanks for digging into this. I've included the URL of this discussion in our code base so people have a clue what that weird eval is for.