XML::Yggdrasill and Other Reasons We Called It "PerlDOM"

kingubu on 2001-11-25T03:10:53

Couple of random thoughts about the PerlDOM project and what I hope it will be:

Swappable on the front end

Okay, you have a nice shiny DOM tree; how do you access its contents? Every single DOM implementation out there to my knowledge (and not just in Perl, either) gets the damn thing backwards, in my opinion. They start with the W3C DOM spec and then code the underlying data structures to make that interface work. Wrong, wrong, wrong!!!! (Okay, maybe not "wrong", but certainly sub-optimal.) The W3C DOM is only one possible interface. 'Tis a far, far better thing to have a more generic document object model (probably based on the XML Infoset) and then make the interfaces work over that node structure to provide access to the document's contents.

It is my intention and strong desire that PerlDOM will, for prolly the first time anywhere, provide a generic tree structure onto which developers can map their own interface classes. Not only that, I want users to be able to choose, from within their application, which installed interface to use, and to add extension methods without having to rebuild the node tree.

Want XPath? Load the interface for it; the underlying tree stays the same.

Want W3C DOM Level 2 with methodNamesLikeThis? Load the interface for it; the underlying tree stays the same.

Want a simple lighter-weight Perlish DOM with method_names_like_this? Load the interface for it; the underlying tree stays the same.

Want extra features like getElementsByNameRegex (oooo!)? Write the interface for it and load it; the underlying tree stays the same.

I'm sure you get the idea...
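
To make the idea concrete, here is a minimal sketch of one tree with several interfaces layered over it. It is written in Python only to keep it short, and everything in it (Node, walk, W3CView, PerlishView, RegexView) is a hypothetical name made up for illustration, not an actual PerlDOM API. The point is simply that each interface wraps the same node objects instead of building its own tree.

```python
import re


class Node:
    """Plain, interface-agnostic tree node (hypothetical layout)."""

    def __init__(self, name, children=None, text=""):
        self.name = name
        self.children = list(children or [])
        self.text = text


def walk(node):
    """Yield every descendant of node, depth-first, excluding node itself."""
    for child in node.children:
        yield child
        yield from walk(child)


class W3CView:
    """W3C-flavoured interface with methodNamesLikeThis."""

    def __init__(self, node):
        self._node = node

    def getElementsByTagName(self, name):
        return [n for n in walk(self._node) if n.name == name]


class PerlishView:
    """Lighter-weight interface with method_names_like_this."""

    def __init__(self, node):
        self._node = node

    def elements_by_name(self, name):
        return [n for n in walk(self._node) if n.name == name]


class RegexView:
    """Extension interface: adds one extra method, touches nothing else."""

    def __init__(self, node):
        self._node = node

    def getElementsByNameRegex(self, pattern):
        rx = re.compile(pattern)
        return [n for n in walk(self._node) if rx.search(n.name)]


# One tree, built once.
tree = Node("doc", [
    Node("title", text="PerlDOM"),
    Node("section", [Node("para", text="one"), Node("para", text="two")]),
])

# Three interfaces over the same nodes; nothing is copied or reparsed.
w3c_paras = W3CView(tree).getElementsByTagName("para")
perlish_paras = PerlishView(tree).elements_by_name("para")
regex_hits = RegexView(tree).getElementsByNameRegex(r"^(title|para)$")
```

Because every view hands back the very same Node objects, a result from one interface can be passed straight to code that uses another, with no toString->reparse step in between.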

The problem we are seeking to solve here is the sad state we're in now, where everyone and his brother creates YA incompatible-with-anything-else tree-shaped data structure every time they want a new DOM interface. Better that we have a stable, predictable tree that supports multiple interfaces and can be passed around between modules, rather than the sucky toString->reparse Hell that we live in now.

Swappable on the back end

This is the other real challenge in my vision of PerlDOM: making the One Tree work over the myriad possible data representations of the node tree. It has been suggested that a tree can be stored in many more ways than just an in-memory HoH, and it would be verra nice to allow for alternative storage. The implication here is that PerlDOM would "just work" with the data trees built by existing implementations (XML::DOM, LibXML, Sablotron, etc.), so you could just pass those trees in and have access to all the other PerlDOMy Goodness.

I'm not 100% sure that this is really feasible, but it would be damn nice and I'm open to suggestions. Note: this is the job of the, um, NodeFactory(?) classes?
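
One way this might work, sketched very roughly: each storage format gets a small adapter that answers the same few questions (what is the root, what is this node's name, what are its children), and everything above that layer talks only to the adapter. The sketch below is Python for brevity, and DictBackend, TableBackend, and element_names are hypothetical names invented here; whether the real XML::DOM or LibXML trees can be wrapped this cheaply is exactly the open question.

```python
class DictBackend:
    """Tree stored as nested dicts (the in-memory hash-of-hashes case)."""

    def __init__(self, root):
        self._root = root

    def root(self):
        return self._root

    def name(self, handle):
        return handle["name"]

    def children(self, handle):
        return handle.get("children", [])


class TableBackend:
    """Tree stored as flat (id, parent_id, name) rows, as a database might."""

    def __init__(self, rows):
        self._names = {}
        self._kids = {}
        self._root = None
        for node_id, parent_id, name in rows:
            self._names[node_id] = name
            self._kids.setdefault(node_id, [])
            if parent_id is None:
                self._root = node_id
            else:
                self._kids.setdefault(parent_id, []).append(node_id)

    def root(self):
        return self._root

    def name(self, handle):
        return self._names[handle]

    def children(self, handle):
        return self._kids[handle]


def element_names(backend, handle=None):
    """Depth-first list of node names, written against the adapter only."""
    if handle is None:
        handle = backend.root()
    names = [backend.name(handle)]
    for child in backend.children(handle):
        names.extend(element_names(backend, child))
    return names


# The same document in two different storage layouts.
as_dicts = DictBackend({
    "name": "doc",
    "children": [{"name": "title"}, {"name": "para"}],
})
as_rows = TableBackend([
    (1, None, "doc"),
    (2, 1, "title"),
    (3, 1, "para"),
])

names_from_dicts = element_names(as_dicts)
names_from_rows = element_names(as_rows)
```

Here element_names never learns how the tree is stored, which is the property the interface classes would need. The cost is that every node access goes through an extra method call, which may or may not be acceptable for large documents.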

Able to finally, concretely establish in the minds of XML Geeks everywhere the difference between a document object model and an implementation of the W3C's Document Object Model

Okay, I kinda covered this in the "swappable interfaces" part, but I want to underscore the point. The W3C DOM is only one possible document object model. That is, given a tree of nodes, it defines one set of methods and conventions describing the relationships between those nodes and how an application accesses them. Part of the goal of PerlDOM in general is to provide a predictable tree structure onto which the W3C DOM and other, alternative DOM interfaces can be mapped. If, when someone says "DOM", you immediately think they mean "W3C Recommended DOM", check yourself.

-ubu