XML, I'm back

samtregar on 2003-05-10T02:11:38

It's time again for another XML project. I'm adding XML serialization/deserialization to a large object-oriented application. The last time I did this was during the Bricolage SOAP implementation, and it was a mighty pain. This time I've chosen to cut out the SOAP and deal directly with files on disk. I've created a file format which is a TAR archive of .xml files plus an index.xml file which describes the archive contents.
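
As a rough illustration of the archive layout described above, here is a minimal sketch in Python. It is not the actual implementation: the post does not specify the index.xml structure, so the `<index>`/`<file>` elements, the `name` and `size` attributes, and the helper names `build_archive` and `read_archive` are all assumptions made up for this example.

```python
import io
import tarfile
import xml.etree.ElementTree as ET


def build_archive(documents):
    """Pack {filename: xml_text} into an in-memory TAR with an index.xml."""
    # Hypothetical index layout: <index><file name="..." size="..."/></index>
    index = ET.Element("index")
    encoded = {name: text.encode("utf-8") for name, text in documents.items()}
    for name in sorted(encoded):
        ET.SubElement(index, "file", {"name": name, "size": str(len(encoded[name]))})
    index_bytes = ET.tostring(index, encoding="utf-8")

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Write index.xml first so a reader can find it without scanning ahead.
        members = [("index.xml", index_bytes)]
        members += [(name, encoded[name]) for name in sorted(encoded)]
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_archive(archive_bytes):
    """Return {filename: xml_text} for every file listed in index.xml."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        index = ET.fromstring(tar.extractfile("index.xml").read())
        result = {}
        for entry in index.findall("file"):
            name = entry.get("name")
            result[name] = tar.extractfile(name).read().decode("utf-8")
        return result
```

Reading the index first and then pulling out only the files it lists means the index acts as the contract for the archive: anything not described there is ignored.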

So far things are going well: the result should be complete on schedule and should perform better than an equivalent SOAP system due to the lack of network overhead. Plus, the OO system I'm working on this time is much less complicated than Bricolage, so I've got less complexity to squeeze into XML. I also expect that not using SOAP initially will make the system easier to debug and easier for the system administrators to use. (I can dream, can't I?)

Another advantage is that Xerces C++ is now ready for runtime use as an XML Schema validator. For Bricolage I ended up only running the validator during testing since the only usable copy was in Xerces C++ CVS. Now, I still can't compile XML::Xerces, but the C API is working great. I really like how accurate the error messages for broken XML are. If I have the wrong element order or miss a tag the validator tells me exactly what's wrong and which line to look at. Of course, without XML Spy I would never have the time to actually write the schemas. I just wish they had a Linux version.

-sam


XML::Comma?

dug on 2003-05-10T05:29:40

> I'm adding XML serialization/deserialization to a large object-oriented application.

XML::Comma may be a good place to look for a framework to help out with this. If you are looking for a flexible framework to get structured XML documents quickly between disk and memory (filesystem for storage, RDBMS for quick lookups of document locations), and to use a well-documented and intuitive API for manipulating these documents, I recommend it.

There is an extensive user's guide as well as a mailing list. There isn't XML Schema validation built into the parser, but Comma has sophisticated validation methods that are easily extended by embedding Perl routines directly into the "document definitions" (think XSD-type documents).

Ignore me if I'm totally missing it ;-)

-- dug

Re:XML::Comma?

samtregar on 2003-05-10T06:29:05

> If you are looking for a flexible framework to get structured XML documents quickly between disk and memory (filesystem for storage, RDBMS for quick lookups of document locations), and to use a well-documented and intuitive API for manipulating these documents, I recommend it.

Thanks, but I'm not. I already have RDBMS serialization/deserialization/querying working. The XML stuff is basically serving two purposes: external data import and inter-system data exchange. For the former purpose, direct control over the XML schema for the documents is necessary. The Perl API isn't much of an issue (XML::Simple rawks) but the "XML API" (the XML Schema that external data feeds will have to write to) is.

> There isn't XML Schema validation built into the parser, but Comma has sophisticated validation methods that are easily extended by embedding Perl routines directly into the "document definitions" (think XSD-type documents).

Maybe it's just me, but that sounds like a really bad idea. Yeah, on second thought, I'm sure it's just me. The rest of the world loves putting Perl code in their HTML, but not me!

-sam

Re:XML::Comma?

dug on 2003-05-10T14:56:29

> Maybe it's just me, but that sounds like a really bad idea. Yeah, on second thought, I'm sure it's just me. The rest of the world loves putting Perl code in their HTML, but not me!


Heh. In Comma, "Document Definitions" are the control document for a collection of documents (or "store"). I don't quite see how the ability to embed Perl hooks into the control document is the same as intertwingling Perl and HTML. Anyway, sounds like you are headed in a cool direction. I'd be interested in hearing about any updates.

I haven't looked at the Xerces C++ parser in a while, and am itching for Schema validation. I'm off to check it out.

-- dug