When is Simple Not Simple?

chromatic on 2003-08-05T05:55:18

I like when CPAN modules have simple interfaces. I like to use shell aliases and will never give up the command line. I like things that make my life easier.

If you want to make my life easier, do not make tools that require me to write XML to use them!

To help you remember this rule, I have written a simple song:

when you find yourself hacking
a tool you've been lacking
you have one great choice to make

make it painless to write
skimp docs, tests, and the like
"to use it, just read the source code!"

or make it easy to use
make a good API
and remember the best rule of all

XML, XML, XML
it's not for configuration files!
XML, XML, XML
hard to write! hard to read! angle braces!

"easy for me!" may be hard for your users
so they'll curse your name in their frustration
and wish for something like YAML

so work a little harder
and write a smarter parser
and everyone will think you're swell

The moral of the story is, "Don't annoy a music major."


No it's not!

mir on 2003-08-05T07:46:08

What do you mean XML is not for configuration files? See how elegant the XML version is compared to the YAML version?

The problem is that it is really hard to go from one to the other:

perl -MXML::Simple -MYAML -e'print Dump (XMLin( $ARGV[0]))' config.xml > config.yaml

(modulo forcearray and other option problems of course)

;--)

we could get together

hfb on 2003-08-05T07:51:21

and write "CPAN: The fucking musical rant!", a musical spanning several hours and a finale of encrypted music though everyone in the audience would have the words....XML is the least of it :)

"Hard to read"

Matts on 2003-08-05T11:38:00

Depends what you mean.

XML is easy for a computer to read. And it's easy for a programmer to write an interface to.

That's perhaps where the problem you perceive lies - it's almost too easy to punt and opt for XML rather than some custom config file.

YAML kind of gets rid of that problem, but it brings along a bunch of other problems, such as lack of tools for languages other than Perl (especially C and Java), so we end up isolating ourselves from other programming communities. Yes, the YAML spec is open and anyone can implement a parser. But the Java and C parsers are still works in progress, and don't allow as easy access to the data as the XML libraries for those languages.

Perl hackers complain far too much about XML. It's ironic really, given how hard to type perl code is (all those sigils). I don't dislike YAML. I just don't think it's the panacea that perl hackers have been convinced it is. And now I see even the perl core is opting for this non-standard data format (that we'll likely include a YAML parser in the perl core before we include an XML parser is just the dumbest thing ever).

People who write apps that store their configs in XML should not be criticised - they're doing exactly what we want them to do - make their config files process-able by standard tools. That's no bad thing, and you shouldn't complain about it IMHO.

Re:"Hard to read"

rafael on 2003-08-05T11:58:22

And now I see even the perl core is opting for this non-standard data format -- no, PAUSE is opting for it.

we'll likely include a YAML parser in the perl core before we include an XML parser -- I don't think so. But consider that there's only one YAML parser, and multiple XML parsers: selecting one of them would probably produce endless wars.

Re:"Hard to read"

Matts on 2003-08-05T12:42:28

We in the Perl XML community already solved the "multiple XML parsers" war problem when it was last raised by the perl community. It's a dead argument now. Just install XML::SAX and be done with it.

Re:"Hard to read"

jjohn on 2003-08-05T16:52:24

XML is easy for a computer to read. And it's easy for a programmer to write an interface to.

If I recall correctly, XML was designed so that hyoo-mons could read it. The "computer", that is to say programmers, can munch any digital data format you choose. XML is still a bear to deal with. Subclassing parsers or writing callbacks for event-driven parsing is not particularly straight-forward for many programmers. This is not to suggest that I pine for the days of random ASCII formats (like YAML), but I don't think XML is as good as it can be. XML is a good solution for data interchange and frankly, that's a huge win for many, many programmers.

So, I guess my point is: let's keep looking for ways to represent data in a platform-neutral way. (Sadly, I'm beginning to think that Comma Separated Value files might not have been so bad after all).

Re:"Hard to read"

Matts on 2003-08-05T17:06:53

I hear you on the event driven parsing front.

I'm writing a pull parsing module for XML right now that gives you nodes that are the same as SAX2 nodes. Ultimately a pull parser is probably a better low level parser as it's easier to write an event driven parser on top of a pull parser than it is to do it the other way around.
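To illustrate the layering claim above, here is a minimal sketch in Python rather than Perl. It uses the standard library's xml.dom.pulldom as a stand-in pull parser, not the module being written here. The function name push_from_pull and the three callback names are hypothetical, made up for this example.

```python
from xml.dom import pulldom


def push_from_pull(xml_text, on_start, on_end, on_text):
    """Drive SAX-style callbacks from a pull parser's event stream.

    The caller's loop pulls one event at a time; turning that into
    "push" callbacks is just a dispatch on the event type.
    """
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            # Hand the callback a plain dict of attribute names to values.
            on_start(node.tagName, dict(node.attributes.items()))
        elif event == pulldom.END_ELEMENT:
            on_end(node.tagName)
        elif event == pulldom.CHARACTERS:
            on_text(node.data)
        # Other event types (comments, processing instructions) are ignored.


# Usage: record the callbacks as they fire.
events = []
push_from_pull(
    "<config><name>demo</name></config>",
    on_start=lambda tag, attrs: events.append(("start", tag)),
    on_end=lambda tag: events.append(("end", tag)),
    on_text=lambda text: events.append(("text", text)),
)
```

Going the other way (a pull interface on top of callbacks) generally means buffering events or running the parser in a separate thread or coroutine, which is why the pull parser makes the more convenient bottom layer.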

But all of this reminds me of Mirod's call to not use low level APIs to read XML. If you want to just access bits of an XML document there's an awesome syntax available to do that: XPath. Alternatively use XML::Simple. I'm not sure why people still revert to using XML::Parser (is it really just the name?)
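As a rough sketch of what path-based access looks like, here is a Python example using the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath. The config document, element names, and attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical config file contents, invented for this example.
CONFIG = """<config>
  <server name="web"><port>8080</port></server>
  <server name="db"><port>5432</port></server>
</config>"""

root = ET.fromstring(CONFIG)

# One path expression replaces a pile of start/end/characters handlers.
web_port = root.findtext("./server[@name='web']/port")

# The same idea for every matching node.
all_ports = [port.text for port in root.findall("./server/port")]
```

No callbacks or parser subclasses are involved; the path names the bits of the document you want.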

Re:"Hard to read"

koschei on 2003-08-09T06:24:31

There's the name, and that the docs to XML::Parser don't say: "Go use XML::SAX. Use of XML::Parser is deprecated.".

Re:"Hard to read"

Matts on 2003-08-09T13:04:23

They will in the next release :-)

Re:"Hard to read"

koschei on 2003-08-09T13:10:20

Excellent. matts++

Re:"Hard to read"

chromatic on 2003-08-05T17:13:26

People who write apps that store their configs in XML should not be criticised - they're doing exactly what we want them to do - make their config files process-able by standard tools. That's no bad thing, and you shouldn't complain about it IMHO.

Expecting users to write configuration files in XML is wrong. Maybe, someday, when usable XML authoring tools are widely distributed, the situation will change. I do point out that user-hostile file formats such as that of sendmail.cf also need a slightly nicer front end so mere mortals can use them.

I'm fine with XML as an interoperability format between programs, where users don't have to write it. If you want to use it as a serialization format, where users don't have to write it, that's also fine.

If the first thing a user has to do to use your tool is to write XML, you're being insufficiently lazy. I'm all for standard tools, and XML gets points there, but making it easy for the computer at the expense of the user is the wrong approach.

Re:"Hard to read"

mir on 2003-08-05T17:37:03

Well, to be honest you don't have to write the XML by hand. Just create the data structure you want any way you want (through a dedicated GUI for example) and let XML::Simple dump it as XML. This also works for YAML, BTW.
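A minimal sketch of that "build the structure, then dump it" workflow, written in Python with only the standard library rather than with XML::Simple. The function name dump_as_xml, the settings data, and the element-only layout are assumptions for illustration; XML::Simple's own output conventions differ.

```python
import xml.etree.ElementTree as ET


def dump_as_xml(tag, value):
    """Recursively turn nested dicts, lists and scalars into an Element.

    Dict keys become child element names, a list under a key becomes
    repeated child elements, and anything else becomes text content.
    """
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            if isinstance(child, list):
                for item in child:
                    elem.append(dump_as_xml(key, item))
            else:
                elem.append(dump_as_xml(key, child))
    else:
        elem.text = str(value)
    return elem


# The structure could come from a GUI, a form, or any other front end.
settings = {
    "name": "demo",
    "server": [{"host": "a", "port": 80}, {"host": "b", "port": 81}],
}
xml_text = ET.tostring(dump_as_xml("config", settings), encoding="unicode")
```

The user never types an angle bracket; the file on disk is still XML for any tool that wants it.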

Re:"Hard to read"

ziggy on 2003-08-07T17:17:22

Expecting users to write configuration files in XML is wrong.
Actually, I think your expectations here are wrong. Or at least your POV.

One of the benefits to XML is its anglebrackety syntax. Bemoan it all you want, but by the time XML came around, the world had lovingly embraced HTML. For all of its warts, give someone a copy of Notepad.exe and they can start writing HTML. And XML. And perl.

What I think you're harping about is the data/document duality. XML as originally envisioned is much easier to handle on the document side of the spectrum. Things get nasty when you add all sorts of strictures (like those necessary in config files) and expect users to hand-code data structures for you.

Asking users to hand-code data structures in notepad or vi is the problem, not XML. XML for data has gone one step backwards -- instead of creating grammars that can be forgiving and flexible, we're adopting the strictest of the strict XML vocabularies and processing them in the least liberal manner possible. Whenever there's an error at the syntactic or the semantic level, things break, and they break hard.

We're back to sendmail.cf all over again, but with "standardized processing tools" to ease the pain.

I could not agree less...

brianiac on 2003-08-05T15:30:37

I find XML to be quite readable, and much more friendly and powerful than most alternatives (though I have not played with YAML). Frankly, I find it amusing to see someone complaining about XML readability in a Perl forum. ;)

What really bothers me is learning another format, and installing another parser, for a grammar that turns out to have severe limitations, and eventually has to be changed or replaced.

Take Mozilla's search plugin syntax for example: a clone (apparently) of the syntax used by Apple Sherlock, it looks similar to XML, but is definitely not! "Attributes" (used to define delimiting sequences) cannot contain spaces or character entities, a significant problem for parsing web pages; plus, no language designation can be embedded in a file, so each plugin requires a separate configuration file for each available language. Mozilla's not-quite-HTML-not-quite-XML bookmarks file is also deeply irritating to work with.

(Don't get me wrong, I love Mozilla, but primarily for standards support (though they have yet to completely implement HTTP, HTML, XHTML, or CSS2), so it bothers me more when it eschews standards (especially ones already implemented in it, like XML) in favor of strange proprietary stuff.)

If it turns out I am in the minority, and XML is simply too darn hard to puzzle out for most (though it was specifically engineered to be human-readable), perhaps the best solution is for programmers to provide a configuration screen or config generator utility, leaving their choice of back-end to the best technical fit.

Utilities! Yes!

chromatic on 2003-08-05T19:13:38

Yep, that's the best solution. As long as it's possible to configure the program without writing or editing a difficult file format, I'm reasonably happy.