XML, SGML, XSLT, DSSSL, Oh my!

jjohn on 2002-01-15T18:12:18

Ill-conceived rant mode: on.

Gods. It's no wonder at all that XML/XSLT aren't getting traction. It's hard to find non-broken tools to make document processing a reality (the docbook folks make changes that the Fop group can't handle, breaking the whole processing chain). But the tools are there, you say? Let me be clear: HTML is rendered in a web browser. Hell, even a crappy web browser like lynx is good enough to start learning HTML. As a user, I get immediate feedback on my markup. With XML and XSLT, there should be an equally easy way to get pretty feedback. I've been reaching into the depths of docbook (both SGML and XML [don't ask why]) like a veterinarian delivering a calf, and it's been just about as pretty. DTD management is insane, and the tools (jade for DSSSL, xsltproc for XSLT, and then Apache's Fop) are one step up from hacks. This is crazy. If I can't get PS from docbook easily, I should just stick to LaTeX or even POD. After much fighting and reading and reading and reading, I have figured out how to use these tools, but things have got to get much better in 2002 or XML will soon go the way of punch cards.

For the record, I don't get off on XML at all (unless people are buying my book). I care about what works, what's easy and what's available.

Grr. Grr, I say!


XML already is going the way of the punchcard

hfb on 2002-01-15T18:18:23

It has been around how long now, and how many people have actually figured out how to use it and what to use it for? Stick with TeX and LaTeX and wait for the fad to blow over.

Re:XML already is going the way of the punchcard

Matts on 2002-01-15T20:16:29

OK, if you're really not trolling...

Lots of people are using XML. Lots of people are using XML successfully. More importantly perhaps, lots of people are using XML unsuccessfully because they have overblown expectations of it (a.k.a. hype).

But XML, in the document management field and elsewhere, works. I can write a book for O'Reilly in DocBook, and they can chuck it through their processing toolset and have a book come out the other end. I can use SAX tool suites to process DBI and Excel files the same way I do plain XML files. It's a powerful technology for data processing, and I think some people are missing out because all they see is angle brackets rather than a data structure.
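
A rough sketch of the "same toolset for non-XML data" idea, using Python's standard xml.sax module rather than the Perl SAX modules discussed in this thread. The FieldCollector handler, the drive_from_records helper and the table/row/field element names are all made up for illustration; the point is only that one handler can consume events from a real XML parser or from a driver that walks database or spreadsheet rows.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class FieldCollector(ContentHandler):
    """Collects the text of each <field name="..."> into one dict per <row>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None
        self._buf = []

    def startElement(self, name, attrs):
        if name == "row":
            self.rows.append({})
        elif name == "field":
            self._field = attrs["name"]
            self._buf = []

    def characters(self, content):
        # Only keep text that appears inside a <field> element.
        if self._field is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "field":
            self.rows[-1][self._field] = "".join(self._buf)
            self._field = None


def drive_from_records(records, handler):
    """Hypothetical driver: emits SAX events for in-memory records
    (stand-ins for rows fetched from a database or a spreadsheet)."""
    handler.startDocument()
    handler.startElement("table", AttributesImpl({}))
    for record in records:
        handler.startElement("row", AttributesImpl({}))
        for key, value in record.items():
            handler.startElement("field", AttributesImpl({"name": key}))
            handler.characters(str(value))
            handler.endElement("field")
        handler.endElement("row")
    handler.endElement("table")
    handler.endDocument()


# Source 1: an actual XML document, parsed by a SAX parser.
xml_text = '<table><row><field name="id">1</field><field name="title">Perl</field></row></table>'
from_xml = FieldCollector()
xml.sax.parseString(xml_text.encode("utf-8"), from_xml)

# Source 2: plain records, pushed through the same handler.
from_records = FieldCollector()
drive_from_records([{"id": 1, "title": "Perl"}], from_records)
```

Both collectors end up holding the same rows, so whatever sits downstream of the handler does not need to know where the data came from.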

Re:XML already is going the way of the punchcard

hfb on 2002-01-15T20:47:39

Power is often passed over for simplicity as nature favours the path of least resistance...this would not be XML. DTDs are stupefying and the mess of CPAN modules is beyond even my ability to figure out which goes where when and how. I have lurked for many moons on the XML list and I have yet to really understand who and what are in which ring of the circus.

I don't think people have lofty expectations; I just think they're as confused as I am as to what this blob of stuff is really supposed to do and how to use it to do whatever 'it' might be. I still use TeX, and I use Frame, which can generate XML or SGML if needed, but I've not had any cause to use that feature.

I think XML would have a lot more success if it weren't seemingly all over the map, if DTDs seemed sane, and if there were some focus instead of a lot of different people saying a lot of different things about what XML is and why confused people such as myself might give it a second thought. XML gives me that same feeling punchcards did...there's got to be something better than this.

Re:XML already is going the way of the punchcard

Matts on 2002-01-15T20:59:03

"Power is often passed over for simplicity as nature favours the path of least resistance..."

Which is why a lot of this is a good thing. To be able to learn a single toolset (XML processing) and apply it to DBI or Excel spreadsheets, or whatever you may please, is a simple thing and a good thing.

Re:XML already is going the way of the punchcard

hfb on 2002-01-15T21:39:03

XML is not simple. That's the problem. Have you tried wading through the XML mess on CPAN lately? Nothing seems at all simple about that.

Re:XML already is going the way of the punchcard

Matts on 2002-01-16T07:06:55

OK, ok.

I fixed the SAX "problem", now it's time to fix the modules problem.

(note: you could say the exact same thing about all the templating modules, or all the DBIx modules, but nobody bitches about that. Sheesh)

Re:XML already is going the way of the punchcard

vek on 2002-01-19T02:12:45

"...and the mess of CPAN modules is beyond even my ability to figure out which goes where when and how."

I think it's a little unfair to pick on the XML modules for not being intuitive. Have you tried doing a search on Apache lately? Holy clusterfuck Batman...

I agree

Matts on 2002-01-15T19:29:51

To an extent I agree. The docbook processing tools are crappola. Really, they are bad. But I can't really agree about the XSLT tools. Simply stick a <?xml-stylesheet?> directive at the top of your file, and either deliver it via AxKit or view it in Mozilla or IE, and it'll be beautifully rendered for you.
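
For anyone who hasn't seen the directive, here is a minimal sketch of adding one to a document, written in Python with the standard xml.dom.minidom module. The attach_stylesheet helper and the article.xsl file name are hypothetical; whether a given browser or server actually applies the stylesheet is up to that tool.

```python
from xml.dom import minidom


def attach_stylesheet(xml_text, href):
    """Return xml_text with an xml-stylesheet processing instruction
    inserted before the root element. 'href' names the XSLT file."""
    doc = minidom.parseString(xml_text)
    pi = doc.createProcessingInstruction(
        "xml-stylesheet", f'type="text/xsl" href="{href}"'
    )
    # The instruction has to appear in the prolog, ahead of the root element.
    doc.insertBefore(pi, doc.documentElement)
    return doc.toxml()


# "article.xsl" is a placeholder; point it at a real stylesheet.
styled = attach_stylesheet("<article><title>Hi</title></article>", "article.xsl")
print(styled)
```

Typing the one-line directive by hand works just as well; the helper only shows where it has to go.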

As far as XSLFO goes, the people I know who write books with it use PassiveTeX, not FOP, because FOP's output still sucks (i.e. doesn't use TeX formatting rules). However, PassiveTeX is even harder to install, and you're still left in a maze of DTDs, catalogs, and TeX installation buglets. Gah!

I (strangely) have high hopes of these things getting better, particularly this year in the Perl arena where we've made some pretty massive advances in processing XML. I hope I'm not proved wrong.

Oh, and hfb - drop the trolling. XML is a good thing, and is here to stay. There's no equivalent of docbook out there in the "TeX" world. TeX is a publishing markup language, and docbook is a technical publication markup language. They are different things, period.

Re:I agree

hfb on 2002-01-15T20:00:34

I wasn't trolling actually...

Re: Splitting hairs

jjohn on 2002-01-16T00:24:08

"TeX is a publishing markup language, and docbook is a technical publication markup language."

The point is ASCII is easy to author in (at least I think so). Book printers need PS or PDF to print. Here, the source code (TeX or XML) is a convenience to the authors. The real world needs the information in a format usable by their tools. TeX isn't at all a general replacement for XML, but in the realm of publishing, it's often a better fit than docbook.

Let me be plain: I want XML and docbook to succeed. I do get it. But, there is an unfortunate (and mystifying) gap between the promise of XML in general (and docbook in particular) and the reality. It is my supreme hope that that gap narrows.

Re: Splitting hairs

Matts on 2002-01-16T07:08:45

DocBook's biggest problem is over-complexity IMHO. I think most people would be better served with s-docbook (simple-docbook).

But yes, we agree.

DocBook

darobin on 2002-01-15T20:17:13

Nice, you've found out that DocBook sucks. Hmmmm. How you get from there to the fact that XML sucks is a bit beyond me though.

In XML, you don't need DTDs. The fact that you chose to inflict those horrors on yourself is, well, not a problem with XML :) Also, the XSLT tools are very much there and very mature; I fail to see how you managed to have a problem there.
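
A small illustration of the no-DTD point, sketched in Python with the standard xml.etree.ElementTree module. That parser is non-validating, so a document only has to be well-formed. The sample note and the is_well_formed helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# No DOCTYPE declaration, no DTD, no catalog: just well-formed markup.
NOTE = "<note><to>jjohn</to><body>No DOCTYPE here.</body></note>"


def is_well_formed(text):
    """True if a non-validating parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


root = ET.fromstring(NOTE)
recipient = root.findtext("to")
```

A mismatched tag such as "<note><to>jjohn</note>" is rejected by the same check, so you still get basic error detection without any DTD machinery.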

A few clarifying points

jjohn on 2002-01-16T00:15:53

I'm happy to have provided some talking points and perhaps a minor distraction from Bigger Things with my rant. Lest hfb get beaten up too much for agreeing with me, let me make the following points:

  • XML as Rosetta Stone: XML excels as a medium of data exchange. That is, XML makes the process of moving information from one application into another much simpler. I had a job at CareerSearch in which I had to deal with DBF, CSV and fixed positional ASCII files of all kinds. If the data had all been in XML, it would have simplified many problems (although not eliminated them). XML's hype is fading as people learn what it's good for and, more importantly, what it's not good for.
  • SQL not XML: Querying XML docs with DBI seems daft to me. Get the data into an SQL system, where God Almighty intended it to be (unless the queries are trivial). That's why He invented indexing for databases.
  • Flat-chested tools: My gripe is less with XML proper and more with the lack of well-developed docbook tools, hence my pining for a "docbook browser." Abiword does support docbook to some extent. I wonder if it supports XSL stylesheets?
  • Think of the end-users: O'Reilly's internal tool support for docbook is less developed than their support for MS Word and FrameMaker. There are folks in the Tools group there who can deftly handle docbook, but copy and production editors need better WYSIWYG tools.
  • Hey, look! A wheel! I want to write documents in XML and have them magically turn into manpages, HTML or PDF. That's not too much to ask; many other text-based markup languages have been doing that for years. I'm greatly disappointed that the XML crowd seems to be reinventing the wheel on this one.
  • Perl leads the way: Life is very, very short. Too short for the overhead of managing DTD files and catalogs. That's exactly the kind of madness that C programmers face (think: header, lib and shared lib files). DTD management needs to be more like CPAN. DWIM.
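
To make the first two points concrete, here is a minimal sketch in Python (standard library only) of treating XML as the exchange format and SQL as the query engine. The jobs export, the table layout and the load_jobs helper are invented sample data and names, not anything from a real system.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up export standing in for data received from another application.
EXPORT = """<jobs>
  <job id="1"><title>Perl hacker</title><city>Boston</city></job>
  <job id="2"><title>Tools engineer</title><city>Sebastopol</city></job>
  <job id="3"><title>Copy editor</title><city>Boston</city></job>
</jobs>"""


def load_jobs(xml_text, conn):
    """Copy every <job> from the XML export into an indexed SQL table."""
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT, city TEXT)")
    conn.execute("CREATE INDEX jobs_city ON jobs (city)")
    root = ET.fromstring(xml_text)
    rows = [
        (int(job.get("id")), job.findtext("title"), job.findtext("city"))
        for job in root.iter("job")
    ]
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_jobs(EXPORT, conn)

# Once loaded, the non-trivial querying is the database's job, not the XML's.
boston = [
    title
    for (title,) in conn.execute(
        "SELECT title FROM jobs WHERE city = ? ORDER BY id", ("Boston",)
    )
]
```

The XML is read once at the boundary and then forgotten; the index on city is what makes the lookup cheap as the table grows.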

Of course, I'm really just blowing off steam. There are good uses of XML and I have hope that the tools will mature this year.

Re:A few clarifying points

jjohn on 2002-01-16T12:42:07

Sorry about the typos and whatnot in the last post. My net connection was getting flaky and I couldn't risk a "review". Hey Nandor, how's the XML-RPC interface to slash coming? I could have written this in emacs... :-)