Why I Don't Like YAML

Ovid on 2009-07-31T09:21:57

Why don't I like YAML? Well, have you read the spec? It's so awful that most YAML parsers disagreed about what YAML was. If I recall correctly, it was Why's libsyck which effectively set the default YAML standard. That doesn't mean that having a standard is a good thing, though (Prolog has an ISO standard which is universally ignored because it breaks Prolog). For example, do you know what this is?

---
file: Yes

Let's see what the three most popular Perl modules do with this.

#!/usr/bin/env perl

use strict;
use warnings;

use YAML;
use YAML::Syck;
use YAML::Tiny;
use Data::Dumper::Names;

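# Both YAML and YAML::Syck export Load(), so call each by its full name.
my $yaml_string = <<'END_YAML';
---
file1: Yes
file2: off
END_YAML

my $yaml = YAML::Load($yaml_string);
my $syck = YAML::Syck::Load($yaml_string);
my $tiny = YAML::Tiny->read_string($yaml_string);
print Dumper $yaml, $syck, $tiny;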

And they all output:

$yaml = {
          'file2' => 'off',
          'file1' => 'Yes'
        };
$syck = {
          'file2' => 'off',
          'file1' => 'Yes'
        };
$tiny = bless( [
                 {
                   'file2' => 'off',
                   'file1' => 'Yes'
                 }
               ], 'YAML::Tiny' );

It's a shame they do that because, according to the spec, those are not strings; they're boolean values. In short, the most popular YAML parsers all ignore the spec. I won't fault YAML::Tiny because it's supposed to be a subset of YAML. I also won't fault YAML::Syck because it's a wrapper around libsyck. YAML.pm, on the other hand ...

So big deal. Who cares if we're violating the spec? This Ruby programmer does. The parser he's using follows the spec, but the Perl and Python generators don't properly quote the boolean. And why should they have to build in a special case for yet another string? And if you read the spec for the booleans in YAML, it's almost case-insensitive, but not quite. "False", "FALSE" and "false" are all false, but "FalSe" is a string, which ironically would be interpreted as true in Perl.

Just try and read the spec. You probably won't finish it. And do you want to see a grammar for YAML? Apparently there's one hidden in the spec and you can install Schwern's Greasemonkey script to see it.

Oh, and if you really want to have fun, try playing around with anchors and see if you can make 'em recursive and see how various parsers fail to handle it. I don't want a mega-spec. I don't want something which has all sorts of special meanings which different implementations fail on that I have to keep track of. I don't want the One True Way which would be able to serialize everything if only there were parsers to handle it.

JSON. One Page. Done. Any questions?
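
For contrast, here is a minimal sketch of the same two values in JSON, using only the core JSON::PP module. The key names are borrowed from the YAML example; the JSON document itself is made up for illustration. The point is that the quotes alone decide whether a value is a string or a boolean.

```perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# In JSON the quotes decide the type: "Yes" is a string, false is a boolean.
my $data = decode_json('{"file1": "Yes", "file2": false}');

print "file1 is the string '$data->{file1}'\n";
print "file2 is a boolean\n" if JSON::PP::is_bool( $data->{file2} );
print "file2 is false\n"     if !$data->{file2};
```

No implicit typing rules to remember: an unquoted Yes is simply a syntax error in JSON.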


JSON

moritz on 2009-07-31T11:35:31

JSON is much simpler, but to be fair it only supports a subset of what YAML does. For example it's not good for serialization, because you don't have any way to specify type names.

Anyway, JSON being much simpler is why I wrote a JSON parser and generator for Perl 6 (with some help from Johan Viklund), see http://github.com/moritz/json/. As far as I can tell it implements JSON to 100%.

Re:JSON

Ovid on 2009-07-31T12:46:29

The fact that JSON only supports a subset of what YAML does is why I really like it. I know YAML is powerful and feature-rich, but JSON is much more predictable.

Re:JSON

Aristotle on 2009-07-31T13:09:26

Somehow I don’t think that trying to be a serialisation format – for several languages with significant differences – and human-readable and -writeable, all at the same time, is an argument in favour of a format.

Re:JSON

Matts on 2009-07-31T13:27:53

I had this argument with Ingy way back when he was coming up with YAML. He made it over complex (IMHO), resulting in a spec which made XML easier to implement than YAML (32 pages of spec vs nearly 200), which makes no sense since YAML was supposed to be EASIER to read and write than XML.

It's a bit more human readable than XML, but not much, and edge cases like this are going to make it a LOT harder to debug an issue with YAML than with XML.

Will this crap never end

Alias on 2009-07-31T11:53:00

Until your post, I'd never discovered the part of the specification that isn't shown in the actual specification that describes those special strings, so I'd never known to implement them.

2 hours later, it's now fixed. YAML::Tiny will always quote the strings listed in those secondary type specifications.

BTW, it's not case insensitive.

It's apparently ( $str eq uc($str) or $str eq lc($str) or $str eq ucfirst($str) )

Still ugh though.
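
A rough sketch of that casing rule in plain Perl, assuming the word list from the YAML 1.1 boolean type (y, n, yes, no, true, false, on, off). The helper name is_yaml11_bool is made up for illustration and is not part of any YAML module. Note that the capitalised check lowercases the string first, since ucfirst on its own would also let "YeS" through.

```perl
use strict;
use warnings;

# The eight words the YAML 1.1 boolean type recognises, in lowercase.
my %BOOL_WORD = map { $_ => 1 } qw(y n yes no true false on off);

# Hypothetical helper: true if $str would be read as a boolean by a
# parser that follows the YAML 1.1 boolean type.
sub is_yaml11_bool {
    my ($str) = @_;
    return 0 unless defined $str && $BOOL_WORD{ lc($str) };

    # Only three casings count: all lower, all upper, or capitalised.
    return (   $str eq lc($str)
            or $str eq uc($str)
            or $str eq ucfirst( lc($str) ) ) ? 1 : 0;
}

for my $word (qw(Yes off FALSE FalSe file)) {
    printf "%-6s %s\n", $word, is_yaml11_bool($word) ? 'boolean' : 'string';
}
```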

Re:Will this crap never end

Ovid on 2009-07-31T12:45:14

I'll bet you'll break a lot of code with this. I've previously worked with a module (can't recall, but I think it was SOAP related) which was guessing my data types and silently converting things for me. No end of headaches :(

I didn't report this to you because I assumed you had deliberately kept things simple :)

Re:Will this crap never end

Alias on 2009-07-31T13:20:34

The YAML-Tiny language specification does not contain support for these strings, and so the PARSER half of YAML::Tiny (and thus Parse::CPAN::Meta) will not support it.

However, in emitting YAML it's quite appropriate that I make sure to avoid causing problems for others.

So all I'm doing is detecting that the emitted string in a hash or array is one of these magic strings, and then escaping them (when I previously wasn't).
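
Roughly, that emitter-side check could look like the sketch below. This is not YAML::Tiny's actual code: emit_scalar and emit_mapping are hypothetical names, the magic-word list is assumed from the YAML 1.1 boolean type, and only flat hashes of plain strings are handled.

```perl
use strict;
use warnings;

# Strings a YAML 1.1 parser may read as booleans, in every accepted casing.
my %MAGIC = map { $_ => 1 }
    map { ( $_, uc($_), ucfirst($_) ) } qw(y n yes no true false on off);

# Hypothetical emitter helper: wrap magic strings in single quotes so
# that a spec-following parser reads them back as strings.
sub emit_scalar {
    my ($str) = @_;
    return $MAGIC{$str} ? "'$str'" : $str;
}

# Emit a flat hash as a YAML mapping, with keys sorted for stable output.
sub emit_mapping {
    my ($hash) = @_;
    my $yaml = "---\n";
    for my $key ( sort keys %$hash ) {
        $yaml .= "$key: " . emit_scalar( $hash->{$key} ) . "\n";
    }
    return $yaml;
}

print emit_mapping( { file1 => 'Yes', file2 => 'off', file3 => 'FalSe' } );
```

Here file1 and file2 come out quoted, while FalSe is left bare because it is not one of the magic casings.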

Re:Will this crap never end

Ovid on 2009-07-31T14:07:00

Ah! That makes sense. Thanks :)

To each his own...

sigzero on 2009-07-31T16:37:34

For document interchange stuff I use XML and for data interchange stuff I use JSON. That works for me the majority of the time.

implicit typing in Y::Syck, Y::XS

davegaramond on 2009-08-04T11:39:21

As a user (i.e. not a parser writer) I still very much prefer YAML over JSON most of the time due to it being more readable.

But yeah, unfortunately many of the earlier Perl YAML parsers do not behave in a standard (YAML-ish) way, not to mention that they used to crash a lot too.

For YAML::Syck, there's $YAML::Syck::ImplicitTyping = 1; which IIRC was turned on by default in some previous release but then Audrey reverted it back to 0 to avoid compatibility break.

How about everybody just use YAML::XS, then?

Wrong version of the spec.

schwern on 2009-08-11T10:06:22

The boolean documentation you link to is from the 1.1 spec. It's a bit insane to have that many special values. 1.2 took a lawn mower to it and now there's just two bool values, "true" and "false".
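
To make the difference concrete, here is a small sketch comparing the two rules with plain regexes. The 1.1 pattern is the list from the 1.1 boolean type. The 1.2 pattern assumes the strict reading given here (lowercase "true" and "false" only); as far as I know the 1.2 core schema also accepts True, TRUE, False and FALSE, so treat $BOOL_12 as an illustration rather than a full resolver.

```perl
use strict;
use warnings;

# YAML 1.1 boolean pattern, as listed in the 1.1 boolean type.
my $BOOL_11 = qr/\A(?:y|Y|yes|Yes|YES|n|N|no|No|NO
                   |true|True|TRUE|false|False|FALSE
                   |on|On|ON|off|Off|OFF)\z/x;

# Assumption: the strict reading of 1.2, where only lowercase
# "true" and "false" are booleans.
my $BOOL_12 = qr/\A(?:true|false)\z/;

for my $scalar (qw(Yes off true FALSE)) {
    printf "%-6s 1.1: %-7s 1.2: %s\n", $scalar,
        ( $scalar =~ $BOOL_11 ? 'boolean' : 'string' ),
        ( $scalar =~ $BOOL_12 ? 'boolean' : 'string' );
}
```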

YAML::XS gets it right. YAML.pm is being gutted.