Bermuda XML

Ovid on 2008-01-23T10:16:06

For those who aren't familiar with Bermuda, it's an object serialization framework. Just write YAML config files (called "islands") and it writes the Perl code and the XML generation code. Eventually other output formats will be supported, along with RELAX NG generation and prebuilt test suites that you can drop your own fixtures into.

I went home last night and continued Bermuda work. I was planning on just relaxing, but I knew I probably wouldn't have time to touch it again until next week.

Much of yesterday was spent paying off technical debt. I had spent a lot of time "exploring" how to build Bermuda, but this meant that instead of passing around objects internally, I was passing around arbitrary data structures that were evolving along with my understanding. Now that I better understand those data structures, I'm slowly pulling them out and replacing them with proper objects. The code is becoming much cleaner, it's easier to use, and I get proper up-front validation.

So far, I can generate decent XML:



  
    John Smith
    js@example.com
    js2@example.com
    111
    222
    yes
  

And the island file is pretty simple:

---
package: My::AddressBook
island: address_book
attributes:
  version:
    type: positiveInteger
  hard-coded:
    data: 'this is from YAML'
elements:
  - card*
  - island: card
    method: cards

Note that the "card" element is a separate island, making it easy to organize things.

The only thing I find really annoying about this is that the "elements" are an even-sized list designed to simulate an ordered hash. YAML doesn't really support the idea of an array of pairs or ordered hashes, so I'm relying on the even-sized list. Here it is from the card island:

elements:
  - name
  - type: string
    attributes:
        some_attr:
            method: dummy_attribute
  - email*
  - type: string
  - phone+
  - method: phone_numbers
    type: string
  - active
  - data: yes

Can you tell at a glance if that is wrong? I can't and it's the format I designed! Each "even" element is a key and each "odd" element is the value (the element description). I'll be modifying the Bermuda::YAML parser to validate things like this, but for right now, it's a tiny annoyance. I'm just pleased with how well it's going.
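As a rough idea of what that validation could look like, here is a minimal sketch that checks the shape of the even-sized list and turns it into ordered pairs. The helper name pairs_from_flat_list is hypothetical and is not part of Bermuda or Bermuda::YAML; it also assumes the YAML has already been loaded into a plain Perl list.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not Bermuda's API): convert the flat,
# even-sized "elements" list into ordered [ name, description ] pairs and
# die with a pointed message when the shape is wrong.
sub pairs_from_flat_list {
    my @flat = @_;

    die "elements list has an odd number of entries (" . scalar(@flat) . ")\n"
      if @flat % 2;

    my @pairs;
    for ( my $i = 0; $i < @flat; $i += 2 ) {
        my ( $name, $spec ) = @flat[ $i, $i + 1 ];

        # Even positions must be plain strings: the element names.
        die "entry $i should be a plain element name\n"
          if !defined $name || ref $name;

        # Odd positions must be hashes: the element descriptions.
        die "entry " . ( $i + 1 ) . " (description of '$name') should be a hash\n"
          unless ref($spec) eq 'HASH';

        push @pairs, [ $name, $spec ];
    }
    return @pairs;
}

# The card island's elements, written as the Perl list a YAML loader
# would hand back.
my @elements = (
    'name',
    {
        type       => 'string',
        attributes => { some_attr => { method => 'dummy_attribute' } },
    },
    'email*', { type   => 'string' },
    'phone+', { method => 'phone_numbers', type => 'string' },
    'active', { data   => 'yes' },
);

my @pairs = pairs_from_flat_list(@elements);
print join( ', ', map { $_->[0] } @pairs ), "\n";
```

Dropping or misplacing any single line of the list makes the helper die instead of silently pairing names with the wrong descriptions.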

Oh, and here's all it takes to render XML for any object you want to throw at it:

sub render {
    my ( $self, $island ) = @_;
    my $xml = XML::LibXML::Document->new( "1.0", "utf-8" );
    my $root = $xml->createElement( $island->name );
    $self->_render( $island, $xml, $root );
    $xml->setDocumentElement($root);
    return $xml->toString;
}

sub _render {
    my ( $self, $island, $xml, $root ) = @_;

    foreach my $attr ( $island->attributes ) {
        $root->setAttribute( $attr, $attr->value );
    }

    foreach my $element ( $island->elements ) {
        my $node = $xml->createElement( $element->name );
        if ( $element->is_island ) {
            $self->_render( $element, $xml, $node );
            $root->addChild($node);
        }
        else {
            if ( my @attrs = $element->attributes ) {
                foreach my $attr ( @attrs ) {
                    $node->setAttribute($attr, $attr->value);
                }
            }
            $node->addChild( $xml->createTextNode( $element->value ) );
            $root->addChild($node);
        }
    }
    return $xml;
}

Not perfect, but not bad. Needs a bit of cleanup. I'm just disappointed that I won't have much time to work on it for a while.


alternative YAML

hdp on 2008-01-23T14:37:27

Would you be less annoyed by a list of single-key hashes?

elements:
  - name:
      type: string
      attributes:
          some_attr:
              method: dummy_attribute
  - 'email*': { type: string }
  - 'phone+':
      method: phone_numbers
      type: string
  - active: { data: yes }

Re:alternative YAML

Ovid on 2008-01-23T14:58:41

I like that. I do think it's clearer than what I had, but the existence of a hash implies that you can have multiple key/value pairs. Still, I don't think that's an expectation violation since these are island files with .bmd extensions and that means people should know they will need to read the docs.
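The "a hash implies multiple pairs" worry can be handled at load time. Here is a minimal sketch, assuming the list of single-key hashes has already been loaded into Perl; the helper name pairs_from_single_key_hashes is hypothetical and is not something Bermuda provides.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not Bermuda's API): accept a list of
# single-key hashes and return ordered [ name, description ] pairs,
# rejecting any entry that does not hold exactly one key.
sub pairs_from_single_key_hashes {
    my @items = @_;

    my @pairs;
    for my $i ( 0 .. $#items ) {
        my $item = $items[$i];
        die "entry $i is not a hash\n" unless ref($item) eq 'HASH';

        my @keys = keys %$item;
        die "entry $i must have exactly one key, found " . scalar(@keys) . "\n"
          unless @keys == 1;

        push @pairs, [ $keys[0], $item->{ $keys[0] } ];
    }
    return @pairs;
}

my @pairs = pairs_from_single_key_hashes(
    { name     => { type   => 'string' } },
    { 'email*' => { type   => 'string' } },
    { 'phone+' => { method => 'phone_numbers', type => 'string' } },
    { active   => { data   => 'yes' } },
);

# The order of the list is preserved even though each entry is a hash.
print "$_->[0]\n" for @pairs;
```

The array carries the ordering and each hash carries exactly one name, so a second key in any entry is reported as an error rather than being picked up in an arbitrary order.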

What's wrong with it?

schwern on 2008-01-28T23:34:50

Color me stupid, but what's wrong with it? Or, put another way, what structure are you trying to express?

Re:What's wrong with it?

Ovid on 2008-01-29T00:00:59

What's wrong with what? If you mean the even-sized list, that should actually be an ordered list of pairs. That's something which Perl 6 can express, but not Perl 5 (at least, not cleanly). As a result, I need an even-sized list in the correct format (which sucks), or a list of hashes, each of which only has one key/value pair. I expect I'll go with the latter.

Or were you referring to something else?

Re:What's wrong with it?

schwern on 2008-01-31T19:21:27

I'm unable to envision the structure you're trying to express. I don't understand how the XML and the two YAML snippets relate to each other.

Some of the assertions confuse me because they seem to be wrong. Like "YAML doesn't really support the idea of an array of pairs or ordered hashes". They address this specifically in the spec. Wouldn't an array of pairs be:

    - key:   value
    - this:  that
    - up:    down

And why does it need to be ordered?

If you can show me what you'd like to express, in some made-up magic formatting language, I think I can show you how to express it in YAML. I'm going to take a stab and say you want this:

elements:
  - name:
      type: string
      attributes:
        some_attr:
            method: dummy_attribute
  - email*:
      type: string
  - phone+:
      method: phone_numbers
      type: string
  - active:
      data: yes

That's assuming the * and + things are part of the actual key name. It translates into:

{
  'elements' => [
    {
      'name' => {
        'type' => 'string',
        'attributes' => {
          'some_attr' => {
            'method' => 'dummy_attribute'
          }
        }
      }
    },
    {
      'email*' => {
        'type' => 'string'
      }
    },
    {
      'phone+' => {
        'type' => 'string',
        'method' => 'phone_numbers'
      }
    },
    {
      'active' => {
        'data' => 'yes'
      }
    }
  ]
};

If you can drop the ordering requirement it becomes even simpler:

elements:
   name:
      type: string
      attributes:
        some_attr:
            method: dummy_attribute
   email*:
      type: string
   phone+:
      method: phone_numbers
      type: string
   active:
      data: yes

which translates to

{
  'elements' => {
    'active' => {
      'data' => 'yes'
    },
    'name' => {
      'type' => 'string',
      'attributes' => {
        'some_attr' => {
          'method' => 'dummy_attribute'
        }
      }
    },
    'phone+' => {
      'type' => 'string',
      'method' => 'phone_numbers'
    },
    'email*' => {
      'type' => 'string'
    }
  }
};