unqualified XML schema

jozef on 2009-02-09T11:26:32

Last week I spent a significant amount of time troubleshooting SOAP messages, checking XML messages and reading WSDL and XSD files.

The outcome: always pay attention to a little switch on the XSD root element, elementFormDefault="qualified". If it is not set, it defaults to unqualified. unqualified is only good, and should only be used, when we need XML whose elements have an empty namespace. As soon as namespaces come into play, always set elementFormDefault="qualified". This ensures that all elements have their proper namespace.

Some examples to explain:

Imagine we want to have a schema for the following XML:

<note>
  <text>huh?</text>
</note>

Then the XSD will look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note" type="noteText"/>
  <xs:complexType name="noteText">
    <xs:sequence>
      <xs:element name="text" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

So far so "unqualified" good. When the xs:schema root element gets the attribute targetNamespace="http://justns/", the valid XML looks a bit different:

<ns1:note xmlns:ns1="http://justns/">
  <text>huh?</text>
</ns1:note>

Note that the "note" element has a namespace while the "text" element has an empty namespace. This was the cause of my troubles. First, XML::Compile needs an extra option "elements_qualified => 'TOP'" to work properly with this kind of mix. The second problem starts when embedding this XML in another XML document, such as a SOAP envelope. If the envelope sets a default "xmlns", then all the unprefixed elements that should have no namespace inherit the default one, which is wrong.
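The second problem can be reproduced with a few lines of Python. This is only a sketch using the standard library's ElementTree; the wrapper element and its namespace "http://example.org/wrapper" are made up and merely stand in for a SOAP envelope that declares a default namespace.

```python
import xml.etree.ElementTree as ET

# The unqualified message from above: "note" is in the target namespace,
# the local "text" element is in no namespace.
note = '<ns1:note xmlns:ns1="http://justns/"><text>huh?</text></ns1:note>'

# Hypothetical wrapper with a default namespace (a stand-in for a SOAP
# envelope); the namespace URI is invented for this illustration.
wrapped = (
    '<envelope xmlns="http://example.org/wrapper"><body>'
    + note
    + '</body></envelope>'
)


def expanded_names(xml_text):
    """Return the tag of every element as ElementTree sees it: {namespace}local."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


standalone = expanded_names(note)
embedded = expanded_names(wrapped)

print(standalone)  # "text" has no namespace here
print(embedded)    # "text" silently picked up the wrapper's default namespace
```

Pasted into the wrapper as-is, "text" is no longer the element the schema describes, so the message stops validating even though not a single character of it changed.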

Mark Overmeer says: "Unqualified is out of fashion: has no future." He is right. Unqualified is useful only for small XML documents where there will NEVER EVER be any namespace.

Just for completeness, here is the example XML+XSD with qualified turned on:

<ns1:note xmlns:ns1="http://justns/">
  <ns1:text>huh?</ns1:text>
</ns1:note>

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://justns/"
    targetNamespace="http://justns/"
    elementFormDefault="qualified">
  <xs:element name="note" type="tns:noteText"/>
  <xs:complexType name="noteText">
    <xs:sequence>
      <xs:element name="text" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>


Namespaces aren't for humans

potyl on 2009-02-09T18:24:36

The way namespaces are implemented in XML is not very helpful for us humans. An XML parser can keep track of the namespace of a node by checking the prefix used and going up in the DOM until it finds the matching namespace declaration or, if there's no prefix, by looking for the closest ancestor that defines a default namespace.
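That lookup can be sketched by hand in Python with the standard library's minidom. The helper name resolve_namespace is made up for this illustration; a real parser does the same thing internally and exposes the result as namespaceURI.

```python
from xml.dom import minidom


def resolve_namespace(element):
    """Walk up from `element` to the closest matching xmlns declaration.

    Hypothetical helper for illustration: a prefixed element looks for
    xmlns:<prefix>, an unprefixed one for a default xmlns declaration.
    """
    prefix = element.prefix  # None when the element has no prefix
    attr = "xmlns:" + prefix if prefix else "xmlns"
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute(attr):
            # xmlns="" undeclares the default namespace, hence "or None"
            return node.getAttribute(attr) or None
        node = node.parentNode
    return None  # no declaration found: the element has no namespace


doc = minidom.parseString(
    '<ns1:note xmlns:ns1="http://justns/"><text>huh?</text></ns1:note>'
)
note = doc.documentElement
text = note.getElementsByTagName("text")[0]

resolved = {el.tagName: resolve_namespace(el) for el in (note, text)}
print(resolved)
```

The hand-rolled result should agree with what the parser itself reports in note.namespaceURI and text.namespaceURI.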

For a human this is very error prone, especially if the XML file is generated by a program and the namespaces are redeclared everywhere. It gets worse if the same namespace is bound to different prefixes, which is perfectly legal. Trying to match the namespaces manually under such conditions can be quite tedious. Anyone who has parsed an Excel XML file by hand knows how difficult this can get!

The thing is that namespaces are vital and can't be avoided, so we have to learn to live with them. They become even more necessary when mixing different XML applications (SVG, XHTML, XSLT, etc.).

This is exactly why I created Xacobeo. The main goal was to display the XML document and the DOM tree as an XML parser sees them and not as they are declared in the file. For the latter purpose any text editor or pager (less and more) does the job. But for seeing the namespaces of the nodes, as far as I know, nothing existed.

If you look carefully you will see that Xacobeo always displays each XML node with the proper namespace prefix, no matter whether the original node used a prefix or relied on the default namespace. This makes it trivial to see the namespace of each node at a glance.
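The idea can be illustrated with a small Python sketch. This is not Xacobeo's implementation, just an assumption of how such a view could be produced with the standard library; show_tree and the prefix table are names invented here.

```python
import xml.etree.ElementTree as ET


def show_tree(xml_text, prefixes):
    """Return one line per element, written as prefix:local and indented by depth.

    `prefixes` maps a namespace URI to the prefix used for display,
    regardless of how the document itself spelled that namespace.
    """
    lines = []

    def walk(el, depth):
        if el.tag.startswith("{"):
            uri, local = el.tag[1:].split("}", 1)
            name = prefixes.get(uri, "?") + ":" + local
        else:
            name = el.tag  # element in no namespace
        lines.append("  " * depth + name)
        for child in el:
            walk(child, depth + 1)

    walk(ET.fromstring(xml_text), 0)
    return lines


display = {"http://justns/": "ns1"}

# Same document, once relying on a default namespace and once using two
# different prefixes for the same namespace.
with_default = '<note xmlns="http://justns/"><text>huh?</text></note>'
with_prefixes = (
    '<a:note xmlns:a="http://justns/">'
    '<b:text xmlns:b="http://justns/">huh?</b:text></a:note>'
)

view_default = show_tree(with_default, display)
view_prefixes = show_tree(with_prefixes, display)
print("\n".join(view_default))
```

Both spellings come out as the same tree, which is exactly what is hard to see when reading the raw file.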