A report about a tool to write XML with Unotal (Writing XML with Unotal), Report, page 721968
http://www.purl.org/stefan_ram/pub/unotal_writing_xml (canonical URI).
Stefan Ram

Writing XML with Unotal

“Writing XML with Unotal” describes how XML documents can be written in Unotal, which allows workarounds for some problems with XML. These documents can then be converted to XML.

A Small Introduction to Unotal

Unotal was developed to represent data in a natural way. A very coarse description of Unotal might be: it is XML with structured attributes. There are also no end tags for elements: an element always starts with a left angle bracket and ends with a right angle bracket, and in Unotal it is actually called a “room”. The type of a room is marked with an ampersand "&".

A tower description in Unotal
< &towerdescription
height=<40 &meter>
name=<[Miller tower] &text>>

A string may be written as it is, but must be enclosed in brackets if it contains special characters, such as a blank. The position of the room type within a room is not fixed, so that the room "<&meter 40>" might also be written as the room "<40 &meter>".
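
The free position of the room type can be illustrated with a small Python sketch. The class "Room" and the function "parse_flat_room" are hypothetical names introduced only for this illustration; they are not part of any Unotal implementation, and the sketch only handles flat rooms made of bare words (no nesting, no brackets, no attributes).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Room:
    """Hypothetical in-memory model of a room: a type, attributes, and body entries."""
    type: Optional[str] = None
    attributes: dict = field(default_factory=dict)
    body: list = field(default_factory=list)


def parse_flat_room(text: str) -> Room:
    """Parse a flat room such as '<40 &meter>' (assumption: whitespace-separated bare words)."""
    inner = text.strip()
    if not (inner.startswith("<") and inner.endswith(">")):
        raise ValueError("a room starts with '<' and ends with '>'")
    room = Room()
    for token in inner[1:-1].split():
        if token.startswith("&"):
            # The ampersand marks the room type, wherever it appears in the room.
            room.type = token[1:]
        else:
            room.body.append(token)
    return room


# Both spellings yield the same model, because the type position is not fixed.
same = parse_flat_room("<&meter 40>") == parse_flat_room("<40 &meter>")
```

Under these assumptions, "same" is true: the two rooms differ only in where the type was written.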

Unotal gives the writer the freedom to always use attribute names for property names and element type names for data type names, because:

In Unotal, attribute values might be structured.

Thus, one can tell at a glance what is a property name and what is a type name.

Writing XML with Unotal

Because Unotal allows a more natural notation of data, one might want to write documents in Unotal instead of XML and then have them converted to XML. For example, a natural tower description can be given as follows.

A tower description in Unotal
< &towerdescription
height=<40 &meter>
name=<[Miller tower] &text>>

An XML language (DTD) might require the description to be written as follows.

A tower description in XML
<towerdescription name="Miller tower">
<height><meter>40</meter></height>
</towerdescription>

Above, "height" semantically is not a type of the element, but the name for the relation between the towerdescription and the value "<meter>40</meter>", i.e., the rôle of that value. But in XML, this cannot be written with the proper means, i.e., as an attribute, so an element has to be abused for that purpose.

Because attribute values may not be structured in XML, one has to abuse element types for property names there, although these are not types of one thing alone at all, but names for the relation between two things.

How should the converter know that the name attribute has to be converted to an XML attribute, but the height attribute has to be converted to an XML subelement? A special description file might provide this information. But for a recent project another approach was chosen, where the writer needs to know something about the XML structure, but can still express whether a name is a property name or a type name.

A tower description in Unotal for XML conversion
< &towerdescription
name=[Miller tower]
height-<<40 &meter>>
>

Here the hyphen (to be pronounced as “is”) is used as a replacement for the equals sign, but it carries the “hint” to the XML writer that the property is to be written as an XML subelement instead of an XML attribute. This way, it can still be expressed that the name on the left-hand side of the hyphen is a property name and not a type name. So property names are written on the left side of an equals sign or a hyphen, while type names are written as type names. Property names written with an equals sign are converted to XML attributes, and property names written with a hyphen are converted to XML subelements with the property name as the element type name. The double angle brackets are required only if a type is specified, because this has to be translated into "<height><meter>40</meter></height>", i.e., into two elements, where the given property name is used as the type of the outer XML element.

A note on the implementation: Unlike the equals sign, the hyphen used in this way has no special meaning in Unotal itself. So the above tower description in Unotal is a room with the type "towerdescription" and an attribute "name". It has three entries in its body, i.e., the entry "height", the entry "-", and the entry "<<40 &meter>>". The hyphen is interpreted by the XML writer: If the right-hand side is a room, the left-hand side becomes its type and the hyphen is removed. So the intermediate representation of the text "height-<<40 &meter>>" is the room "<&height <40 &meter>>", which is then converted to the XML element "<height><meter>40</meter></height>". The XML writer thus first rewrites the above tower description into the following intermediate tower description.

An intermediate representation of the room given above
< &towerdescription
name=[Miller tower]
< &height <40 &meter>>
>
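
The rewriting step described in the implementation note can be sketched in Python as follows. This is only an illustration under assumptions, not the actual XML writer: the class "Room" and the function "resolve_hyphens" are hypothetical, body entries are modelled as plain strings or nested rooms, and two behaviours are guesses that the text does not specify (a plain text value is wrapped in a new room, and an already typed room on the right-hand side is rejected).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Room:
    """Hypothetical model of a room: a type, attributes, and body entries."""
    type: Optional[str] = None
    attributes: dict = field(default_factory=dict)
    body: list = field(default_factory=list)


def resolve_hyphens(room: Room) -> Room:
    """Return a copy of room in which every 'name - value' triple became one typed room."""
    entries, out, i = room.body, [], 0
    while i < len(entries):
        entry = entries[i]
        is_triple = (isinstance(entry, str) and i + 2 < len(entries)
                     and entries[i + 1] == "-")
        if is_triple:
            value = entries[i + 2]
            if isinstance(value, Room):
                if value.type is not None:
                    # Assumption: a typed value must be wrapped in double angle brackets.
                    raise ValueError("right-hand room already has a type")
                inner = resolve_hyphens(value)
                # The left-hand side becomes the type of the right-hand room.
                out.append(Room(entry, inner.attributes, inner.body))
            else:
                # Assumption: plain text is wrapped in a new room of that type.
                out.append(Room(entry, {}, [value]))
            i += 3  # the name, the hyphen, and the value are consumed together
        else:
            out.append(resolve_hyphens(entry) if isinstance(entry, Room) else entry)
            i += 1
    return Room(room.type, dict(room.attributes), out)


# The tower description: body entries "height", "-", and the room <<40 &meter>>.
tower = Room("towerdescription", {"name": "Miller tower"},
             ["height", "-", Room(body=[Room("meter", body=["40"])])])
resolved = resolve_hyphens(tower)
```

After this step, the body of "resolved" holds a single room of type "height" that contains the room of type "meter", which corresponds to the intermediate representation above.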

This modified tower description can then be converted to XML in a straightforward way by writing Unotal attributes as XML attributes and Unotal rooms as XML elements.

A tower description in XML generated from the above Unotal description
<towerdescription name="Miller tower">
<height><meter>40</meter></height>
</towerdescription>
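
The final conversion (attributes become XML attributes, rooms become XML elements) might look like the following Python sketch. It is an illustration under assumptions, not the converter used for the project: "Room" and "to_element" are hypothetical names, and the XML is produced with the standard library module xml.etree.ElementTree, which writes the element on one line without the line breaks shown above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Room:
    """Hypothetical model of a room: a type, attributes, and body entries."""
    type: Optional[str] = None
    attributes: dict = field(default_factory=dict)
    body: list = field(default_factory=list)


def to_element(room: Room) -> ET.Element:
    """Convert a typed room to an XML element, recursing into nested rooms."""
    if room.type is None:
        raise ValueError("only typed rooms can become XML elements")
    # Unotal attributes are written as XML attributes.
    element = ET.Element(room.type, dict(room.attributes))
    last_child = None
    for entry in room.body:
        if isinstance(entry, Room):
            # Unotal rooms are written as XML child elements.
            last_child = to_element(entry)
            element.append(last_child)
        elif last_child is None:
            # Text before the first child element.
            element.text = (element.text or "") + entry
        else:
            # Text following a child element.
            last_child.tail = (last_child.tail or "") + entry
    return element


# The intermediate representation of the tower description.
tower = Room("towerdescription", {"name": "Miller tower"},
             [Room("height", body=[Room("meter", body=["40"])])])
xml_text = ET.tostring(to_element(tower), encoding="unicode")
```

The string "xml_text" then contains the towerdescription element with the attribute "name" and the nested "height" and "meter" elements.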

Here is an example of an XHTML document written with Unotal. Some attributes that require special treatment by the converter have names beginning with a dot ".".

An XHTML document written in Unotal
< &xml

  .xmldecl = [version = "1.0" encoding="UTF-8"]

  .doctype = [html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"]

  html -

  < xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"

    head - < title - [Virtual Library] >

    body - << &p [Moved to ]
< &a href=[http://vlib.org/] [vlib.org] > [.] >>
  >
>

One can see that the head attribute and the body attribute of the html element can be written as what they are: as attributes of the html room. This gives a clear contrast, so that the paragraph type "p" and the anchor type "a" stand out as what they are: as types of their rooms. The brackets make it obvious which spaces (blanks) are a significant part of the text and which are not (i.e., the spaces inside of brackets are a significant part of the text, the spaces outside of the brackets are not).

Also, the text "head - < title - [Virtual Library] >" (where the hyphen is pronounced “is”) seems to be quite readable and writable, compared with the text "<head><title>Virtual Library</title></head>".

The hyphen notation allows properties with the same name to be repeated and retains the order of the property definitions. Both can be important when creating XML documents. (The order of the Unotal attributes is not significant and thus might get lost during processing.)

Multiple occurrences of properties with the same name and a specific order in Unotal
< &article
keyword-<alpha>
keyword-<beta>
>
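
Why repetition and order matter can be shown with a short Python sketch. It rests on an assumption: attributes are modelled as a dict (one value per name) and body entries as a list. This only mirrors the distinction described above and says nothing about how a Unotal implementation actually stores them.

```python
import xml.etree.ElementTree as ET

# The two keyword definitions from the article room, in their written order.
pairs = [("keyword", "alpha"), ("keyword", "beta")]

# Modelled as attributes (a dict, by assumption), the second definition
# replaces the first, so one keyword is lost.
as_attributes = dict(pairs)

# Modelled as body entries (a list), both definitions and their order survive
# and can be written as repeated XML subelements.
article = ET.Element("article")
for name, text in pairs:
    ET.SubElement(article, name).text = text
xml_text = ET.tostring(article, encoding="unicode")
```

In this sketch, "as_attributes" keeps only a single keyword, while "xml_text" contains both keyword elements, alpha before beta.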

Another Example

Another XML example
<web-resource-collection>

  <web-resource-name>User Section</web-resource-name>

  <description>no description</description>

  <url-pattern>/protected/*</url-pattern>

  <http-method>POST</http-method>

  <http-method>GET</http-method>

</web-resource-collection>

The preceding example written in Unotal
web-resource-collection -

< web-resource-name - [User Section]
description - [no description]
url-pattern - [/protected/*]
http-method - [POST]
http-method - [GET]
>

"ram@zedat.fu-berlin.de" (without the quotation marks) is the email-address of Stefan Ram.   |   Copyright 2004 Stefan Ram, Berlin. All rights reserved. This page is a publication by Stefan Ram.