XML at Work—Introduction

Axel Kramer & Patricia Hallstein, 1999-05


  1. What is XML?
  2. How does XML fit into the Web-World?
  3. Why would one want to use XML?

1. What is XML?

XML is a textual language for representing data. Its heritage is twofold: on the one hand SGML, a general markup language used for document processing, and on the other hand the web with its ever-growing demand to present, process, and interchange more sophisticated content. In some sense XML is a meta-language: it allows users to design their own languages on the basis of a common syntax and an (albeit simple) description of the language-specific semantics.

In contrast to other mechanisms which represent data textually (e.g. tab-delimited text files), data described using an XML language is more self-descriptive: each data element is wrapped in element tags which identify the kind of data it is. Furthermore, the file as a whole, or groups of elements, can be designated to adhere to a particular grammar. This is currently done with DTDs (document type definitions). DTDs are not very expressive, and there is a movement within the XML community to replace them with a more advanced schema definition language which can capture more of the semantics of an XML language.
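The difference in self-description can be sketched in a few lines. The following is a minimal illustration in Python, using only the standard library; the trade record, its element names, and the currency attribute are hypothetical sample data, not part of any real XML language.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same record in two textual forms.
TAB_DELIMITED = "ACME\t100\t12.50\n"

XML_TEXT = """<trade>
  <symbol>ACME</symbol>
  <quantity>100</quantity>
  <price currency="USD">12.50</price>
</trade>"""

# Tab-delimited: the meaning of each column lives outside the file.
row = next(csv.reader(io.StringIO(TAB_DELIMITED), delimiter="\t"))
symbol_from_tsv = row[0]  # the reader must already know column 0 is the symbol

# XML: every value is wrapped in a tag that names the kind of data it is.
trade = ET.fromstring(XML_TEXT)
fields = {child.tag: child.text for child in trade}
currency = trade.find("price").get("currency")
```

Note that this sketch only parses the data; it does not check it against a DTD, since the parser used here does not validate.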

This basic idea of XML is surrounded by a number of other core concepts, some of which are in the early stages of standardization. The most important ones, beyond the already mentioned schema languages, are: namespaces for modularization and extensibility, XLink and XPointer for expressing networked and distributed data structures, and XSL (Extensible Stylesheet Language) for transforming and reordering data represented in XML. Once all those core concepts become stable standards, designers and implementors of information spaces and web-centric systems will have a powerful framework in which they can realize the requirements of weblications and information systems.
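As a small illustration of how namespaces support modularization, the sketch below mixes two vocabularies in one document without a name clash. It is written in Python with the standard library; the namespace URIs and element names are hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; they only need to be unique names.
DOC = """<report xmlns:fin="http://example.com/ns/finance"
        xmlns:doc="http://example.com/ns/docs">
  <fin:title>Q1 trades</fin:title>
  <doc:title>Release notes</doc:title>
</report>"""

# Prefix-to-URI mapping used for lookups; the prefixes are local shorthand.
NS = {
    "fin": "http://example.com/ns/finance",
    "doc": "http://example.com/ns/docs",
}

root = ET.fromstring(DOC)

# Both elements are called "title", but the namespace keeps them apart.
fin_title = root.find("fin:title", NS).text
doc_title = root.find("doc:title", NS).text
```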

2. How does XML fit into the Web-World?

XML can be used on the server or on the client. Its usage on the server can be transparent to the server itself: XML can be transformed into HTML statically by some process whenever required (e.g. when the XML data changes). In this case the server serves regular HTML pages, not knowing about XML at all. Another way to use XML on the server is to generate HTML dynamically from XML. A Java servlet, or some CGI executable, takes either XML alone, or XML and XSL, and produces HTML as the result of an HTTP request. This approach is useful if the XML data changes rapidly, or if the number of permutations of required HTML pages would be too large for the static approach. In both cases the data itself can originate in object-oriented or relational databases which support XML directly, it can be stored in regular files, or it can be created on the fly.
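Both server-side approaches can share the same transformation step. The following is a rough sketch under stated assumptions: Python's standard library stands in for the servlet or CGI executable, a hand-written function stands in for an XSL transformation, and the product data and all names (PRODUCTS_XML, xml_to_html, write_static_page, ProductsHandler) are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape
from http.server import BaseHTTPRequestHandler

# Hypothetical XML data; in practice it could come from a file or a database.
PRODUCTS_XML = """<products>
  <product><name>Widget</name><price>9.99</price></product>
  <product><name>Gadget &amp; Co</name><price>24.00</price></product>
</products>"""


def xml_to_html(xml_text):
    """Transform the product list into an HTML table."""
    root = ET.fromstring(xml_text)
    rows = []
    for product in root.findall("product"):
        # Escape the text so characters such as "&" stay valid in HTML.
        name = escape(product.findtext("name", default=""))
        price = escape(product.findtext("price", default=""))
        rows.append(f"<tr><td>{name}</td><td>{price}</td></tr>")
    return "<html><body><table>\n" + "\n".join(rows) + "\n</table></body></html>"


def write_static_page(path):
    """Static approach: regenerate the HTML file whenever the XML changes."""
    with open(path, "w", encoding="utf-8") as out:
        out.write(xml_to_html(PRODUCTS_XML))


class ProductsHandler(BaseHTTPRequestHandler):
    """Dynamic approach: build the HTML anew on each HTTP request."""

    def do_GET(self):
        body = xml_to_html(PRODUCTS_XML).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

In the static case, write_static_page would be called by whatever process notices a change in the XML, and an ordinary web server then delivers the resulting file. In the dynamic case, the handler could be served with http.server.HTTPServer(("localhost", 8000), ProductsHandler).serve_forever().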

On the client, XML can be processed either through a browser or with standalone applications. The current browser generation supports XML only in limited ways, but both Netscape and Microsoft promise better support in the next versions. Internet Explorer 4.0 has some built-in support for XML: an XML parser, an XML object model which can be accessed from Visual Basic or JScript, and a data binding facility in which XML files can be used as a data source for HTML table generation. Internet Explorer 5.0, although currently only available in a preview release, promises additional features such as XML data islands (which allow direct embedding of XML into HTML files), application of style sheets to XML elements, and adherence to the W3C document object model for HTML and XML.

The dependency on browser versions can be reduced by using Java applets for processing and presenting XML files. The disadvantage is that the browser's built-in features for text and table layout cannot be used in a platform-independent way. This will remain the case until all browser suppliers support a uniform document object model (e.g. the W3C one) which can be altered after the HTML file has been read; Netscape, for example, does not allow changes to the document after the HTML file has been read and the scripts processed. Yet using Java proves beneficial for small and simple transformations and processing of XML data files, e.g. navigation bars such as hierarchical menus derived from hierarchical data structures, where another frame on the HTML page is the target of the navigation.
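Deriving a hierarchical menu from hierarchical data amounts to a recursive walk over the XML tree. The sketch below shows that walk in Python rather than in a Java applet, purely to keep the illustration short; the menu structure, element names, and attributes are hypothetical, and the menu is rendered as indented text instead of a user interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical site structure; element and attribute names are illustrative.
SITE_XML = """<menu>
  <item title="Products" href="products.html">
    <item title="Widgets" href="widgets.html"/>
    <item title="Gadgets" href="gadgets.html"/>
  </item>
  <item title="Contact" href="contact.html"/>
</menu>"""


def menu_lines(node, depth=0):
    """Walk the tree depth-first and return one indented line per menu entry."""
    lines = []
    for item in node.findall("item"):
        lines.append("  " * depth + f"{item.get('title')} -> {item.get('href')}")
        # Nested items become sub-entries, indented one level deeper.
        lines.extend(menu_lines(item, depth + 1))
    return lines


print("\n".join(menu_lines(ET.fromstring(SITE_XML))))
```

In an applet, each entry would become a clickable menu node, and the href value would be loaded into the target frame.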

Java is naturally also beneficial for very complex applications whose user interfaces involve processing and manipulation that are hard to realize with browsers anyway. In this case the difference between a standalone application and an application embedded into a browser is minimal.

3. Why would one want to use XML?

In abstract terms there are three main benefits of using XML: it enables a clear description of domain knowledge, it furthers the interchange of data between disparate systems, and it supports a separation between domain knowledge and representation/processing.

Let's highlight those benefits in two examples.

Financial systems:

  • Flow between front-office, mid-office, back-office systems.
  • Streamlining communication with clients.
  • New web-based services for clients (e-commerce).
  • Deployment costs scaled to complexity of tasks (thin clients, fat clients).

Documentation systems:

  • Overlap in descriptions for various products.
  • One source/multiple output media.
  • Shared information between all departments, e.g. technical, maintenance, documentation, marketing departments.
  • Streamlining of project and product documentation process (workflow management).

Suggestions? Comments? Questions?

Please let us know what you think: info at 2far.com


Copyright (c) 1999-2006 Patricia Hallstein & Axel Kramer