]> XML briefing <authorlist> <author email='norman@astro.gla.ac.uk' webpage='http://www.astro.gla.ac.uk/users/norman/' affiliation='University of Glasgow' id=ng >Norman Gray <keyword>XML <keyword>XSL <keyword>XPointer <keyword>XLink <keyword>SGML <keyword>HyTime <history> <version date='03-FEB-1998' author=ng number=1>Initial version <distribution string=1 date='20-OCT-1998' author=ng >Initial distribution <change date='09-JAN-1999' author=ng>Updated to new version of DTD. Various typos corrected. <change date='15-JUN-1999' author=ng>DTD still in flux... </history> <abstract>This is a short briefing on XML, and its relationship with SGML. It is intended as a brief overview, and pointer to more detailed resources.</abstract> <p>The primary XML resources are <ul> <li><url>http://www.w3.org/XML</url> for the W3C's XML spec <li><url>http://www.ucc.ie/xml/</url> for the XML FAQ <li><url>http://www.xml.com/xml/pub/axml/axmlintro.html</url> for the annotated spec </ul> <p>I also have a <webref url='&bookmarks/lists-xml.html' >collection of pointers</webref> with links to other important resources. <sect>XML is SGML-- <p><ul> <li>An SGML document consists of an `SGML declaration' which sets various options, a `document type definition' (DTD) which establishes the syntax of a document type, and a `document instance', which is the actual document. <li>XML has a single, fixed, SGML declaration, which sets most of the SGML options to `off'. For example, in XML all element names are case sensitive, there is no tag omission, there are some restrictions on the possible syntaxes expressible by the DTD, and more exotic features such as SUBDOC are forbidden. For a more detailed discussion of the differences, see <webref url='http://www.w3.org/TR/WD-xml-lang.html#secA.' >appendix A</webref> of the spec. <li>This means that parsers are easy to write, and there are numerous such parsers available for free. 
<li><p>Because the SGML declaration is fixed, an XML document need not supply one; it can dispense with a DTD as well. XML introduces the notion of `well-formed' versus `valid' documents. <p>If a document has all closing tags present, all elements properly nested, all attribute values quoted, and empty elements written <code><![cdata[<empty/>]]></code>, then it is `<webref url='http://www.ucc.ie/xml/#FAQ-WF' >well-formed</webref>', and may be processed in the absence of a DTD. Such a document should start with the declaration <blockquote><code><![ cdata [ <?xml version="1.0" standalone="yes"?> ]]></code></blockquote> <p>A file which has a DTD and which conforms to it (and which will therefore also be well-formed) is `<webref url='http://www.ucc.ie/xml/#FAQ-VALID' >valid</webref>'. It may optionally also begin with the XML declaration <blockquote><code><![cdata[<?xml version="1.0"?>]]></code></blockquote> <li>That is, a valid XML document is also a conforming SGML document. This has been made possible by recent subtle technical changes to the SGML standard. <li>Those changes came about through close cooperation between the developers of XML and the wider SGML community. In short, XML is fully legitimate SGML. </ul> <sect>XML is not HTML++ <p><ul> <li>XML is a real standard (well, there <em/are/ HTML standards, but no one pays any attention to them). <li>HTML has a fixed element set, and associates fixed semantics with those elements. XML has neither restriction. </ul> <sect>Associated standards <p><ul> <li><webref url='http://www.w3.org/TR/WD-xlink' >XLink</webref> is a draft specification for links in XML. It's closely related to the hyperlinks module of HyTime. <li><webref url='http://www.w3.org/TR/WD-xptr' >XPointer</webref> is a draft specification for location specifiers in XML, so that you can refer, for example, to `the second section beneath the element with id so-and-so'. As with XLink, it's closely related to HyTime. <li><webref url='http://www.w3.org/Style/XSL/' >XSL</webref> provides style sheets for XML. 
These are vital if XML is to be readable when it is served over the web (because XML doesn't have the fixed semantics HTML has, its rendering can't be left entirely to a browser). <li>The <webref url='http://www.w3.org/DOM/' >Document Object Model</webref> (DOM) is `a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents' [from the spec]. It's a simple set of object-oriented declarations for querying and manipulating XML documents in simple ways (a small subset of what DSSSL offers). <li><webref url='&bookmarks/lists-hytime.html' >HyTime</webref> is a very high-level standard for associating semantics with SGML DTDs. <li><webref url='&bookmarks/lists-dsssl.html' >DSSSL</webref> is the Document Style and Semantics Specification Language. It's a language for writing stylesheets. Both HyTime and DSSSL are specific to SGML, but have informed the other standards above. </ul> <sect>Future developments <p><ul> <li>Existing SGML technology will work with XML, as long as that technology conforms to the minor technical corrigenda to the SGML standard mentioned above. <li>Because XML is <em/much/ easier to parse than fully general SGML, it is <em/much/ easier to produce parsers for it. It is therefore very likely that we will see many XML editors and XML-aware browsers in the months to come. <li>We should also see XML-aware search engines, potentially finally realising the possibilities offered by highly structured information storage and retrieval. <li>The development of <webref url='&bookmarks/lists-maths.html' >MathML</webref> should help bring maths to the internet. </ul>
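<p>As a rough illustration of the `well-formed' notion described under `XML is SGML--', the sketch below uses the parser in Python's standard library to test whether a string is well-formed XML. The function name <code>is_well_formed</code> and the sample documents are invented for this example and are not part of any specification. The underlying parser is non-validating, so this checks well-formedness only, never validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML (no DTD is consulted)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Raised for unclosed tags, bad nesting, unquoted attributes, etc.
        return False
    return True


# Hypothetical sample documents, for illustration only.
GOOD = '<?xml version="1.0" standalone="yes"?><doc><empty/><p>text</p></doc>'
BAD_NESTING = "<doc><a><b></a></b></doc>"    # elements overlap
BAD_UNCLOSED = "<doc><p>text</doc>"          # <p> is never closed
BAD_ATTRIBUTE = "<doc id=ng>text</doc>"      # attribute value not quoted
```

Calling <code>is_well_formed</code> on the first sample should succeed, while the other three should be rejected. The last two are the kind of markup that SGML with tag omission and unquoted attribute values would accept but XML does not.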