Notes

Note 1

To avoid ambiguity, this somewhat arcane note needs to be here. The terms `SGML system' and `SGML application' have precise meanings in the SGML world. An SGML system is a program such as Jade or SP which parses SGML. An SGML application is a collection of DTDs and other supporting documents (see the standard [iso8879], clause 4.279). This package is therefore properly referred to as an SGML application, but since this term could be confusing, I will refer to this package instead as the (Starlink) SGML Set.

Note 2

In fact, there is a version of TeX which produces PDF files directly, and a TeX-to-HTML converter called TeX4ht (see <http://www.tug.org/applications/tex4ht/mn.html>) which writes special DVI files for a postprocessor to work on, and so uses the TeX parser to produce HTML indirectly. This really just pushes the hack elsewhere.

Note 3

Note for pedants: There is a difference between `elements' and `element types': the former are things which appear in documents, with data in them; the latter are the abstract things defined by the DTD. The distinction is not particularly important outside of a DTD, however, so I will not continue to make it in the description of the element types below. It will always be possible to make the distinction from the context anyway.

Note 4

This is unlike XML, which is likely to be written largely by authoring programs.

Note 5

Note that this is different from the default SGML minimisation, which would have <em/emphasised text/. In order to be able to write XML-style <ref/> for empty elements, we had to change the form of this particular minimisation option. This much is a convenient shortcut. SGML defines other tag minimisation locutions, so that <p<em/emph></> is legal. This rarely improves readability, can get you into terrible messes, and is part of the `parser hell' which XML was designed to avoid. I mention it purely for completeness; some of these rather more extreme minimisation functions have been disabled in this SGML application.

Note 6

MathML was considered, but is neither well-supported in browsers, nor designed to be easily written by hand.

Note 7

For example, &amp; inside an <meqnarray> element results in a literal ampersand in the maths, rather than being interpreted as an alignment character. I could go on (you guessed!), but even at this temporal distance, I feel my reader's tolerance for parser detail gurgling down the plughole.

Note 8

Note for SGML initiates: it does seem a little disappointing that none of SGML's various escaping mechanisms could help here, but the fact that the code text has to remain pretty much inviolate (apart from the possibility of a few spaces here and there) is rather restrictive. Another possibility I considered was making the codebody element have CDATA content, even though that's generally deprecated in the most lurid terms. Far from solving the problem, this would make things worse, however: the </ which ends the element would still be magic, you'd have to have an explicit </codebody> closing the element (no entities recognised), and you'd have to have an explicit <codebody> starting the element, since the *- short reference doesn't seem to permit the element to be ended with the </codebody> end-tag.

Note 9

Unusually, the common misspelling `straightened' is also appropriate here.

Note 10

In fact, the sgml2docs command produces a file called doc.tar, so running one of these commands directly after the other would overwrite the output of the first.

Note 11

Note for pedants: there is a distinction between document type declaration and definition. The document type definition is the collection of rules which specifies which elements can go where, what attributes they have, and so on; the declaration is the <!DOCTYPE...> invocation at the top of the document file - the `document instance' in SGML parlance - which associates that instance with a particular definition. The abbreviation `DTD' usually refers to the definition, and it is the definition that this section is about.
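
As a minimal sketch of the distinction, the fragment below shows a declaration at the top of a document instance, followed by a few rules of the kind a definition contains. The element names (`mydoc', `title', `p') and the public identifier are hypothetical, invented purely for illustration; they are not taken from this package's DTDs.

```sgml
<!-- Document instance: the first line is the document type DECLARATION.
     The public identifier and the element names are hypothetical. -->
<!DOCTYPE mydoc PUBLIC "-//Example//DTD My Document//EN">
<mydoc>
<title>An example</title>
<p>Some text.</p>
</mydoc>

<!-- Separate file, located via the public identifier: part of the
     document type DEFINITION, saying which elements can go where. -->
<!ELEMENT mydoc - - (title, p+)>
<!ELEMENT title - - (#PCDATA)>
<!ELEMENT p     - - (#PCDATA)>
```

The declaration is a single line of the document; the definition is the set of rules which that line points to.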

Note 12

Note that this currently doesn't work fully - there's some defect in the HyTime declaration of the docxref element type which I haven't been able to identify.

Note 13

Although I believe Clark was involved with the specification of the DSSSL standard, he has said (on the DSSSList discussion list, May 1999) that the transformation language has significant weaknesses.

Note 14

It's a matter of taste whether you prefer Perl or DSSSL. DSSSL, based on Scheme (a dialect of Lisp), is undeniably odd, but I've grown to rather like it. A language with no assignments, no loops, no real sequences of actions, and where absolutely everything is a function, has a certain rather twisted glamour to it, like a nature programme about fish-life four miles down the Marianas Trench.