Next: 3.5 Marked sections
Up: 3 Marking up your document
Previous: 3.3 Defining entities

3.4 Markup minimisation

SGML was designed to be readable by computers, but it was also designed to be written by humans[Note 4]. It therefore has several features - collectively referred to as markup minimisation - designed to cut down the amount of markup you have to write.

The first such minimisation feature is tag omission: in certain circumstances you may omit tags which are formally redundant. This is possible only when the SGML parser can reliably infer the presence of the omitted tags, and when the DTD author has permitted it. For example, the content of the sect element is a subhead element, followed by zero or more paragraphs (or figures or tables), followed by zero or more subsect elements. That is, the sect element contains the whole section, rather than just its title (as is the case for HTML's H1 element, for example). The subhead element contains a title element, which in turn contains the text of the section heading. In other words, the structure of a section is

<sect>
  <subhead>
    <title>Section 1</title>
  </subhead>
  <p>Paragraph text</p>
</sect>
<sect>
  ... <!-- and so on -->

This would be tedious to type. However, the parser can infer the end of a section from the start of the next one, and the end of a paragraph from the end of the section containing it, so those closing tags can be omitted. Since the section must start with a subhead, which must start with a title, those opening tags are redundant too, and the first paragraph tag implies the end of both. That means that this can be compressed to just

<sect>Section 1
<p>Paragraph text
<sect>
...

Many closing tags can be omitted, and a few opening tags, too (apart from subhead, about the only other one is px, since it is the only permissible content of elements like abstract). If you include tags redundantly it doesn't matter (and may even make the text clearer in some cases), and if you omit them erroneously, the SGML parser will quickly tell you.
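
As a rough sketch of how a DTD author grants this permission: each element declaration carries two omission indicators, one for the start tag and one for the end tag, where `-' means the tag is required and `O' means it may be omitted (provided the OMITTAG feature is enabled in the SGML declaration). The declarations below are an assumption reconstructed from the description above, not a copy of the Starlink DTD; in particular, the #PCDATA content given for title and p is a guess.

<!-- Hypothetical declarations, not taken from the Starlink DTD -->
<!ELEMENT sect    - O (subhead, (p | figure | table)*, subsect*)>
<!ELEMENT subhead O O (title)>
<!ELEMENT title   O O (#PCDATA)>
<!ELEMENT p       - O (#PCDATA)>

With declarations like these, the <sect> and <p> start tags must be typed, while the remaining tags in the example can be inferred by the parser.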

In cases where tags cannot be omitted, you can still cut down the amount you have to type. Instead of typing <em>emphasised text</em>, you can type just <em/emphasised text/, where the second slash closes the element.[Note 5]

In certain cases, attributes may have only one of a limited set of values. In this case, specifying one of these values is enough to indicate which attribute you are referring to. For example, the figure element has a `float' attribute, which can take either of the values `float' or `nofloat'. You might normally specify it as <figure float="nofloat">, but since the attribute values come from a limited set, this can be abbreviated to <figure nofloat>. The same is true of the `export' attribute associated with exported IDs.
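
A minimal sketch of the kind of declaration that makes this possible is shown below. The default value of `float' is an assumption made for illustration; the real default is not stated here.

<!-- Hypothetical attribute declaration for the figure element -->
<!ATTLIST figure float (float | nofloat) float>

<figure float="nofloat">  <!-- full form -->
<figure nofloat>          <!-- minimised form: attribute name omitted -->

Because the attribute is declared with an enumerated list of values, the parser can work back from the value nofloat to the attribute it belongs to. In standard SGML this is unambiguous, since a given value may appear in the list of only one attribute of an element.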

SGML also lets single characters, or short strings of them, stand in for entity references. The character ~ in normal text turns into the entity &nbsp;, and -- (two hyphens) turns into &endash;. Such `short reference' characters are used a little more in the programcode DTD (see Section 5).
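
Short references are set up in the DTD by declaring a map from delimiter strings to entity names, and then attaching that map to an element. The following is a minimal sketch only: the map name textmap and the choice of the p element are hypothetical, while the entity names are those mentioned above.

<!-- Hypothetical short reference map; not taken from the Starlink DTD -->
<!SHORTREF textmap "~"  nbsp
                   "--" endash>
<!USEMAP   textmap p>

With a map like this in force, typing 10~km inside a paragraph would be parsed as if you had typed 10&nbsp;km.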


The Starlink SGML Set
Starlink System Note 70
Norman Gray, Mark Taylor
21 April 1999. Release DR-0.7-13. Last updated 24 August 2001