In the case of the Starlink General DTD, described in Section 4, the important features were the meanings of the element types. In the case of the programcode DTD, however, the meanings of the element types are fairly straightforward, and the detail is in the structure of the DTD. It therefore seems best to focus on the structure of the DTD here, leaving the detailed descriptions of the elements, and their attributes, to Appendix D.
The programcode DTD includes essentially all of the paragraph-level elements in the Starlink General DTD, that is, everything that may be included in a paragraph in that DTD may also be included in a paragraph in the programcode DTD (except `docxref' and `ref', but with the addition of the `funcname' element).
Figure 5 displays the element structure of the programcode DTD. The syntax is that of a DTD - see Appendix A.2 for brief notes on this.
    <!ELEMENT programcode O O (docblock, (codegroup | codereference)+)>
    <!ELEMENT codegroup - O (docblock, routine+)>
    <!ELEMENT codereference - O (docblock)>
    <!ELEMENT docblock O O (title, description?, userkeywords?, softwarekeywords?, authorlist?, copyright?, history?)>
    <!ELEMENT routine O O (codeopener?, routineprologue, codebody)>
    <!ELEMENT codeopener O O (#PCDATA)>
    <!ELEMENT codebody O O (#PCDATA)>
    <!ELEMENT routineprologue O O (
          (routinename, diytopic*)?
        & (moduletype, diytopic*)?
        & (purpose, diytopic*)?
        & (description, diytopic*)
        & (returnvalue, diytopic*)?
        & (argumentlist, diytopic*)?
        & (parameterlist, diytopic*)?
        & (authorlist, diytopic*)?
        & (history, diytopic*)?
        & (usage, diytopic*)?
        & (invocation, diytopic*)?
        & (examplelist, diytopic*)?
        & (implementationstatus, diytopic*)?
        & (bugs, diytopic*)? )>
    <!ELEMENT routinename O O (name, othernames?)>
    <!ELEMENT moduletype - O (#PCDATA)>
    <!ELEMENT name O O (#PCDATA)>
    <!ELEMENT othernames - O (name+)>
    <!ELEMENT purpose - O (%p.model)>
    <!ELEMENT title O O (#PCDATA)>
    <!ELEMENT description - O (%paralist;)>
    <!ELEMENT (userkeywords | softwarekeywords) - O (#PCDATA)>
    <!ELEMENT returnvalue - O (%paralist;)>
    <!ELEMENT (argumentlist | parameterlist) O O (parameter*)>
    <!ELEMENT parameter - O (name, type, description)>
    <!ELEMENT type - O (#PCDATA)>
    <!ELEMENT examplelist O O ((example,description)+)>
    <!ELEMENT example - O (#PCDATA)>
    <!ELEMENT (usage | invocation | implementationstatus | bugs) - O (%paralist;)>
    <!ELEMENT diytopic - O (title, %paralist;)>
    <!ELEMENT copyright - O (%paralist;)>
    <!ELEMENT authorlist O O ((author+ | authorref+), otherauthors?)>
    <!ELEMENT otherauthors - O (author+ | authorref+)>
    <!ELEMENT author - O (name, authornote?)>
    <!ELEMENT authornote - O (%paralist;)>
    <!ELEMENT history O O (change+)>
    <!ELEMENT change - O (%paralist;)>
    <!ELEMENT funcname - - (#PCDATA)>
    <!ELEMENT webref - - (%simpletext)+>
    <!ELEMENT url - - (#PCDATA)>
Figure 5: Element structure of the programcode DTD
Below, I describe these elements group-by-group. This description concentrates on the structure of the DTD and the relationships between the elements - I have not described the details of the elements or their attributes where these can be found in the detailed element listing in Appendix D.
    <!ELEMENT programcode O O (docblock, (codegroup | codereference)+)>
    <!ELEMENT codegroup - O (docblock, routine+)>
    <!ELEMENT codereference - O (docblock)>
    <!ELEMENT docblock O O (title, description?, userkeywords?, softwarekeywords?, authorlist?, copyright?, history?)>
The programcode top-level element, like the codegroup and codereference elements which it contains, starts off with a docblock element. This may provide discussion, author, copyright, and change history information, or it may give as little as a title. Where this information is provided is up to the author of the documentation. The elements in the docblock must be present in the order specified here.
A codegroup element simply gathers together several related functions (this is deliberately vague); it might therefore represent all the functions defined in one source file, or in one directory of a source tree.
A codereference is even vaguer: it documents a relationship between the current programcode document and another one. In the case of the DSSSL DTD, this is mapped to the structure in that language which includes one source file in another; in the case of the Fortran DTD, it could document the dependence of a source file on an `include' file.
    <!ELEMENT routine O O (codeopener?, routineprologue, codebody)>
    <!ELEMENT codeopener O O (#PCDATA)>
    <!ELEMENT codebody O O (#PCDATA)>
    <!ELEMENT routineprologue O O (
          (routinename, diytopic*)?
        & (purpose, diytopic*)?
        & (description, diytopic*)
        & (returnvalue, diytopic*)?
        & (argumentlist, diytopic*)?
        & (parameterlist, diytopic*)?
        & (authorlist, diytopic*)?
        & (history, diytopic*)?
        & (usage, diytopic*)?
        & (invocation, diytopic*)?
        & (examplelist, diytopic*)?
        & (implementationstatus, diytopic*)?
        & (bugs, diytopic*)? )>
A routine element documents a function, with arguments, a return value, and the like.
The codebody element is ignored by the processing system, but is still scanned by the parser. This could cause you a problem if there's anything in there which looks like something the parser would be interested in, namely an element start-tag, an entity reference, or something that looks like a markup declaration. The ampersand and left angle-bracket are only recognised as markup if they are immediately followed by a name-start character (an upper- or lowercase letter); a markup declaration is anything starting with the string <!...
If the parser trips up on something, there are two things you can do. You can make minor edits to your source code (adding a space character will always be enough) to stop things looking like markup: <a is the beginning of an element start-tag, but < a, with an interpolated space, is not. Alternatively, you can bracket the code in a CDATA marked section (Section 3.5.1) as follows:
    *     ... end of code prologue
    *-<![CDATA[
          ...fortran code including <ignored &markup...
    *]]>
Which of these alternatives you prefer is largely a matter of taste, I think, but remember that you'll only have to do this for those source-code files which you include within your documentation. It is undeniable that these strategies are ugly, but something like this is fairly inevitable as the downside of having your source processable by more than one system at once. If both of these are unacceptable to you (on aesthetic grounds if nothing else), then you can always preprocess your sources to strip the code out and leave a `pure' SGML document.[Note 8]
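The two workarounds above could be automated in a preprocessing step. The following Python fragment is only a sketch under stated assumptions: the function names are hypothetical and not part of the Starlink tools, and the pattern implements just the rule quoted above (an ampersand or left angle-bracket followed by a letter, or the string <!), which is a simplification of what a real SGML parser recognises.

```python
import re

# The rule described in the text: '&' or '<' immediately followed by a
# letter, or the string '<!'. This is an illustrative simplification.
MARKUP_LIKE = re.compile(r"[&<](?=[A-Za-z])|<(?=!)")


def space_out(code: str) -> str:
    """Insert a space after each delimiter that would be read as markup."""
    return MARKUP_LIKE.sub(lambda m: m.group(0) + " ", code)


def wrap_cdata(code: str) -> str:
    """Bracket code in a CDATA marked section.

    The string ']]>' ends a marked section, so code containing it is refused.
    """
    if "]]>" in code:
        raise ValueError("code contains ']]>', which would end the section early")
    return "<![CDATA[" + code + "]]>"
```

Note that space_out edits the text blindly, so a delimiter inside a string literal would also gain a space and change the program's behaviour; in that situation the CDATA approach is probably the safer of the two.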
Note: If the codebody element actually contains no code at all (perhaps because the document has been generated by a preprocessor stage), then you should include the attribute `empty' in the start-tag; this has no effect at present, but could become significant if the documents are repurposed in future versions of this set.
The routineprologue element contains all the `meta-information' for the routine, such as authorship, argument list, return value and the like. The declaration here looks particularly complicated, but that is largely due to an unwieldiness in SGML's DTD syntax. This declaration simply states that each of the routinename, purpose, etc., elements may appear at most once, but that each of these elements may be freely interspersed with diytopic elements. Only the description element must appear.
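To make the rule concrete, here is a small Python sketch that checks a list of child element names against the routineprologue content model as given in Figure 5. It is an illustration only: the function name is hypothetical, it works on element names that have already been parsed (so it ignores tag omission), and it is not how an SGML parser actually validates an and-group. One detail it does bring out is that, since every diytopic in the declaration follows one of the named elements, a diytopic cannot be the first child.

```python
# Element names taken from the routineprologue declaration in Figure 5.
PROLOGUE_ELEMENTS = {
    "routinename", "moduletype", "purpose", "description", "returnvalue",
    "argumentlist", "parameterlist", "authorlist", "history", "usage",
    "invocation", "examplelist", "implementationstatus", "bugs",
}


def check_prologue(children):
    """Return a list of problems with a routineprologue's child sequence."""
    problems = []
    seen = set()
    for i, name in enumerate(children):
        if name == "diytopic":
            # Each diytopic must follow one of the named elements.
            if i == 0:
                problems.append("diytopic cannot come first")
            continue
        if name not in PROLOGUE_ELEMENTS:
            problems.append(f"unexpected element: {name}")
        elif name in seen:
            problems.append(f"{name} appears more than once")
        seen.add(name)
    if "description" not in seen:
        problems.append("description is required")
    return problems
```

For example, the sequence purpose, diytopic, description, bugs produces no problems, whereas a prologue containing only purpose is reported as missing its description.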
    <!ELEMENT routinename O O (name, othernames?)>
    <!ELEMENT moduletype - O (#PCDATA)>
    <!ELEMENT name O O (#PCDATA)>
    <!ELEMENT othernames - O (name+)>
The routinename element has structure, though in the usual case (<routinename>helloworld) you wouldn't notice this. The othernames element is useful when a function has some generic name, say allocarray, plus some specific names, say allocarray_int and allocarray_float. The moduletype element allows you to document that a particular module is a <moduletype>Perl script, for example.
    <!ELEMENT purpose - O (%p.model)>
    <!ELEMENT title O O (#PCDATA)>
    <!ELEMENT description - O (%paralist;)>
    <!ELEMENT (userkeywords | softwarekeywords) - O (#PCDATA)>
    <!ELEMENT returnvalue - O (%paralist;)>
    <!ELEMENT (argumentlist | parameterlist) O O (parameter*)>
    <!ELEMENT parameter - O (name, type, description)>
    <!ELEMENT type - O (#PCDATA)>
    <!ELEMENT examplelist O O ((example,description)+)>
    <!ELEMENT example - O (#PCDATA)>
    <!ELEMENT (usage | invocation | implementationstatus | bugs) - O (%paralist;)>
    <!ELEMENT diytopic - O (title, %paralist;)>
    <!ELEMENT copyright - O (%paralist;)>
The distinction between purpose and description is that purpose is intended for a brief, perhaps one-line, summary of the function, whereas description is intended for a longer discussion.
The description element is used in the docblock, codeprologue, miscprologue and parameter elements; authorlist is used in both codeprologue and docblock elements; and name is used in the author, othernames, parameter and routinename elements.
The diytopic element is for other notes which aren't otherwise covered by the element types listed here. It has a very simple structure: a title followed by paragraphs of text.
The distinction between the userkeywords and softwarekeywords elements is that the former is intended to supply keywords to help the final user of the software, whereas the latter is intended to be a home for keywords concerned with the categorisation of the software within the Starlink project.
The text %p.model; indicates that at this point, any of the paragraph-level elements from the Starlink General DTD may be used, with the exception of the `docxref' and `ref' elements, and the addition of the `funcname' element.
The text %paralist; is shorthand for p, (p | tabular)*, or in other words, a paragraph followed by any mixture of further paragraphs and tabular elements.
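As a quick illustration of what the %paralist; shorthand admits, the content model p, (p | tabular)* can be transcribed into a regular expression over child element names. This Python fragment is a sketch for illustration only; the function name is hypothetical and it plays no part in the actual SGML tooling.

```python
import re

# p, (p | tabular)*  written as a pattern over space-separated element names.
PARALIST = re.compile(r"p( (p|tabular))*")


def matches_paralist(children):
    """True if the list of child element names fits p, (p | tabular)*."""
    return PARALIST.fullmatch(" ".join(children)) is not None
```

Under this model a lone p is acceptable, as is p followed by tabular and p, but a sequence that starts with tabular, or an empty sequence, is not.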
    <!ELEMENT authorlist O O ((author+ | authorref+), otherauthors?)>
    <!ELEMENT otherauthors - O (author+ | authorref+)>
    <!ELEMENT author - O (name, authornote?)>
    <!ELEMENT authornote - O (%paralist;)>
Each author element must be given an ID. The down-converter which processes the document will assume that authors with the same ID are the same author, and will attempt to assemble a full set of information about that author (i.e., email address, webpage) from the various available author elements with the same ID and, for example, assemble a list of all the authors represented in a collection of code at the top of a codecollection element. You should probably try to make the information given in these scattered author elements consistent, although the down-converter won't impose this.
    <!ELEMENT history O O (change+)>
    <!ELEMENT change - O (p+)>
The history mechanism in programcode documents is intentionally simple, as it merely emulates the list-of-changes style in the majority of the Starlink code-base. Specifically, it is simpler than the history mechanism in the General DTD (see Section 4.5). The change element has a required date, and a required `author' attribute, which links back to a previous author element.
    <!ELEMENT funcname - - (#PCDATA)>
    <!ELEMENT webref - - (%simpletext)+>
    <!ELEMENT url - - (#PCDATA)>
The only unusual element is funcname, which is intended to indicate other functions within the same `world' (vagueness again): these could be language primitives, or other documented functions. At present, this simply functions as a variant of the code element, but the system could be extended in future to generate cross-references for these.
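Returning to the author elements described earlier in this section, the merging of author information by ID might be pictured along the following lines. This Python fragment is a sketch only and does not reproduce the down-converter: the function name and the field names (name, email, webpage) are assumptions made for the illustration, and the policy of keeping the first non-empty value seen for each field is likewise an assumption, since the text says only that consistency is not enforced.

```python
def merge_authors(records):
    """Merge (author_id, info) pairs into one dictionary per author ID.

    The first non-empty value seen for each field is kept; this policy
    is an assumption made for the sake of the sketch.
    """
    merged = {}
    for author_id, info in records:
        entry = merged.setdefault(author_id, {})
        for field, value in info.items():
            if value and field not in entry:
                entry[field] = value
    return merged
```

Given one record for ID ng with only a name and a second with only an email address, the result holds a single ng entry carrying both fields, which is the kind of assembled record the author paragraph has in mind.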