Interface Summary | |
---|---|
EsisWriter | Provides the writing functions needed by an EsisHandler |
Class Summary | |
---|---|
EsisHandler | Writes out a SAX stream in a format based on the sgmls ESIS output. |
EsisParser | A parser which can interpret the pseudo-ESIS syntax of EsisHandler . |
StreamEsisWriter | Writes ESIS output to a stream, taking care of encodings and line separators |
Writes out a SAX stream in a format based on the sgmls ESIS output. The original format is defined by sgmls, and was designed to be easy for downstream tools to parse. The point here is that it turns an XML file into an unambiguous byte stream and, further, that it permits a normalisation operation which is both well-defined and simple.
The ESIS and SAX models do not overlap completely, so there are some differences. All the differences here are extensions rather than changes.
The output consists of a sequence of lines, separated by CR LF (that is, bytes 0x0D 0x0A). Each line consists of a start character indicating which type of output record it represents, followed by one or more arguments. A given record type always has the same number of arguments, and the arguments are separated by a single space.
Record | Meaning | Origin |
---|---|---|
Mprefix uri | start prefix mapping | extn |
mprefix | end prefix mapping | extn |
Aattname CDATA value | declare attribute | ESIS |
Bnamespace localname CDATA value | declare namespaced attribute | extn |
(name | start element | ESIS |
[namespace localname | start namespaced element | extn |
)name | end element | ESIS |
]namespace localname | end namespaced element | extn |
-text | character content | ESIS |
=text | ignorable whitespace | extn |
?pi data | processing instruction | ESIS |
Xname | skipped entity | extn |
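The record layout in the table above can be illustrated with a small line splitter. This is only a sketch: the class `EsisRecordSketch` and its methods are hypothetical names, not part of this package, and it does not reproduce what `EsisParser` actually does. The argument counts are read off the table, and the sketch assumes that only the last argument of a record (a value, text or PI data) may contain spaces.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of this package: one parsed output record. */
final class EsisRecordSketch {
    final char type;          // the start character, for example '(' or 'A'
    final List<String> args;  // the arguments that follow it

    private EsisRecordSketch(char type, List<String> args) {
        this.type = type;
        this.args = args;
    }

    /** Number of arguments for each record type, read off the table. */
    static int argCount(char type) {
        switch (type) {
            case 'B':
                return 4;   // namespace localname CDATA value
            case 'A':
                return 3;   // attname CDATA value
            case 'M': case '[': case ']': case '?':
                return 2;
            case 'm': case '(': case ')': case '-': case '=': case 'X':
                return 1;
            default:
                throw new IllegalArgumentException("Unknown record type: " + type);
        }
    }

    /** Parses one line which has already lost its CR LF terminator. */
    static EsisRecordSketch parse(String line) {
        if (line.isEmpty()) {
            throw new IllegalArgumentException("Empty record");
        }
        char type = line.charAt(0);
        int n = argCount(type);
        // Limit the split to n pieces, so that the final argument keeps
        // any spaces it contains (assumption: only the last one may).
        String[] parts = line.substring(1).split(" ", n);
        if (parts.length != n) {
            throw new IllegalArgumentException("Expected " + n + " arguments: " + line);
        }
        return new EsisRecordSketch(type, Arrays.asList(parts));
    }

    public static void main(String[] unused) {
        EsisRecordSketch r = parse("Burn:namespace att CDATA bar");
        System.out.println(r.type + " " + r.args);
    }
}
```

Because the split is limited to the expected argument count, a record such as `- there,\nchum\n` keeps its leading space as part of the single text argument.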
An important function of this class is to normalise the ESIS output. We do this in the following ways:
signature
is removed. Each start element event is preceded by the set of attributes on that event.
The result of this is to turn the XML:
    <doc><ns:p class='foo' xmlns:ns="urn:namespace" ns:att='bar'>Hello</ns:p>
    <p> there,
    chum
    </p>
    </doc>
into the (unnormalised) ESIS form:
    (doc
    Mns urn:namespace
    Aclass CDATA foo
    Burn:namespace att CDATA bar
    [urn:namespace p
    -Hello
    ]urn:namespace p
    mns
    -\n
    (p
    - there,\nchum\n
    )p
    -\n
    )doc
This can also be given the normalised form:
    (doc
    Aclass CDATA foo
    Burn:namespace att CDATA bar
    [urn:namespace p
    -Hello
    ]urn:namespace p
    (p
    -there,\nchum
    )p
    )doc
In the normalised form, the prefix mappings have been removed (the prefixes are not semantically important), leading and trailing whitespace has been removed from the ‘-’ lines, and all-whitespace ‘-’ records have been removed.
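The three normalisation steps named in the paragraph above can be sketched as a filter over the record lines. This is an illustration under stated assumptions, not the implementation used by `EsisHandler`: the class `EsisNormaliserSketch` is a hypothetical name, and the sketch assumes that a newline inside a text record appears as the two-character escape `\n`, and that only spaces and that escape count as whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of this package. */
final class EsisNormaliserSketch {

    /**
     * Strips leading and trailing whitespace from the text of a '-' record.
     * Assumption: whitespace means a space or the two-character escape
     * backslash-n. An escaped backslash followed by 'n' is not handled.
     */
    static String trimEscaped(String text) {
        int start = 0;
        int end = text.length();
        while (start < end) {
            if (text.charAt(start) == ' ') {
                start++;
            } else if (text.startsWith("\\n", start)) {
                start += 2;
            } else {
                break;
            }
        }
        while (end > start) {
            if (text.charAt(end - 1) == ' ') {
                end--;
            } else if (end - start >= 2 && text.startsWith("\\n", end - 2)) {
                end -= 2;
            } else {
                break;
            }
        }
        return text.substring(start, end);
    }

    /** Applies the three steps to a list of record lines. */
    static List<String> normalise(List<String> records) {
        List<String> out = new ArrayList<String>();
        for (String rec : records) {
            if (rec.isEmpty()) {
                continue;                       // nothing to classify
            }
            char type = rec.charAt(0);
            if (type == 'M' || type == 'm') {
                continue;                       // drop prefix mappings
            }
            if (type == '-') {
                String text = trimEscaped(rec.substring(1));
                if (text.isEmpty()) {
                    continue;                   // drop all-whitespace text
                }
                out.add("-" + text);
            } else {
                out.add(rec);                   // every other record is kept
            }
        }
        return out;
    }

    public static void main(String[] unused) {
        List<String> raw = Arrays.asList(
            "(doc", "Mns urn:namespace", "Aclass CDATA foo",
            "Burn:namespace att CDATA bar", "[urn:namespace p", "-Hello",
            "]urn:namespace p", "mns", "-\\n", "(p",
            "- there,\\nchum\\n", ")p", "-\\n", ")doc");
        for (String rec : normalise(raw)) {
            System.out.println(rec);
        }
    }
}
```

The `main` method feeds in the unnormalised example from this page. Ignorable-whitespace (`=`) records are passed through unchanged, since the paragraph above does not say how they are treated.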