Kimber - DocBook and Jade for Literate Programming

From eliot@dns.isogen.com Wed Nov  4 09:37:30 1998
Date: Tue, 03 Nov 1998 11:13:47 -0600
From: W. Eliot Kimber <eliot@dns.isogen.com>
Reply-To: dssslist@mulberrytech.com
To: dssslist@mulberrytech.com
Subject: RE: DocBook and Jade for Literate Programming 

At 08:31 AM 11/3/98 -0800, Wroth, Mark wrote:

>and going on to provide an example of how one might approach such an
>implementation.  While I confess I still don't really understand the
>underlying process, the example is causing light to shine dimly through
>the fog.  If that vague understanding is close, then this
>"architectural" approach would appear to make extending and customizing
>a literate programming system much easier than  a straight DTD approach
>would.  

That's the idea. You can specialize your own documents however you want as
long as the specializations are consistent with or completely separate from
the base architecture.  Tools that only understand the base architecture
(e.g., a DSSSL spec written to the architecture) will still be able to
process your documents. Tools that understand your document directly need
not be aware that it also conforms to an architecture. You can interchange
your documents with others directly if they also use architecture-aware
processors. You can interchange with others at the architectural level by
generating a version of your document that only reflects the architecture
(e.g., spam -ALitProgArch -p mycodefile.sgm > interchangefile.sgm).

The architecture becomes an agreement among a community of interest over
what the common things to interchange will be *without* constraining each
member's ability to meet their own local requirements by creating
specialized documents.  The community must also define interchange policies
that say how people are intended to use the architecture for interchange.
For example, within an enterprise that has defined an architecture that
defines the general rules for all documents created by that enterprise, the
policy might be "every element type in a document must be derived from
something in the corporate architecture". This type of policy ensures that
nobody colors too far outside the lines. It can be validated by simple
inspection and enforced by refusing to admit non-conforming documents to
corporate-provided production processes, for example.  Another policy might
be "do whatever you want", we'll figure it out.  This is essentially the
RDF policy, where RDF provides some basic tools but doesn't prescribe how
you use them.
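
As a minimal sketch of what "validated by simple inspection" could mean
for the stricter policy, the check below assumes the element types a
document type declares, and their architectural mappings, have already
been extracted into plain Python data. The function names and the
CorpDoc/CorpPara forms are hypothetical, invented for illustration only.

```python
# Hypothetical sketch of the "every element type must be derived from
# something in the corporate architecture" policy check.
# The data layout (a set of names plus a dict of mappings) is an assumption.

def unmapped_element_types(declared_types, arch_mappings):
    """Return the declared element types that have no architectural mapping."""
    return sorted(t for t in declared_types if t not in arch_mappings)


def admit(declared_types, arch_mappings):
    """Admit a document type only if every element type is derived."""
    return not unmapped_element_types(declared_types, arch_mappings)


# Example: Sidebar is declared locally but maps to nothing in the architecture.
declared = {"Memo", "Para", "Sidebar"}
mappings = {"Memo": "CorpDoc", "Para": "CorpPara"}
print(unmapped_element_types(declared, mappings))
print(admit(declared, mappings))
```

Under the "do whatever you want" policy the same inspection would simply
be skipped.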

Note that any existing DTD can be used as an architecture (because the
declaration syntax for architectures and DTDs is the same).  However, if
you are trying to enable smooth and productive interchange within a
community of interest, you need to design the architecture explicitly to be
an architecture, as you will make different design choices for architectures
designed for general use than you will for DTDs designed for specific use
(similar to the common authoring vs. production distinction).  You can even
use HTML as an architecture if you really want to (at least the versions
that include Div).

Note that architectures are similar to XML name spaces but not the same. In
particular, the formal architecture mechanism enables syntactic validation
of documents against the architecture.  In other words, like name spaces,
architectures define a vocabulary of names, but also define, using existing
syntactic mechanisms and processors, rules for how those names can be
combined. The complete architecture mechanism also provides various markup
minimization features that name spaces do not provide.  [Markup
minimization is not always an advantage--it complicates the implementation
of architectural processing while making it generally easier to use for
authors and DTD designers.  However, you are not required to use or depend
on architectural markup minimization ("automapping").]

Cheers,

E.
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202.  214.953.0004
www.isogen.com
</Address>


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist
From eliot@dns.isogen.com Wed Nov  4 09:56:22 1998
Date: Mon, 02 Nov 1998 10:42:02 -0600
From: W. Eliot Kimber <eliot@dns.isogen.com>
Reply-To: dssslist@mulberrytech.com
To: dssslist@mulberrytech.com
Subject: RE: DocBook and Jade for Literate Programming (The DSSSList  Digest V2
    #178)

At 07:46 AM 11/2/98 -0800, Wroth, Mark wrote:

>Have you looked at W. Eliot Kimber's  "Using SGML Architectures and
>DSSSL to Do Literate Programming"
>(http://www.sil.org/sgml/kimberDSSSLLitProg.html)?  He appeared to be
>attacking the same basic problem, although I confess that his approach
>was beyond me (and appears specialized to DTDs, although I may be
>misinterpreting him badly).

Actually, it's for DSSSL specs, not DTDs.  It takes advantage of the fact
that DSSSL processors operate on the DSSSL architectural instance of the
DSSSL spec, not its base markup, so you can put all sorts of things in your
DSSSL spec, such as documentation, that will be ignored as a result of the
normal architectural processing that Jade does when it reads a DSSSL spec.

This is relevant to literate programming only if you write your output
processor as an architecture-based process, which might be a good idea. You
could do this with Jade since you can do architecture-based processing of
your input document at no extra cost.

For example, say you define an architecture that provides the fundamental
elements you need to combine code and documentation, let's call it
"LitProgArch". You then write a DSSSL spec in terms of this architecture
(that is, in terms of the element types and attributes the architectural
DTD defines).  You could then create "program documents" that use this
architectural DTD directly or that specialize it.

For example, say our architecture defines three element types:

- LitProgDoc -- Document element
- Code       -- Contains literal source code
- Doc        -- Contains documentation

Declared like so (litprogarch.dtd):

<!-- Literate programming architectural DTD: -->
<!ELEMENT LitProgDoc
  - -
  (Code | Doc)*
>
<!ELEMENT Code
  - -
  (#PCDATA)*
>
<!ELEMENT Doc
  - -
  (#PCDATA)*
>
<!-- End of architectural DTD -->

An instance might look like this:

<!DOCTYPE LitProgDoc SYSTEM "litprogarch.dtd">
<LitProgDoc>
<Doc>This is a program</Doc>
<Code>
print "Hello world"
</Code>
</LitProgDoc>
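
To make concrete what it means for an instance to conform to the three
content models, here is a rough Python sketch. It assumes the document
has already been parsed into (name, children) tuples and encodes each
content model as a regular expression over child element names. That
encoding and the is_valid function are illustrative assumptions, not how
an SGML parser works internally.

```python
import re

# Content models from the architectural DTD, rewritten as regular
# expressions over space-terminated child element names.
CONTENT_MODELS = {
    "LitProgDoc": r"((Code|Doc) )*",  # (Code | Doc)*
    "Code": r"",                      # character data only, no child elements
    "Doc": r"",                       # character data only, no child elements
}


def is_valid(name, children):
    """Check an element against its content model, recursively.

    `children` is a list of (name, children) tuples; text is ignored.
    """
    model = CONTENT_MODELS.get(name)
    if model is None:
        return False  # the name is not in the architecture's vocabulary
    child_names = "".join(child_name + " " for child_name, _ in children)
    if re.fullmatch(model, child_names) is None:
        return False  # the children are not combined as the model allows
    return all(is_valid(n, c) for n, c in children)


# The instance above: a Doc followed by a Code inside LitProgDoc.
print(is_valid("LitProgDoc", [("Doc", []), ("Code", [])]))
```

The two failure branches correspond to the two things an architecture
defines: a vocabulary of names, and rules for how those names combine.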

However, you want more stuff in your documentation, so you create a
specialized DTD derived from the LitProgArch architecture:

<!DOCTYPE MyLitProgDoc [
 <!-- Declare use of Literate Programming architecture: -->
 <?IS10744 arch name="LitProgArch"
   public-id="+//IDN isogen.com//DOCUMENT Literate Programming Architecture//EN"
   dtd-system-id="litprogarch.dtd"
   doc-element="LitProgDoc"
 >
 <!ELEMENT MyLitProgDoc 
   - -
   (Header,
    CodeSection+)
 >
 <!ATTLIST MyLitProgDoc
   LitProgArch
     NAME
     #FIXED "LitProgDoc"
 >
 <!ELEMENT Header
   - -
   (#PCDATA)* 
 >
 <!ATTLIST Header
   LitProgArch
     NAME
     #FIXED "Doc"
 >
 <!ELEMENT CodeSection  -- NOTE: Not mapped to anything in LitProgArch --
   - -
   (FuncHeader, 
    FuncBody)+
 >
 <!ELEMENT FuncHeader
   - -
   (#PCDATA)*
 >
 <!ATTLIST FuncHeader
   LitProgArch
     NAME
     #FIXED "Doc"
 >
 <!ELEMENT FuncBody
   - -
   (#PCDATA)*
 >
 <!ATTLIST FuncBody
   LitProgArch
     NAME
     #FIXED "Code"
 >
]>
<MyLitProgDoc>
<Header>This is a program</Header>
<CodeSection>
<FuncHeader>A function</FuncHeader>
<FuncBody>
def foo():
  return 1
</FuncBody>
</CodeSection>
</MyLitProgDoc>

If you work out the mapping in your head, it should be clear that you'll
get a document that looks like this when you resolve the mapping to the
LitProgArch (as indicated by the LitProgArch attributes):

<LitProgDoc>
<Doc>This is a program</Doc>
<Doc>A function</Doc>
<Code>
def foo():
  return 1
</Code>
</LitProgDoc>

Note that the CodeSection start and end tags disappear because the element
doesn't map to anything.
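
For readers who would rather not work the mapping out in their heads, the
resolution can be sketched in a few lines of Python. This is a deliberate
simplification: the Element class, resolve, and serialize are hypothetical
names, the tree is built by hand rather than parsed, and automapping and
the other facilities of the real architecture mechanism are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Element:
    name: str
    arch_form: Optional[str] = None  # value of the LitProgArch attribute, if any
    children: List[Union["Element", str]] = field(default_factory=list)


def resolve(node):
    """Return the list of architectural nodes that `node` contributes."""
    if isinstance(node, str):
        return [node]
    resolved = []
    for child in node.children:
        resolved.extend(resolve(child))
    if node.arch_form is None:
        # Unmapped element: its tags disappear, its content is kept.
        return resolved
    return [Element(node.arch_form, children=resolved)]


def serialize(node):
    """Write a node back out as tagged text."""
    if isinstance(node, str):
        return node
    inner = "".join(serialize(c) for c in node.children)
    return "<%s>%s</%s>" % (node.name, inner, node.name)


# The MyLitProgDoc instance above, built by hand.
doc = Element("MyLitProgDoc", "LitProgDoc", [
    Element("Header", "Doc", ["This is a program"]),
    Element("CodeSection", None, [
        Element("FuncHeader", "Doc", ["A function"]),
        Element("FuncBody", "Code", ["\ndef foo():\n  return 1\n"]),
    ]),
])
print(serialize(resolve(doc)[0]))
```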

To process the specialized document with Jade, you'd do this:

jade -ALitProgArch -dlitprogarch.dsl -tSGML myprogram.sgml > out.py

The -A flag tells Jade to process the input document in terms of its
mapping to the architecture named LitProgArch (the name used in the name=
attribute of the architecture use declaration PI shown above). The rest is
normal.  The DSSSL spec might look something like this:

<!-- litprogarch.dsl -->
<!DOCTYPE dsssl-specification ...>
<dsssl-specification> 
&normal-sgml-output-stuff;
(element LitProgDoc
  ;; data: takes a string, and the children must still be processed
  (make sequence
    (make formatting-instruction
      data: "# Program generated by litprogarch.dsl")
    (process-children)))

(element Doc
  (make formatting-instruction
    data:
     (create-python-comment-block (current-node))))
 
(element Code
  (process-children))
</dsssl-specification>
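
The intent of those three rules can be imitated in Python to show roughly
what should end up in out.py. This is only a sketch of the intended
output, not of how Jade works: tangle is a hypothetical name,
python_comment_block is a guess at what create-python-comment-block
would do, and the input is assumed to be the already-resolved
architectural instance flattened to (element name, text) pairs.

```python
def python_comment_block(text):
    """Hypothetical stand-in for create-python-comment-block."""
    return "\n".join("# " + line for line in text.strip().splitlines())


def tangle(children):
    """Turn the children of LitProgDoc into Python source text.

    `children` is a list of (element_name, text) pairs.
    """
    out = ["# Program generated by litprogarch.dsl"]
    for name, text in children:
        if name == "Doc":
            out.append(python_comment_block(text))  # documentation -> comments
        elif name == "Code":
            out.append(text.strip("\n"))            # code passes through as-is
    return "\n".join(out) + "\n"


# The architectural instance derived from the MyLitProgDoc example.
source = tangle([
    ("Doc", "This is a program"),
    ("Doc", "A function"),
    ("Code", "\ndef foo():\n  return 1\n"),
])
print(source)
```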

Now you have a framework and infrastructure for doing literate programming
that people can specialize to their own ends without the need to completely
re-implement the whole business.  

Obviously, you'd want a much more sophisticated base architecture than I've
shown here, but you get the idea.

Cheers,

E.

--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202.  214.953.0004
www.isogen.com
</Address>


Norman
1 January 2001