4 Macros for Transforming XML

WebIt! provides a system of "macros" for transforming XML in Scheme.

Two types of macros are supported:

Syntax-rules-like pattern matching is provided via the xml-rules and xml-case syntaxes.

4.1 An Example

This example illustrates a simple poetry "markup language", and a set of macros to transform a collection of poetry into HTML. The macro for the "book" element constructs a table of contents for the collection.

This poetry markup language is very "structural" and consists of five elements: book, toc (table of contents), poem, stanza and line. The example below illustrates the markup of a single poem:

(poem
 title:
 "Bitter for Sweet"
 poet:
 "Christina Rossetti"
 tag:
 "bitter"
 (stanza
  (line "Summer is gone with all its roses,")
  (line "Its sun and perfumes and sweet flowers,")
  (line "Its warm air and refreshing showers:")
  (line "And even Autumn closes."))
 (stanza
  (line "Yea, Autumn's chilly self is going,")
  (line "And winter comes which is yet colder;")
  (line "Each day the hoar-frost waxes bolder")
  (line "And the last buds cease blowing.")))

The "keywords" title:, poet:, and tag: are used to construct the corresponding XML attributes title, poet, and tag. (Keywords are also used to pattern match XML attributes in the xml-rules and xml-case forms.)

I now define a stylesheet and the macros needed to transform this poem into HTML:

(define poetry-stylesheet (stylesheet ...macros-and-micros...))

The "poem" macro converts an entire poem into an HTML div element. The title of the poem is in a bold font; the author in italics. The xml-rules form provides pattern matching capabilities similar to syntax-rules. In the macro below, the poem is destructured into its title, poet's name, and the text of the lines of each stanza. In turn, the values of these are output in the xml-template form to construct the HTML-formatted poem.

(xml-macro
 poem
 (xml-rules
  ((poem title: t poet: a tag: m (stanza (line l1) (line l) ...) ...)
   (xml-template
    (h4:div
     (h4:p)
     (h4:a h4:name: m)
     (h4:strong t)
     (h4:br)
     (h4:em a)
     (list (h4:p) l1 (list (h4:br) l) ...)
     ...)))))
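
Because xml-rules follows the syntax-rules model, the nested ellipses in the poem pattern can be tried out with an ordinary syntax-rules macro. The sketch below is plain Scheme, not WebIt! code: poem-lines and bitter-lines are hypothetical names, and quoted lists stand in for XML elements. It shows l1 being bound once per stanza and l being bound to the remaining lines of each stanza.

```scheme
;; Plain syntax-rules analogue of the poem pattern (not part of WebIt!).
;; stanza and line are matched as literals; l1 is the first line of
;; each stanza and l collects the remaining lines of that stanza.
(define-syntax poem-lines
  (syntax-rules (stanza line)
    ((_ (stanza (line l1) (line l) ...) ...)
     (list (list l1 l ...) ...))))

(define bitter-lines
  (poem-lines
   (stanza (line "Summer is gone")
           (line "And even Autumn closes."))
   (stanza (line "Yea, Autumn's chilly self is going,")
           (line "And winter comes")
           (line "And the last buds cease blowing."))))

;; bitter-lines is a list with one sublist of lines per stanza:
;; (("Summer is gone" "And even Autumn closes.")
;;  ("Yea, Autumn's chilly self is going," "And winter comes"
;;   "And the last buds cease blowing."))
```

The output template of the poem macro relies on the same nesting: the inner ellipsis repeats over the lines of one stanza, the outer one over the stanzas.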

A collection of poetry can be represented with the book element:

(define
 rossetti-collection
 (book
  title:
  "Poems of Christina Rossetti"
  (poem
   title:
   "Bitter for Sweet"
   poet:
   "Christina Rossetti"
   tag:
   "bitter"
   (stanza (line "Summer is gone with all its roses,") ...))
  (poem
   title:
   "The First Spring Day"
   poet:
   "Christina Rossetti"
   tag:
   "spring"
   (stanza (line ...)))
  (poem
   title:
   "On Keats"
   poet:
   "Christina Rossetti"
   tag:
   "keats"
   (stanza (line ...)))
  ...))

The tag attribute is used in constructing the table of contents (to provide the target/anchor in the links to each poem).

With this micro for the book element, we construct a full HTML page which contains a table of contents and the poems themselves:

(xml-micro
 book
 (xml-rules
  ((_ title: bt p ...)
   (h4:html
    (h4:head (h4:title (xml-template bt)))
    (h4:body
     (h4:h1 (xml-template bt))
     (xml-expand
      (xml-template (toc p ...))
      (xml-expand (xml-template (list p ...)))))))))

The xml-template form appears only deep in the output expression, in the places where it is necessary to introduce values bound by pattern matching. The first call to xml-expand expands a "table of contents" element, via a separate macro. Similarly, the poems are expanded (via the second call to xml-expand) into HTML by a separate macro for the poem element.

To complete the stylesheet, we give the macro to translate the table-of-contents element (toc) into HTML:

(xml-macro
 toc
 (xml-rules
  ((toc (poem title: t poet: a tag: m . rest) ...)
   (xml-template
    (list
     (h4:p)
     "Table of Contents:"
     (h4:ul (h4:li (h4:a h4:href: (string-append "#" m) t)) ...))))))

Lastly, we now define the poetry "expander":

(define poetry->html (stylesheet->expander poetry-stylesheet))

(poetry->html rossetti-collection) will return an RS-XML value containing the HTML for the example poetry book.

4.2 The xml-rules Syntactic Form

The xml-rules form is used in the example above. Its syntax is

   (xml-rules clause ...)

where each clause in xml-rules/xml-case consists of

   (pattern output)

or

   (pattern fender-expression output)

The pattern language consists of

   xml-pat = var | ele-pat
   ele-pat = (ele-tag [keyword: attr-pat]* xml-pat*)
           | (ele-tag [keyword: attr-pat]* xml-pat* . var)
           | (ele-tag [keyword: attr-pat]* xml-pat* xml-pat ...)
   attr-pat = var | (var expr)

The pattern may indicate a default value to be used if an attribute is not present in the element being matched. To successfully match an element, all attributes mentioned in the pattern must be in the element (or have a default value other than #f specified in the pattern). Note, not all attributes in the element need be present in the pattern.
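
As a sketch of the default-value syntax, the clause below is a simplified variant of the poem macro from section 4.1 in which the poet attribute falls back to a default when it is missing. It assumes that WebIt! is loaded and that the poem element constructor and the h4: constructors are available as in that example; the default string "Anonymous" is made up for illustration.

```scheme
(xml-macro
 poem
 (xml-rules
  ;; poet: (a "Anonymous") follows the attr-pat form (var expr):
  ;; a is bound to the poet attribute, or to "Anonymous" when the
  ;; element being matched has no poet attribute.
  ;; The tag attribute is not mentioned, so it is simply ignored.
  ((poem title: t poet: (a "Anonymous") . rest)
   (xml-template
    (h4:div
     (h4:strong t)
     (h4:br)
     (h4:em a))))))
```

With this pattern a poem element without a poet attribute should still match, whereas a poem without a title would not, since title has no default.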

The pattern for an element may end in a ". var", binding all remaining child nodes to var. Alternatively, an element pattern may end in an ellipsis. Currently, ellipses may only appear at the end of an element pattern.

The ele-tag position in the pattern may be either the name of an element constructor function or _. Where _ is used, the type of the element is not tested. I intend _ to be used mostly as in the example above: once the macro expander has checked the element tag in order to select a macro, the tag need not be matched again in the xml-rules construct.

The output expression may be any Scheme expression. An xml-template form may appear in any valid expression position within that expression, to introduce the contents of pattern variables into the output. The xml-template form has the following syntax: (xml-template xml-tplt), where xml-tplt is:

     xml-tplt = var | ele-tplt
     sub-tplt = xml-tplt
              | xml-tplt ...
     ele-tplt = (ele-tag [keyword: attr-tplt]* sub-tplt*)
              | (ele-tag [keyword: attr-tplt]* sub-tplt* . var)
              | function call
     attr-tplt = var | function call

I allow the value of an attribute in the output to be computed. Equally, I can use calls to list to construct a repeated output sequence. Note that in the output template, ellipses need not appear only at the end of an element. One can write

(h4:div an-item ... (h4:p))

Any variable in the output xml-template form which is not bound in the pattern is expected to be bound in the current lexical environment.

One limitation currently is that it is not possible to bind an element constructor to a pattern variable, and use this in the output expression.

4.3 The xml-case Syntactic Form

xml-case provides a form which can dispatch based on matching patterns against an XML node. The syntax is:

(xml-case exp clause ...)

xml-case takes an expression which must evaluate to an XML node and one or more clauses. Each clause has the same syntax as a clause in xml-rules.

Below is an example of using xml-case to perform the transformations of an XML poetry book, which was earlier shown using a combination of macros and micros.

(define
 (book->html b)
 (xml-case
  b
  ((_
    title:
    bt
    (poem title: t poet: a tag: m (stanza (line l1) (line l) ...) ...)
    ...)
   (xml-template
    (h4:html
     (h4:head (h4:title bt))
     (h4:body
      (h4:h1 bt)
      (h4:p)
      "Table of Contents:"
      (h4:ul (h4:li (h4:a h4:href: (string-append "#" m) t)) ...)
      (h4:div
       (h4:p)
       (h4:a h4:name: m)
       (h4:strong t)
       (h4:br)
       (h4:em a)
       (list (h4:p) l1 (list (h4:br) l) ...)
       ...)
      ...))))))
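
The example above uses a single clause. As a sketch of dispatching over several clauses, the hypothetical function node->html below handles a poem, a stanza, or any other element. It assumes that WebIt! is loaded, that the poem, stanza, and line constructors from section 4.1 are available, and that clauses are tried in order with the first matching pattern selected, as in syntax-rules; the last point is an assumption, not something stated above.

```scheme
(define
 (node->html node)
 (xml-case
  node
  ;; A poem: show only its title; other attributes and the
  ;; stanzas are bound to rest and dropped.
  ((poem title: t . rest)
   (xml-template (h4:strong t)))
  ;; A stanza: one paragraph, each line followed by a break.
  ((stanza (line l) ...)
   (xml-template (h4:p (list l (h4:br)) ...)))
  ;; Any other element: _ does not test the element type.
  ((_ . rest)
   (h4:p))))
```

The final clause uses _ in the ele-tag position, so it acts as a catch-all for elements not matched by the earlier clauses.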

4.4 The markup Syntactic Form

The syntax "markup" provides a simple way to destructure an XML element. Each of the "keyword arguments" can be used to bind an attribute of the element. Where an attribute is not present in the element being destructured, #f is bound to the attribute variable. Alternatively, the markup syntax allows a default value to be supplied. The remaining formals are bound to the child elements of the XML element.

The syntax of the markup form is

  (markup ([keyword: variable]* variable+)
     expression+)

Markup expands into a lambda expression.

Below is an example macro, for the poem element, which uses the markup form:

(xml-macro
 poem
 (markup
  (title: title poet: author . contents)
  (h4:div (h4:br (h4:strong title)) (h4:p (h4:em author)) contents)))

Last modified: Sunday, January 30th, 2005 1:57:37pm
HTML generated using WebIt!.