Experiments Using XSLT With Topic Maps

Author: G. Ken Holman
Date: 2000/12/06 16:52:26 (UTC)

1	Abstract
2	Overview
2.1	Introduction
2.2	Objective
2.3	Approach
2.4	Example used for illustration
2.5	Topic Maps for rendering Topic Maps
2.6	Information flows
2.6.1	The environment supporting the data flows
2.6.1.1	Data files and topic maps
2.6.1.2	Transformation scripts
2.6.1.3	Support files
2.6.1.4	Stylesheets used for rendering
2.6.1.5	Working directory
2.6.1.6	Resulting files
3	What happened
3.1	Everything ended up working
3.2	The Topic Map document model used
3.3	Topic Map conversion
3.4	Topic Map merging
3.5	Topic Map rendering
3.5.1	Rendering inheritance for individual topics
3.5.2	Rendering independence
3.5.3	Automation
3.5.3.1	Automation environment
4	Lessons learned and tips used with XSLT
4.1	Non-prefixed names in XPath
4.2	Reducing namespace declarations in the result
4.3	Conditional expressions in XPath
4.4	Avoiding collision in named constructs
4.5	Text no-operation instructions
4.6	Stylesheets writing stylesheets with legible results
4.7	Parameterized processing of source node trees
4.8	Protecting stylesheets from being used in the wrong context
5	Conclusions
5.1	Desired changes to the topic map document model
5.2	Desired changes to XSLT
5.3	Performance
5.4	Summary
6	Continuing Effort

1: Abstract

These experiments consider the automated rendering of the topics in a topic map using different stylesheets for each topic. Using XSLT exclusively, they demonstrate that stylesheet technology is sufficient for the objective of topic map visualization, though possibly not ideal for some of the tasks involved nor scalable to large topic maps. Nevertheless, the concepts remain applicable even where they would be better supported by technology other than XSLT. The environment developed and illustrated supports the merging of topic maps, the distillation of the invocations required for stylesheet application, and the generation of a complete set of hyperlinked pages rendering the content of a topic map.

2: Overview

2.1: Introduction

At the time of starting to write this paper (September/October 2000), XML Topic Maps (XTM) were still in the development phase without a complete specification having been published. This did not prevent exploring how transformation technology can address a number of areas of working with XML-based topic maps because assumptions can be made regarding the pending publications that do not impact the conceptualization of topic map conversion, merging, and rendering for visualization. Though the author can be accused of viewing all things as a nail when working only with a hammer, the purpose of these experiments is not to prove XSL Transformations (XSLT) is a viable tool for production work. Rather, the purpose is to witness functional processes implementing topic map conversion, merging and rendering to assess the utility of these processes.

A secondary aspect was that the experimentation was the author's first foray into authoring and working with Topic Map concepts after long academic study without practical application. The author attempted to use a Topic Map wherever possible in order to gain practical experience with the features and foibles of their creation and use.

Given the changes already published between the draft XTM architecture and the recently announced final model, many changes to these stylesheets would be required for them to be of production quality.

2.2: Objective

Given that a topic map is used for piloting around instances of information, the components of the map need to be rendered to give the user the actual navigation tool that is traversed.

Early deployments of topic maps on web sites have typically had a fixed presentation of all topics. For some users, user interface consistency may be a meaningful reason to render all topics identically. There are also obvious practical reasons to have a single presentation, including the thousands or possibly millions of topics that will be in very large maps. This refers not to the information being mapped, but only to the topic map itself that maps the information.

What if it were desired that topics of different types be rendered in different ways? How would a system manage differentiation of the various transformation scripts producing the final visualization? How would one associate rendering appearances with different topics? How would one do so with the flexibility of having alternative sets of renderings for topic maps?

These sets of experiments endeavored to determine the role that stylesheet technology, specifically XSLT (XSL Transformations), could play in the rendering of the information found in a topic map based on the XTM (XML Topic Maps) document model described by a DTD (Document Type Definition).

The objective is to accommodate a topic map that uses some interchange syntax (which may or may not conform to the XML Topic Map document model) and to visualize the information in that topic map as rendered to a set of hyperlinked web pages.

The overview of the process is summarized as:

transform a non-conformant instance to an XML Topic Map instance
create the rendering control scripts for the desired appearances of topics
associate the topic map with the desired rendering
engage the automated rendering of the topic map information to produce a static set of results for browser navigation
navigate the resulting interlinked pages

2.3: Approach

Considering that we are given a topic map of information to be rendered, perhaps that topic map uses the XTM DTD as a base architecture and doesn't use XTM directly. Alternatively, perhaps the topic map expresses relationships using an archaic or customized topic map model. In both cases a system architecture based on the hub-and-spoke metaphor relies on a single document model at the core. Other document models are accommodated in this architecture through transformation, transmuting the given input information to conform to the core document model. The intermediate XTM DTD is used in these experiments as the core document model.

How information is rendered is an aspect of the information's definition and is a candidate for inclusion in a topic map. Separating the rendering information from the map itself provides flexibility for accommodating alternative renderings. In these experiments, individual stylesheets are defined for the rendering of different topics. These stylesheets are associated with the information to be rendered by defining topic map associations directly to each of the topics. Although only XSLT stylesheets are associated with topics in these experiments, the model provides for alternative occurrences, in different scopes, of rendering stylesheets for each topic, each of which is a candidate for use in a different target environment.

To avoid the burden of specifying a stylesheet for every topic in the map, topics without associated rendering information are rendered by the closest specification of rendering information found in the instance/class hierarchy of the topic. If the topic is an instance of a class described by another topic, the class topic is examined for rendering information. If, in turn, that class does not have associated rendering information, but is an instance of another class described by yet another topic, that class topic is examined for rendering information. The environment includes a default rendering for any class for which rendering information is not specified.

The topic map with the desired rendering is merged with the topic map of information to be rendered to create the topic map used by the system to generate the resulting static pages with which to navigate the information. While the rendering information could be hardwired into the topic map, this environment provides for combining rendering information from a supplemental topic map that is merged with the information topic map. An assumption is made regarding sequence priority, similar to a cascade, such that the supplemental rendering information takes precedence over any built-in rendering information. Naturally, a topic map is used to associate the topic map of information being rendered with the topic map of rendering information.

The detailed approach of the process is summarized as:

create the rendering control scripts for the different desired appearances of topics
associate in a rendering topic map each of the scripts with the topics they render
- could make as many different topic maps with different associations for the topics as possible desired map renderings
associate the information topic map with the desired rendering topic map
merge the information topic map with the rendering topic map into a map that is rendered by the process
engage the automated rendering of the merged topic map to produce a static set of results for browser navigation
navigate the resulting interlinked pages

2.4: Example used for illustration

Michel Biezunski created an early topic map for the conference proceedings of XML Europe '99, named xmle99.xml in these experiments. This is a compact topic map useful for illustrative purposes, even though it doesn't follow the current interchange syntax. In these experiments, the XSLT script mb2xtm.xsl was used to transform the topic map to an instance, named xmle99.xtm, conforming to the draft XTM DTD being used for experimentation.

Among the many associations in the map are two INSTANCE_CLASS associations stating that Norway and France are each countries. Both Norway and France are mentioned in the collection of materials being mapped, thus these topics have occurrences at locations outside of the map itself.

Figure 1: Three of the many topics

Two stylesheets were written to display the topic information from the map regarding countries: norway.xsl and country.xsl. These are associated with the Norway and country topics, respectively, for rendering purposes in a rendering topic map xmle99-render.xtm. A specific stylesheet for the France topic is not supplied in this environment. Note how the stylesheet occurrences are typed in their occurrence roles for XSLT; this provides for alternative rendering scripts to be occurrences of the respective rendering topics.

Figure 2: Two of the rendering topics

A control topic map named xmle99-ctrl.xtm was written to associate the information topic map xmle99.xtm with the rendering topic map xmle99-render.xtm. Note that this design provides for alternative control topic maps associating the information topic map with alternative rendering topic maps.

Figure 3: Associating the information with the rendering

The merged topic map collects all the occurrence and rendering information found in both input maps (as well as, but not shown in the diagram, the control topic map). The following excerpt illustrates some of the information regarding the country-oriented topics in focus.

Figure 4: Resulting associations

Note that though no direct rendering information is provided for the topic for France, the country topic to which it is related as its class indicates the stylesheet to use. This instance-class relationship can be traversed for all topics until a topic indicates the stylesheet to use. All topics are inherently related to the topic for topics, and that topic itself has a stylesheet association, thus ensuring that all topics are visualized.

2.5: Topic Maps for rendering Topic Maps

The following is the "xmle99-render.xtm" rendering topic map describing the rendering for the topics in the "xmle99.xtm" information topic map (note how this simple example only specializes the display of countries using "country.xsl" and then further specializes the display of Norway using "norway.xsl"; otherwise, all other topics are rendered using "topic.xsl"):

<?xml version="1.0" encoding="utf-8" ?>
<!--xmle99-render.xtm

A topic map for associating the rendering of particular
topics from a topic map.

See http://www.CraneSoftwrights.com/#reading for a link to the paper.
-->
<!DOCTYPE topicmap
  PUBLIC "-//TopicMap.org//DTD XML Topic Map 1.0//EN" "xtm1.dtd">
<topicmap xmlns:xlink="http://www.w3.org/1999/xlink">

   <!--topic-specific rendering information-->

   <topic id="r-TOPIC-TYPE">
      <topname>
         <basename>TOPIC-TYPE</basename>
         <dispname>TOPIC-TYPE</dispname>
         <sortname>TOPIC-TYPE</sortname>
      </topname>
   </topic>

   <topic id="r-COUNTRY" types="r-TOPIC-TYPE">
      <topname>
         <basename>COUNTRY</basename>
         <dispname>COUNTRY</dispname>
         <sortname>COUNTRY</sortname>
      </topname>
   </topic>

   <topic id="r-NORWAY" types="r-COUNTRY">
      <topname>
         <basename>Norway</basename>
         <dispname>Norway</dispname>
         <sortname>NORWAY</sortname>
      </topname>
   </topic>

   <topic id="STYLESHEET" types="r-TOPIC-TYPE">
     <topname>
       <basename>STYLESHEET</basename>
       <dispname>STYLESHEET</dispname>
       <sortname>STYLESHEET</sortname>
     </topname>
   </topic>

   <topic id="render-topic" types="STYLESHEET">
     <topname>
       <basename>RENDER-TOPIC</basename>
       <dispname>RENDER-TOPIC</dispname>
       <sortname>RENDER-TOPIC</sortname>
     </topname>
     <occurs scope="XSLT" xlink:href="topic.xsl"/>
   </topic>

   <topic id="render-country" types="STYLESHEET">
     <topname>
       <basename>RENDER-COUNTRY</basename>
       <dispname>RENDER-COUNTRY</dispname>
       <sortname>RENDER-COUNTRY</sortname>
     </topname>
     <occurs scope="XSLT" xlink:href="country.xsl"/>
   </topic>

   <topic id="render-norway" types="STYLESHEET">
     <topname>
       <basename>RENDER-NORWAY</basename>
       <dispname>RENDER-NORWAY</dispname>
       <sortname>RENDER-NORWAY</sortname>
     </topname>
     <occurs scope="XSLT" xlink:href="norway.xsl"/>
   </topic>

   <topic id="XSLT" types="r-TOPIC-TYPE">
     <topname>
       <basename>XSLT</basename>
       <dispname>XSLT</dispname>
       <sortname>XSLT</sortname>
     </topname>
   </topic>

   <topic id="MENTION">
     <topname>
       <basename>MENTION</basename>
     </topname>
   </topic>

   <assoc types="RENDERED-BY_RENDERS">
      <assocrl anchrole="RENDERED-BY" xlink:href="#render-topic"/>
      <assocrl anchrole="RENDERS" xlink:href="#r-TOPIC-TYPE"/>
   </assoc>

   <assoc types="RENDERED-BY_RENDERS">
      <assocrl anchrole="RENDERED-BY" xlink:href="#render-country"/>
      <assocrl anchrole="RENDERS" xlink:href="#r-COUNTRY"/>
   </assoc>

   <assoc types="RENDERED-BY_RENDERS">
      <assocrl anchrole="RENDERED-BY" xlink:href="#render-norway"/>
      <assocrl anchrole="RENDERS" xlink:href="#r-NORWAY"/>
   </assoc>

   <topic id="RENDERED-BY_RENDERS">
      <topname>
         <basename>RENDERED-BY_RENDERS</basename>
         <dispname>RENDERED-BY_RENDERS</dispname>
         <sortname>RENDERED-BY_RENDERS</sortname>
      </topname>
   </topic>

   <assoc types="r-INSTANCE_CLASS">
      <assocrl anchrole="INSTANCE" xlink:href="#RENDERED-BY_RENDERS"/>
      <assocrl anchrole="CLASS" xlink:href="#r-ASSOC-TYPE"/>
   </assoc>

   <topic id="r-ASSOC-TYPE">
      <topname>
         <basename>ASSOC-TYPE</basename>
         <dispname>ASSOC-TYPE</dispname>
         <sortname>ASSOC-TYPE</sortname>
      </topname>
   </topic>

   <topic id="r-INSTANCE_CLASS">
      <topname>
         <basename>INSTANCE_CLASS</basename>
         <dispname>INSTANCE_CLASS</dispname>
         <sortname>INSTANCE_CLASS</sortname>
      </topname>
   </topic>

</topicmap>
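
The way the RENDERED-BY_RENDERS associations in the map above tie each topic to a stylesheet occurrence can be illustrated outside of XSLT. The following is a minimal sketch in Python rather than one of the XSLT scripts of the environment; the function name stylesheet_associations is hypothetical, and the embedded map is an abbreviated copy of the listing above with the names and typing topics omitted.

```python
import xml.etree.ElementTree as ET

HREF = "{http://www.w3.org/1999/xlink}href"  # expanded name of xlink:href

# Abbreviated copy of the rendering topic map (names and typing topics omitted).
RENDER_XTM = """<topicmap xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="render-topic" types="STYLESHEET">
    <occurs scope="XSLT" xlink:href="topic.xsl"/>
  </topic>
  <topic id="render-country" types="STYLESHEET">
    <occurs scope="XSLT" xlink:href="country.xsl"/>
  </topic>
  <topic id="render-norway" types="STYLESHEET">
    <occurs scope="XSLT" xlink:href="norway.xsl"/>
  </topic>
  <assoc types="RENDERED-BY_RENDERS">
    <assocrl anchrole="RENDERED-BY" xlink:href="#render-topic"/>
    <assocrl anchrole="RENDERS" xlink:href="#r-TOPIC-TYPE"/>
  </assoc>
  <assoc types="RENDERED-BY_RENDERS">
    <assocrl anchrole="RENDERED-BY" xlink:href="#render-country"/>
    <assocrl anchrole="RENDERS" xlink:href="#r-COUNTRY"/>
  </assoc>
  <assoc types="RENDERED-BY_RENDERS">
    <assocrl anchrole="RENDERED-BY" xlink:href="#render-norway"/>
    <assocrl anchrole="RENDERS" xlink:href="#r-NORWAY"/>
  </assoc>
</topicmap>"""


def stylesheet_associations(root, scope="XSLT"):
    """Map each rendered topic id to the stylesheet occurrence in the given scope."""
    topics = {t.get("id"): t for t in root.iter("topic")}
    found = {}
    for assoc in root.iter("assoc"):
        if assoc.get("types") != "RENDERED-BY_RENDERS":
            continue
        # anchrole -> referenced topic id (leading "#" stripped)
        roles = {r.get("anchrole"): (r.get(HREF) or "").lstrip("#")
                 for r in assoc.iter("assocrl")}
        renderer = topics.get(roles.get("RENDERED-BY"))
        rendered = roles.get("RENDERS")
        if renderer is None or not rendered:
            continue
        # Pick the occurrence of the rendering topic in the requested scope.
        for occurrence in renderer.iter("occurs"):
            if occurrence.get("scope") == scope:
                found[rendered] = occurrence.get(HREF)
    return found


associations = stylesheet_associations(ET.fromstring(RENDER_XTM))
for topic_id, stylesheet in associations.items():
    print(topic_id, "->", stylesheet)
```

Selecting the occurrence by scope is what would allow a rendering topic to carry alternative scripts for other target environments alongside the XSLT one.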

The following is the control file specifying that the above styling topic map is to be used with the given information topic map (note that the RENDERED-BY_RENDERS association indicates which topic map is being rendered, which topic map has the rendering associations, which resource is the target for the topic map index and which resource is used to render the topic map index):

<?xml version="1.0" encoding="utf-8" ?>
<!--xmle99-ctrl.xtm

A topic map for associating the rendering of a topic
map using another topic map.

See http://www.CraneSoftwrights.com/#reading for a link to the paper.
-->
<!DOCTYPE topicmap
  PUBLIC "-//TopicMap.org//DTD XML Topic Map 1.0//EN" "xtm1.dtd">
<topicmap xmlns:xlink="http://www.w3.org/1999/xlink">

<!--the topic maps being associated-->

   <topic types="c-TOPIC-TYPE" id="data-XTM">
      <topname>
         <basename>DATA-XTM</basename>
         <dispname>DATA-XTM</dispname>
         <sortname>DATA-XTM</sortname>
      </topname>
      <occurs scope="XTM" xlink:href="xmle99.xtm"/>
   </topic>

   <topic types="c-TOPIC-TYPE" id="render-XTM">
      <topname>
         <basename>RENDER-XTM</basename>
         <dispname>RENDER-XTM</dispname>
         <sortname>RENDER-XTM</sortname>
      </topname>
      <occurs scope="XTM" xlink:href="xmle99-render.xtm"/>
   </topic>

   <topic types="c-TOPIC-TYPE" id="index-XSLT">
      <topname>
         <basename>INDEX-XSLT</basename>
         <dispname>Index XSLT</dispname>
         <sortname>INDEX XSLT</sortname>
      </topname>
      <occurs scope="XSLT" xlink:href="topicmap.xsl"/>
   </topic>

   <topic types="c-TOPIC-TYPE" id="index-HTML">
      <topname>
         <basename>INDEX-HTML</basename>
         <dispname>Index HTML</dispname>
         <sortname>INDEX HTML</sortname>
      </topname>
      <occurs scope="HTML" xlink:href="index.htm"/>
   </topic>

   <assoc types="c-INSTANCE_CLASS">
      <assocrl anchrole="INSTANCE" xlink:href="#c-RENDERED-BY_RENDERS"/>
      <assocrl anchrole="CLASS" xlink:href="#c-ASSOC-TYPE"/>
   </assoc>

   <assoc types="c-RENDERED-BY_RENDERS">
      <assocrl anchrole="RENDERED-BY" xlink:href="#render-XTM"/>
      <assocrl anchrole="RENDERS" xlink:href="#data-XTM"/>
      <assocrl anchrole="RENDERED-INDEX" xlink:href="#index-HTML"/>
      <assocrl anchrole="RENDERED-USING" xlink:href="#index-XSLT"/>
   </assoc>

   <topic id="XTM">
      <topname>
        <basename>XTM</basename>
      </topname>
   </topic>

   <topic id="XSLT">
      <topname>
        <basename>XSLT</basename>
      </topname>
   </topic>

   <topic id="HTML">
      <topname>
        <basename>HTML</basename>
      </topname>
   </topic>

<!--the kinds of associations-->

   <topic id="c-TOPIC-TYPE">
      <topname>
         <basename>TOPIC-TYPE</basename>
         <dispname>TOPIC-TYPE</dispname>
         <sortname>TOPIC-TYPE</sortname>
      </topname>
   </topic>

   <topic id="c-INSTANCE_CLASS">
      <topname>
         <basename>INSTANCE_CLASS</basename>
         <dispname>INSTANCE_CLASS</dispname>
         <sortname>INSTANCE_CLASS</sortname>
      </topname>
   </topic>

   <topic id="c-RENDERED-BY_RENDERS">
      <topname>
         <basename>RENDERED-BY_RENDERS</basename>
         <dispname>RENDERED-BY_RENDERS</dispname>
         <sortname>RENDERED-BY_RENDERS</sortname>
      </topname>
   </topic>

   <topic id="c-ASSOC-TYPE">
      <topname>
         <basename>ASSOC-TYPE</basename>
         <dispname>ASSOC-TYPE</dispname>
         <sortname>ASSOC-TYPE</sortname>
      </topname>
   </topic>

   <assoc types="c-INSTANCE_CLASS" class="assoc" type="extended">
      <assocrl anchrole="INSTANCE" xlink:href="#data-XTM" class="assocrl" type="locator"/>
      <assocrl anchrole="CLASS" xlink:href="#c-TOPIC-TYPE" class="assocrl" type="locator"/>
   </assoc>

   <assoc types="c-INSTANCE_CLASS" class="assoc" type="extended">
      <assocrl anchrole="INSTANCE" xlink:href="#render-XTM" class="assocrl" type="locator"/>
      <assocrl anchrole="CLASS" xlink:href="#c-TOPIC-TYPE" class="assocrl" type="locator"/>
   </assoc>

   <assoc types="c-INSTANCE_CLASS" class="assoc" type="extended">
      <assocrl anchrole="INSTANCE" xlink:href="#index-XSLT" class="assocrl" type="locator"/>
      <assocrl anchrole="CLASS" xlink:href="#c-TOPIC-TYPE" class="assocrl" type="locator"/>
   </assoc>

   <assoc types="c-INSTANCE_CLASS" class="assoc" type="extended">
      <assocrl anchrole="INSTANCE" xlink:href="#index-HTML" class="assocrl" type="locator"/>
      <assocrl anchrole="CLASS" xlink:href="#c-TOPIC-TYPE" class="assocrl" type="locator"/>
   </assoc>

   <assoc types="c-INSTANCE_CLASS" class="assoc" type="extended">
      <assocrl anchrole="INSTANCE" xlink:href="#XTM" class="assocrl" type="locator"/>
      <assocrl anchrole="CLASS" xlink:href="#c-TOPIC-TYPE" class="assocrl" type="locator"/>
   </assoc>

   <assoc types="c-INSTANCE_CLASS" class="assoc" type="extended">
      <assocrl anchrole="INSTANCE" xlink:href="#XSLT" class="assocrl" type="locator"/>
      <assocrl anchrole="CLASS" xlink:href="#c-TOPIC-TYPE" class="assocrl" type="locator"/>
   </assoc>

   <assoc types="c-INSTANCE_CLASS" class="assoc" type="extended">
      <assocrl anchrole="INSTANCE" xlink:href="#HTML" class="assocrl" type="locator"/>
      <assocrl anchrole="CLASS" xlink:href="#c-TOPIC-TYPE" class="assocrl" type="locator"/>
   </assoc>

</topicmap>
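
To show what a process can read out of the control topic map above, the following minimal Python sketch (not one of the environment's XSLT scripts) resolves each role of the c-RENDERED-BY_RENDERS association to the resource of the topic playing that role. The function name read_control is hypothetical, and the embedded map is an abbreviated copy of the listing above.

```python
import xml.etree.ElementTree as ET

HREF = "{http://www.w3.org/1999/xlink}href"  # expanded name of xlink:href

# Abbreviated copy of the control topic map (names and typing omitted).
CTRL_XTM = """<topicmap xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="data-XTM"><occurs scope="XTM" xlink:href="xmle99.xtm"/></topic>
  <topic id="render-XTM"><occurs scope="XTM" xlink:href="xmle99-render.xtm"/></topic>
  <topic id="index-XSLT"><occurs scope="XSLT" xlink:href="topicmap.xsl"/></topic>
  <topic id="index-HTML"><occurs scope="HTML" xlink:href="index.htm"/></topic>
  <assoc types="c-RENDERED-BY_RENDERS">
    <assocrl anchrole="RENDERED-BY" xlink:href="#render-XTM"/>
    <assocrl anchrole="RENDERS" xlink:href="#data-XTM"/>
    <assocrl anchrole="RENDERED-INDEX" xlink:href="#index-HTML"/>
    <assocrl anchrole="RENDERED-USING" xlink:href="#index-XSLT"/>
  </assoc>
</topicmap>"""


def read_control(root):
    """Return a mapping of association role -> resource of the topic in that role."""
    # topic id -> resource of its first occurrence
    resource = {}
    for topic in root.iter("topic"):
        occurrence = topic.find("occurs")
        if occurrence is not None:
            resource[topic.get("id")] = occurrence.get(HREF)
    plan = {}
    for assoc in root.iter("assoc"):
        if assoc.get("types") != "c-RENDERED-BY_RENDERS":
            continue
        for role in assoc.iter("assocrl"):
            target = (role.get(HREF) or "").lstrip("#")
            plan[role.get("anchrole")] = resource.get(target)
    return plan


plan = read_control(ET.fromstring(CTRL_XTM))
for role, href in plan.items():
    print(role, "->", href)
```

The sketch assumes a single association of this type with one topic per role; a control map naming several rendering topic maps would need a list per role.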

2.6: Information flows

The diagram below overviews the merging of the information topic map ("Map to be Rendered") with the rendering topic map ("Topic Rendering Associations"), and the generation of the statically navigated visualization of the topic map information ("Final Rendering"). Not shown is the implicit transformation of any of the input topic maps from their specific document model to the XTM document model.

Figure 5: Overview of information flow

Note in the diagram the four main inputs to the process from the user: The Map Rendering Association tells the system which topic maps are to be merged to create the Merged Topic Map for Rendering.

The Topic-specific Imported Stylesheets are the resources supplied by the user for the specialized rendering of each of the topics according to the occurrences in the Topic Rendering Associations. These are written as standalone stylesheets with a template rule for the <xtm:topic> element as it lives within a complete XTM topic map.

The environment's automation is captured in the XSLT scripts noted at the top of the inside box: The Topic Map Merge Scripts (merge-combine.xsl and merge-prune.xsl) pull the input topic maps into a single merged topic map. The Stylesheet Generation Script (make-render.xsl) creates the importing stylesheets, one for each stylesheet referenced in the Topic Rendering Associations, and the batch file that invokes each of these in turn with the merged topic map to produce the final rendering.

Each intermediate Topic-specific Importing Stylesheet created by the Stylesheet Generation Script generates as many result renderings as there are topics of a given type. These importing stylesheets import the stylesheets written for each specific topic. Note that the Topic-specific Imported Stylesheets need not know anything about the environment as they are written as if they were rendering only a single topic from the Map to be Rendered topic map.
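
The relationship between importing stylesheets and topics can be pictured as a simple grouping: one importing stylesheet per referenced stylesheet, each responsible for every topic that resolves to it. The following Python sketch only illustrates that grouping and is not part of make-render.xsl; the function name and the topic identifiers are hypothetical, and the topic-to-stylesheet mapping is assumed to have been resolved already.

```python
from collections import defaultdict


def group_by_stylesheet(stylesheet_for):
    """Invert topic -> stylesheet into stylesheet -> list of topics."""
    groups = defaultdict(list)
    for topic_id, stylesheet in stylesheet_for.items():
        groups[stylesheet].append(topic_id)
    return dict(groups)


# Hypothetical topic ids; the stylesheet names are those used in the environment.
resolved = {
    "norway": "norway.xsl",
    "france": "country.xsl",
    "country": "country.xsl",
    "speaker": "topic.xsl",
}

groups = group_by_stylesheet(resolved)
for stylesheet, topic_ids in groups.items():
    # One importing stylesheet per key, one result page per listed topic.
    print(stylesheet, "renders", len(topic_ids), "topic(s):", ", ".join(topic_ids))
```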

2.6.1: The environment supporting the data flows

2.6.1.1: Data files and topic maps

The following data files are used in the environment, in the following order:

xmle99.xml
- Michel Biezunski's XML Europe '99 topic map
xmle99-render.xtm
- rendering topic map associating topics from xmle99.xml with stylesheets for rendering each topic
xmle99-ctrl.xtm
- topic map associating the converted topic map made from xmle99.xml with the topic map of rendering associations xmle99-render.xtm

2.6.1.2: Transformation scripts

The following scripts are used in the environment, in the following order:

mb2xtm.xsl
- convert instance of Michel Biezunski's XML Europe '99 topic map to an instance of the draft XTM topic map
validhref.xsl
- validate that XLink href= attributes point to elements with associated id= attributes
- validate no two elements have the same id= attribute value
copyall.xsl
- used by validhref.xsl for copying content
merge-combine.xsl
- step one of a two-step process to merge topic maps according to the control topic map associating the topic maps to be merged
- combines the topic maps, ensuring that identifiers remain unique in the result
merge-prune.xsl
- step two of a two-step process to merge topic maps according to the control topic map associating the topic maps to be merged
- prunes the input topic map of duplicate topics, adjusting references to removed topics to point to remaining topic
make-render.xsl
- synthesize the invocation batch file for all stylesheets used in the environment that are associated with topics to be rendered
- synthesize the importing stylesheets that manage the creation of result files for all topics sharing the use of the imported rendering stylesheet
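
The two checks listed for validhref.xsl can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the XSLT script itself: the function name check_links and the sample map are hypothetical, and only internal references of the form "#name" are checked, since occurrences such as topic.xsl point outside the map.

```python
import xml.etree.ElementTree as ET
from collections import Counter

HREF = "{http://www.w3.org/1999/xlink}href"  # expanded name of xlink:href


def check_links(root):
    """Report duplicate id= values and internal hrefs that match no id=."""
    ids = Counter(e.get("id") for e in root.iter() if e.get("id") is not None)
    problems = [f"duplicate id: {name}" for name, count in ids.items() if count > 1]
    for element in root.iter():
        href = element.get(HREF)
        # External resources (no leading "#") are deliberately not checked.
        if href and href.startswith("#") and href[1:] not in ids:
            problems.append(f"dangling href: {href}")
    return problems


# Hypothetical map with one duplicate identifier and one dangling reference.
SAMPLE = """<topicmap xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="a"/>
  <topic id="a"/>
  <assoc types="x">
    <assocrl anchrole="r" xlink:href="#a"/>
    <assocrl anchrole="s" xlink:href="#missing"/>
  </assoc>
</topicmap>"""

problems = check_links(ET.fromstring(SAMPLE))
for problem in problems:
    print(problem)
```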

2.6.1.3: Support files

The following files are support files used in the environment:

readme.txt
- information file
run.bat
- invocation of all steps of the process in the experimental environment
xtm1.dtd
- version 0.2 of the XTM Topic dated 2000-09-19

2.6.1.4: Stylesheets used for rendering

The following stylesheets are used in the environment to render the topic map:

common.xsl
- common facilities for identifying the names of constructs and other low-level functions
- all constructs are named in the experimental namespace
topicmap.xsl
- the rendering of the master index of the resulting topic map rendering
topic.xsl
- the rendering of any topic for which a rendering is not defined
country.xsl
- the rendering of all country-oriented topics: the topic "country" and the specific country topics inheriting "country" in a supertype relationship
  - for example, the "France" topic does not have a specific associated stylesheet, so the stylesheet of the "country" topic is used, while the "Norway" topic does have a specific associated stylesheet, so the stylesheet of the "country" topic is ignored (though both inherit "country")
- to save time, this merely imports the topic.xsl stylesheet changing the background color to prove it is, indeed, a different stylesheet
norway.xsl
- the rendering of the Norway topic
- to save time, this merely imports the country.xsl stylesheet changing the background color to prove it is, indeed, a different stylesheet

2.6.1.5: Working directory

The work\*.* directory is created as part of the run process and includes files that are either copied from their original locations or synthesized by the running of the environment.

2.6.1.6: Resulting files

The result\*.* directory and the index.htm file therein are created as part of the run process; this is where the resulting topic map is viewed:

index.htm
- main topic and association index for the resulting rendering
- named in the rendering association file, not hardwired in the environment
- note that all files needed by the topic map are copied into the result\ directory before the topic index and files are created

3: What happened

3.1: Everything ended up working

The architecture worked generally as initially planned, though changes were necessary to accommodate limitations in XSLT and improve on uneconomical execution time. The environment successfully generated a rendering of all topics of the topic map with individualized presentations based on the topic, or the topic type, as appropriate for the definition of the topic.

XSLT was used wherever a transformation was required, and the scripts ended up employing many different XSLT facilities in the process. There was enough development time to finesse the invocation of the topic stylesheets by using importation rather than repeated invocation of the XSLT processor.

This experimental environment isn't production ready, nor totally finished, nor totally debugged, but it can be useful for visualizing a topic map in a static collection of hyperlinked HTML pages and proves the concept of the automation of the rendering using XSLT stylesheets.

3.2: The Topic Map document model used

The principles demonstrated and tested in this experimental environment are not impacted by which document model is in use as the document model for the core topic maps in the hub-and-spoke architecture.

Given the deadlines for submission of the paper, the document model used for the XML-based topic map common hub document model is based on version 0.2 of XTM published 2000-09-29 within the TopicMap.org development group. This version is based on XLINK-aware internal hyperlinks that are not recognized by XML processors.

The fact that ID/IDREF is not used for topic referencing required a separate validation step, which was also written in XSLT. The validation step produces a transformed instance without namespaces that could be validated against a modified document model. This was originally done during experimentation, but it was decided that it added nothing beyond the customized XSLT validation.

3.3: Topic Map conversion

This turned out to be a fairly straightforward process, transmuting instances conforming to a DTD used by the creator of the topic map into instances conforming to the environment's topic map document model. XSLT is used with a stylesheet created for each supplied topic map to convert the given topic map to the common hub structure.

A topic map created for speaker information of the GCA XML Europe 1999 conference was contributed to these experiments by Michel Biezunski of InfoLoom Inc. (see http://www.infoloom.com/tmsample for details). This map does not include any explicit instance/class associations between topics, only implicit associations indicated by the use of the types= attribute.

A topic map created in 2000 for operatic information was contributed to these experiments by Steve Pepper of Ontopia AS in Norway.

A topic map created in 2000 for XML tool information was contributed to these experiments by Lars Marius Garshol also of Ontopia AS. This map was not supplied with a document model.

The author very much appreciates the help received from these prominent members of our community with these contributions.

The rest of this paper references the experiments done with the speaker information topic map. Explicit associations inferred from the types= attribute are added to the topic map by the XSLT transform, created by the author, that transforms the input to XTM format. This was initially thought to be necessary to the experimental environment, but to save development time the processing of instance/class relationships was done implicitly through the attribute.
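
The inference of explicit associations from the types= attribute can be sketched as follows. This is a Python illustration of the idea, not the mb2xtm.xsl transform; the function name and the topic identifiers in the sample are hypothetical, the association type and role names follow those seen in the rendering topic map, and the attribute is assumed to hold a whitespace-separated list of topic identifiers.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
HREF = "{%s}href" % XLINK_NS
ET.register_namespace("xlink", XLINK_NS)  # keep the xlink: prefix when serializing


def add_explicit_associations(root, assoc_type="INSTANCE_CLASS"):
    """Append one instance/class association for every value found in types=."""
    for topic in list(root.iter("topic")):
        # Assumption: types= is a whitespace-separated list of topic ids.
        for class_id in (topic.get("types") or "").split():
            assoc = ET.SubElement(root, "assoc", {"types": assoc_type})
            ET.SubElement(assoc, "assocrl",
                          {"anchrole": "INSTANCE", HREF: "#" + topic.get("id")})
            ET.SubElement(assoc, "assocrl",
                          {"anchrole": "CLASS", HREF: "#" + class_id})
    return root


# Hypothetical topic ids for illustration.
SAMPLE = """<topicmap xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="COUNTRY"/>
  <topic id="NORWAY" types="COUNTRY"/>
  <topic id="FRANCE" types="COUNTRY"/>
</topicmap>"""

root = add_explicit_associations(ET.fromstring(SAMPLE))
pairs = [tuple(role.get(HREF) for role in assoc.findall("assocrl"))
         for assoc in root.findall("assoc")]
print(pairs)
```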

3.4: Topic Map merging

XSLT turns out to be quite awkward for accomplishing topic map merging in a single pass. The built-in hierarchically-oriented facilities available in XSLT are of little use in the very flat web of hyperlinks present in the interchange model for topic maps.

Early indications are that there are no inherent barriers to accomplishing polished Topic Map merging, but these indications also leave the impression that XSLT is perhaps not the most appropriate tool for the task at hand. The scripts used in this experimental environment do not prune as much duplicate information as would be appropriate in a production environment. A third XSLT process, or possibly more, might be necessary to remove all redundant information, though not doing so did not affect the experiments.

To increase the utility of XSLT, any number of topic maps are merged through a two-step process in order to take advantage of what built-in XSLT facilities are useful. In particular, the main source document information captured by the XSLT processor at load time (such as XSLT keys) cannot be accessed when using externally referenced documents, thus necessitating the main input be the raw merged information. The topic map inputs to the first step of the two-step process are modified during merging to ensure that identifiers unique in each file but used in both files are each unique in the merged file.
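
One way to guarantee that identifiers stay unique when several maps are combined is to qualify every identifier, and every reference to it, with a prefix per input map. The following Python sketch illustrates that idea only; it is not a description of what merge-combine.xsl actually does, and the function name, the prefixes and the choice of types= and scope= as the identifier-bearing attributes are assumptions.

```python
import copy
import xml.etree.ElementTree as ET

HREF = "{http://www.w3.org/1999/xlink}href"  # expanded name of xlink:href
REF_ATTRS = ("types", "scope")  # assumed to hold topic identifiers


def combine(maps):
    """Combine (prefix, root) pairs into one map with prefixed identifiers."""
    merged = ET.Element("topicmap")
    for prefix, root in maps:
        # Work on a copy so the input maps are left untouched.
        for child in copy.deepcopy(root):
            for element in child.iter():
                if element.get("id") is not None:
                    element.set("id", prefix + element.get("id"))
                for name in REF_ATTRS:
                    if element.get(name) is not None:
                        values = element.get(name).split()
                        element.set(name, " ".join(prefix + v for v in values))
                href = element.get(HREF)
                # Only internal references are rewritten; external files are kept.
                if href is not None and href.startswith("#"):
                    element.set(HREF, "#" + prefix + href[1:])
            merged.append(child)
    return merged


first = ET.fromstring("""<topicmap xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="XSLT"/>
  <topic id="STYLESHEET"/>
  <topic id="render-topic" types="STYLESHEET">
    <occurs scope="XSLT" xlink:href="topic.xsl"/>
  </topic>
</topicmap>""")
second = ET.fromstring('<topicmap><topic id="XSLT"/></topicmap>')

merged = combine([("m1-", first), ("m2-", second)])
ids = [topic.get("id") for topic in merged.iter("topic")]
print(ids)
```

Prefixing every identifier is cruder than renaming only the colliding ones, but it keeps the first step simple and leaves the recognition of common topics to the second step.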

In place of ID/IDREF, the XSLT key facility is utilized extensively to manage the location of and access to individual topics. Common topics are found across all of the input topic maps first through any identity attributes present and, if none is present, through identical values of the topic base name in the same scope. All associations from all topic maps are copied to the output with any references to common topics modified to point to the actual topic that is output from the process. The use of XSLT keys instead of the built-in facilities in XSLT for ID/IDREF also negates the need for having a document model to describe which attributes are of type ID.
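
The recognition of common topics and the redirection of references can be sketched with a dictionary playing the role of the XSLT key. This Python illustration is not the merge-prune.xsl script; the function names are hypothetical, and the attribute name identity and the placement of scope= on the topname element are assumptions about the draft document model. Occurrences of a removed duplicate are moved to the surviving topic, while other characteristics of the duplicate are simply dropped.

```python
import xml.etree.ElementTree as ET

HREF = "{http://www.w3.org/1999/xlink}href"  # expanded name of xlink:href


def topic_key(topic):
    """Key for recognizing common topics: identity first, else scoped base name."""
    identity = topic.get("identity")  # assumed attribute name
    if identity:
        return ("identity", identity)
    name = topic.find("topname")
    base = name.findtext("basename") if name is not None else None
    if base is None:
        return ("id", topic.get("id"))  # unnamed topics are never merged
    return ("name", name.get("scope", ""), base.strip())


def prune(root):
    """Remove duplicate topics and point references at the surviving topic."""
    survivors = {}  # key -> first topic element seen with that key
    replaced = {}   # removed topic id -> surviving topic id
    for topic in list(root.findall("topic")):
        key = topic_key(topic)
        if key in survivors:
            keeper = survivors[key]
            replaced[topic.get("id")] = keeper.get("id")
            keeper.extend(topic.findall("occurs"))  # keep the duplicate's occurrences
            root.remove(topic)
        else:
            survivors[key] = topic
    for element in root.iter():
        href = element.get(HREF)
        if href and href.startswith("#") and href[1:] in replaced:
            element.set(HREF, "#" + replaced[href[1:]])
        types = element.get("types")
        if types:
            element.set("types", " ".join(replaced.get(t, t) for t in types.split()))
    return root


# Hypothetical combined map in which two topics share a base name.
SAMPLE = """<topicmap xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="a-NORWAY"><topname><basename>Norway</basename></topname></topic>
  <topic id="b-NORWAY"><topname><basename>Norway</basename></topname>
    <occurs scope="MENTION" xlink:href="norway.htm"/></topic>
  <assoc types="RENDERED-BY_RENDERS">
    <assocrl anchrole="RENDERS" xlink:href="#b-NORWAY"/>
  </assoc>
</topicmap>"""

pruned = prune(ET.fromstring(SAMPLE))
print([topic.get("id") for topic in pruned.findall("topic")])
```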

The control of the merging process is itself a topic map, the Map Rendering Association map, associating the topic maps to be rendered with the topic maps describing the associations of stylesheet files with topics. If there is only a reference to the topic map being rendered, without any associations to topic rendering information topic maps, then it is assumed the topic map being rendered already has sufficient information for rendering (or perhaps accepts default rendering for all information).

3.5: Topic Map rendering

3.5.1: Rendering inheritance for individual topics

It became apparent early on that needing to create as many stylesheets as there are topics would be burdensome, so it was decided to rely on the implicit instance/class inheritance found in the types= attribute in the topic map.

A topic in the merged topic map is rendered by the associated stylesheet specified through a "RENDERED-BY_RENDERS" association as an occurrence of the RENDERED-BY topic using the resource identified in the scope of the XSLT topic. If no such "RENDERED-BY_RENDERS" association exists for a given topic then the process follows the chain of types= attributes to find the closest inherited topic that does have a "RENDERED-BY_RENDERS" association in XSLT.
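
The walk along the chain of types= attributes can be sketched with a recursively called named template. This is only an illustration under stated assumptions, not the environment's script: the names topics-by-id, rendered-ids and closest-rendered are hypothetical, the list of topics having their own "RENDERED-BY_RENDERS" association is supplied as a space-delimited parameter rather than derived from the associations themselves, each types= attribute is assumed to hold a single identifier, and the chain is assumed to contain no cycles:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xtm="http://www.topicmap.org/xtm/1.0"
                version="1.0">

<xsl:output method="text"/>

<xsl:key name="topics-by-id" match="xtm:topic" use="@id"/>

     <!--hypothetical: ids of the topics with their own rendering association,
         space-delimited with a leading and a trailing space-->
<xsl:param name="rendered-ids" select="' country norway '"/>

<xsl:template match="/">
  <xsl:for-each select="//xtm:topic">
    <xsl:value-of select="@id"/>
    <xsl:text> is rendered as </xsl:text>
    <xsl:call-template name="closest-rendered"/>
    <xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
</xsl:template>

           <!--report the closest topic in the types= chain that is rendered-->
<xsl:template name="closest-rendered">
  <xsl:param name="topic" select="."/>
  <xsl:choose>
    <xsl:when test="not($topic)">(default)</xsl:when> <!--chain exhausted-->
    <xsl:when test="contains($rendered-ids,concat(' ',$topic/@id,' '))">
      <xsl:value-of select="$topic/@id"/>
    </xsl:when>
    <xsl:otherwise>                     <!--try the topic named by types=-->
      <xsl:call-template name="closest-rendered">
        <xsl:with-param name="topic"
                        select="key('topics-by-id',$topic/@types)"/>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>

A topic with no types= attribute yields an empty node set from key(), so the recursion ends at the default rendering.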

To illustrate this in the experimental environment, almost every topic in the speaker information topic map is rendered using the default rendering for a topic. The country topics are all given a single rendering by associating the country topic with a stylesheet. The Norway topic is itself given a different rendering by associating it with yet another stylesheet file.

To save time in development, both the Norway and country stylesheets specify unique background colors and stylesheet names at the top of the rendered page and then just import the default stylesheet. This accomplishes the goal of getting distinctive renderings but doesn't illustrate how flexible one can be using the same environment with a number of vastly different stylesheets.

3.5.2: Rendering independence

The architecture does not impose restrictions on the rendering information being maintained in the rendering topic map. An XSLT-oriented rendering stylesheet can be but one occurrence of many rendering scripts for different technologies, making the rendering topic map (and this entire architecture as a model) useful in many environments.

3.5.3: Automation

The environment implements automation by generating a batch file that controls the application of associated stylesheets to different topics.

Initially, the design considered somehow invoking the XSLT processor once for each topic with the given XSLT stylesheet for that topic. When considering hundreds or thousands of topics, this is entirely untenable and ways were sought to reduce the unmanageable burden.

The design then considered somehow invoking the XSLT processor once for the entire topic map and generating every possible topic output in a single run. Unfortunately, this puts a burden on the writing of the individual topic rendering stylesheets by requiring the names of constructs to not conflict. Even if a name disambiguation scheme were developed, the environment would need to somehow invoke the processing of a given xtm:topic node in a multitude of different ways, which is typically accomplished using template rule modes. Arbitrarily introducing modes into <xsl:apply-templates> and <xsl:template> instructions by transforming the authored stylesheets into environment-friendly stylesheets becomes untenable when the original stylesheet authoring calls out shared stylesheets with globally scoped variables. Even if one could consider transforming every called out stylesheet fragment once for every stylesheet callout into a unique stylesheet with all template rule constructs modified to unique modes to engage unique scope, the globally scoped variables cannot be scoped in the same fashion. An importation scheme would lose variable declarations of lesser importance.

The final design was able to take advantage of the <xsl:apply-imports/> facility in XSLT. An importation stylesheet is synthesized for every XSLT stylesheet utilized in the rendering topic map. Each synthesized stylesheet processes all of the topics that use the given rendering stylesheet, producing one rendered result for each xtm:topic node, by importing the rendering stylesheet. The synthesized importing stylesheet is kept totally independent of the rendering stylesheet by using the <xsl:apply-imports/> instruction, which reprocesses the current node in context using only imported stylesheets.
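
A synthesized importing stylesheet of this kind might look roughly like the following sketch. It is not the output of the environment: the file name country.xsl and the topic identifiers are hypothetical, and the creation of one result file per topic is omitted. Note that <xsl:apply-imports/> needs a current template rule, so it is placed in a template rule matching xtm:topic rather than inside an <xsl:for-each>:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xtm="http://www.topicmap.org/xtm/1.0"
                version="1.0">

<xsl:import href="country.xsl"/>      <!--hypothetical rendering stylesheet-->

             <!--push only the topics that use this rendering stylesheet-->
<xsl:template match="/">
  <xsl:apply-templates select="//xtm:topic[@id='canada' or @id='germany']"/>
</xsl:template>

      <!--hand each topic to whatever rule the imported stylesheet has-->
<xsl:template match="xtm:topic">
                    <!--per-topic result file creation would go here-->
  <xsl:apply-imports/>
</xsl:template>

</xsl:stylesheet>

The importing stylesheet needs to know nothing about the names or modes used inside the rendering stylesheet, only that it has a template rule matching the topic.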

The only drawback in the final design is that the input topic map cannot be processed just once; it has to be processed as many times as there are stylesheets specified for rendering. Building the XPath data model multiple times for an input topic map of thousands or hundreds of thousands of topics would be a burden on processor resources and on time.

3.5.3.1: Automation environment

The experimental environment uses MS-DOS batch files and assumes the use of the Saxon XSLT processor, but this could be changed for any environment and any XSLT processor.

The current state of the automation generating stylesheets is not parameterized, but only for lack of development time. In a production environment this could be easily parameterized for different command line and XSLT processor environments.

4: Lessons learned and tips used with XSLT

There were a number of issues raised in the authoring of the transformation scripts that make the body of the experimental environment. Most of these issues are related just to awareness of the XSLT technology and are highlighted here to help the reader avoid the same problems discovered by the author. These problems are not problems with the technology, but just problems with the way the author approached the writing of the stylesheets.

4.1: Non-prefixed names in XPath

It is important to remember that non-prefixed names in an XPath expression refer only to nodes of the source tree that are not in any namespace. Even if the stylesheet should define a default namespace for use by non-prefixed element type names in literal result elements, the default namespace is not used by the components of an XPath expression.

It is very common to write the following as XPath expressions:

.... select="//topic" ....
.... match="topic" ....
.... select="topname[1]/basename" ...

In all of the above cases, the expressions are referring to element types that are not in any namespace at all, not to element types that are in the default namespace. Since element types in XML Topic Maps are in the "http://www.topicmap.org/xtm/1.0" namespace (remember that the prefix is irrelevant in XSLT), a namespace prefix must be supplied in the XPath expression as follows:

.... select="//xtm:topic" ....
.... match="xtm:topic" ....
.... select="xtm:topname[1]/xtm:basename" ....

It was a common misstep by the author to forget a prefix in a location path expression, most often in multiple step paths as follows:

.... select="xtm:topname[1]/basename" ....

The problem for the author is that the above is a valid XPath expression and there is no reason for an XSLT processor to reject or flag an expression utilizing both namespace-qualified components and components not in any namespace.
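
Since the processor will not flag the omission, one way to expose it is to count what each form of the expression selects. The following is a minimal diagnostic sketch, not part of the environment; it deliberately declares the XTM namespace as the default namespace of the stylesheet to show that the default is ignored by XPath:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xtm="http://www.topicmap.org/xtm/1.0"
                xmlns="http://www.topicmap.org/xtm/1.0"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
                  <!--elements named "topic" that are in no namespace-->
  <xsl:text>topic: </xsl:text>
  <xsl:value-of select="count(//topic)"/>
                     <!--elements named "topic" in the XTM namespace-->
  <xsl:text>&#xa;xtm:topic: </xsl:text>
  <xsl:value-of select="count(//xtm:topic)"/>
  <xsl:text>&#xa;</xsl:text>
</xsl:template>

</xsl:stylesheet>

Run against a topic map whose elements are all in the XTM namespace, the first count is zero regardless of the default namespace declaration in the stylesheet.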

4.2: Reducing namespace declarations in the result

The first versions of the merging stylesheets synthesized the result tree document element along the lines of the following, using a literal result element or the <xsl:element> instruction:

<xsl:template match="/">
  ...
  <xtm:topicmap>
    <xsl:apply-templates select="//xtm:topic"/>
  </xtm:topicmap>
  ...
</xsl:template>

The serialization of the document element produced by the above code will not include any namespace declarations utilized in the document element of the source. This requires the processor to emit possibly numerous namespace declarations to ensure one is in scope for every use in sub-elements of the given document element. In particular, the "http://www.w3.org/1999/xlink" namespace ended up being declared within every topic of the result, though in the input topic map it was declared only once, at the document element, which itself has no explicit need for it.

The sheer bulk of the serialized result was reduced by copying the document element from the source to the result instead of synthesizing the result document element. In the following fragment, all attached namespace nodes of the input document element are copied to the result tree for serialization:

<xsl:template match="/xtm:topicmap" priority="1">
  <xsl:copy>          <!--result has the same document element as the source-->
    <xsl:apply-templates/><!--it's everything else that's possibly different-->
  </xsl:copy>
</xsl:template>

4.3: Conditional expressions in XPath

There is no conditional expression construct in XPath, yet in some areas of the algorithm it would have been very useful. Consider the need to choose between a base name and a sort name since the sort name is optional:

  <xsl:choose>
    <xsl:when test="xtm:topname[1]/xtm:sortname">
      <xsl:value-of select="xtm:topname[1]/xtm:sortname"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="xtm:topname[1]/xtm:basename"/>
    </xsl:otherwise>
  </xsl:choose>

The above would select between the two, with priority to the sort name, but it couldn't be used in the select= attribute of an <xsl:sort> instruction that needs to sort all topics by either the sort name, if present, or the base name if the sort name is absent. The following will accomplish the desired activity in an XPath expression:

  <xsl:for-each select="//xtm:topic">
    <xsl:sort select="( xtm:topname[1]/xtm:basename |
                        xtm:topname[1]/xtm:sortname )[last()]"/>
    ....
  </xsl:for-each>

The above expression relies on the predicate being applied to the union in document order: if both are present then the last one is always the sort name, and if the sort name is absent at least the base name will be there (both of which are true according to the document model of XTM Topic Maps at the time).
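
The same expression can also stand in for the <xsl:choose> shown earlier when the chosen name is simply to be written out. The following is a minimal sketch, again assuming that a sort name, when present, follows the base name in document order:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xtm="http://www.topicmap.org/xtm/1.0"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="//xtm:topic">
                      <!--order by the sort name, else the base name-->
    <xsl:sort select="( xtm:topname[1]/xtm:basename |
                        xtm:topname[1]/xtm:sortname )[last()]"/>
                   <!--write out the very same value used for sorting-->
    <xsl:value-of select="( xtm:topname[1]/xtm:basename |
                            xtm:topname[1]/xtm:sortname )[last()]"/>
    <xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>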

4.4: Avoiding collision in named constructs

It was desirable that the shareable stylesheet fragments written for the environment be usable by those who obtain the experiments for their own exploitation. Naming constructs is often difficult because any name chosen that is meaningful is a candidate for being already in use by someone wanting to introduce the stylesheet fragment.

XSLT allows construct names (key tables, variables, named templates, etc.) to be namespace qualified to guarantee no collisions with other constructs that might already exist when new constructs are introduced.
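
As a minimal sketch of qualified names on constructs other than named templates, the following declares a key table and a variable in a namespace and references both from XPath. The namespace URI and the names my:topics and my:note are hypothetical and chosen only for illustration:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xtm="http://www.topicmap.org/xtm/1.0"
                xmlns:my="http://www.example.com/ns/sketch"
                version="1.0">

<xsl:output method="text"/>

              <!--qualified names cannot collide with unqualified ones-->
<xsl:key name="my:topics" match="xtm:topic" use="@id"/>
<xsl:variable name="my:note" select="' (duplicate id)'"/>

<xsl:template match="/">
  <xsl:for-each select="//xtm:topic">
    <xsl:value-of select="@id"/>
                 <!--the key name is given to key() as a qualified name-->
    <xsl:if test="count(key('my:topics',@id)) > 1">
      <xsl:value-of select="$my:note"/>
    </xsl:if>
    <xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

A stylesheet importing this fragment may declare its own key named "topics" or variable named "note" without conflict.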

The following fragment is from topicmap.xsl:

<xsl:stylesheet xmlns:xtm="http://www.topicmap.org/xtm/1.0"
                xmlns:xlink="http://www.w3.org/1999/xlink"
                xmlns:expenv="http://www.CraneSoftwrights.com/ns/expxslt"
                exclude-result-prefixes="expenv"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:import href="common.xsl"/>   <!--definition of expenv: constructs-->
....
                         <!--display the topic's name in the hyperlink-->
              <xsl:call-template name="expenv:topic-name"/>
....

The following fragment is from common.xsl:

<xsl:stylesheet xmlns:xtm="http://www.topicmap.org/xtm/1.0"
                xmlns:xlink="http://www.w3.org/1999/xlink"
                xmlns:exp="http://www.CraneSoftwrights.com/ns/expxslt"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                exclude-result-prefixes="exp"
                version="1.0">
....
    <!--display a topic's name, the display name takes precedence, but if not
        present, the base name is used in its stead-->
<xsl:template name="exp:topic-name">
....

"{http://www.CraneSoftwrights.com/ns/expxslt}topic-name" is the effective name of the template in both files because the namespace prefix is irrelevant between stylesheet fragments. This does assume that no-one else is already using Crane's domain in their namespace URI values for their own (nefarious?) reasons.

4.5: Text no-operation instructions

Sometimes the volume of markup needed to express what is desired detracts from the maintainability and legibility of stylesheets. Consider the following fragment:

<xsl:template name="exp:assocrl">
  <xsl:call-template name="exp:href"/>
  <xsl:text> (</xsl:text>
  <xsl:value-of select="@anchrole"/>
  <xsl:text>)</xsl:text>
</xsl:template>

This can be rewritten more succinctly as follows:

<xsl:template name="exp:assocrl">
  <xsl:call-template name="exp:href"/>
  <xsl:text/> (<xsl:value-of select="@anchrole"/>)<xsl:text/>
</xsl:template>

Note how the instructions themselves add nothing to the result tree but their presence in the stylesheet tree limits the length of the text node containing non-whitespace characters, thus producing the desired result on a single line of the stylesheet instead of three.

4.6: Stylesheets writing stylesheets with legible results

The synthesized stylesheets created by the environment to import the topic-specific rendering stylesheets are serialized like all other XML outputs. Any whitespace-only text nodes in the creating stylesheet between the literal result elements destined for the created stylesheet are lost, producing a difficult-to-read result.

Using indent="yes" produces a result according to the preferences built in to the XSLT processor, which are not under the control of the stylesheet writer, and indentation is possibly not even supported by the processor in the first place.

Initial attempts at creating and controlling the emission of whitespace-only text nodes to produce the desired indentation of the result were awkward, messy and difficult to support. The easily supported method was as follows:

  <xsl:variable name="result" xml:space="preserve">
<xslt:stylesheet version="1.0"
                 xtm:dummy="http://www.topicmap.org/xtm/1.0"
                 saxont:dummy="dummy"
                 extension-element-prefixes="saxon">
...
<xslt:template match="/">
  <xslt:call-template name="do-next-topic">
    <xslt:with-param name="topic-ids"
      select="concat( normalize-space( $topic-ids-needing-stylesheet ) ,' ')"/>
  </xslt:call-template>
</xslt:template>
 ...
</xsl:variable>
...
  <xsl:copy-of select="$result"/>         <!--copy of variable to the result-->

Note that the use of xml:space="preserve" preserves all descendant whitespace-only text nodes in the stylesheet tree. When the above variable is copied to the result tree, the shape of the markup is preserved (though whitespace between attribute specifications is not preserved since it is never present in an XPath tree).
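
The fragment above uses an xslt: prefix for the elements destined for the created stylesheet. One way to arrange this, shown here as a self-contained sketch and not necessarily how the environment's scripts do it, is the <xsl:namespace-alias> instruction. The alias namespace URI is hypothetical:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xslt="http://www.example.com/ns/alias"
                version="1.0">

        <!--elements written with xslt: end up in the XSLT namespace-->
<xsl:namespace-alias stylesheet-prefix="xslt" result-prefix="xsl"/>

<xsl:output method="xml"/>

         <!--whitespace in the variable is kept, so the layout survives-->
<xsl:variable name="result" xml:space="preserve">
<xslt:stylesheet version="1.0">

<xslt:template match="/">
  <xslt:apply-templates/>
</xslt:template>

</xslt:stylesheet>
</xsl:variable>

<xsl:template match="/">
  <xsl:copy-of select="$result"/>
</xsl:template>

</xsl:stylesheet>

The prefix that appears in the serialized result may vary between processors; it is the namespace URI that makes the result a stylesheet.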

4.7: Parameterized processing of source node trees

To make the identifiers unique in the result of all the topic maps being merged, parameterized template rules were very useful to carry unique information associated with each source node tree. The guaranteed unique information used is the generated identifier for the root node of each tree:

    <xsl:variable name="rendered-root"
                  select="document( $rendered-file, / )"/>
    <xsl:variable name="rendered-by-root"
                  select="document( $rendered-by-file, / )"/>

                             <!--add in the topic map that is to be rendered-->
    <xsl:for-each select="$rendered-root">
      <xsl:apply-templates select="/*/node()">
        <xsl:with-param name="prefix"
                        select="concat(generate-id($rendered-root),'-')"/>
      </xsl:apply-templates>
    </xsl:for-each>

     <!--add in the topic map that describes how the rendering is to be done-->
    <xsl:for-each select="$rendered-by-root">
      <xsl:apply-templates select="/*/node()">
        <xsl:with-param name="prefix"
                        select="concat(generate-id($rendered-by-root),'-')"/>
      </xsl:apply-templates>
    </xsl:for-each>

        <!--add in the topic map that ties the two other topic maps together-->
    <xsl:apply-templates select="/*/node()">
      <xsl:with-param name="prefix"
                      select="concat(generate-id(/),'-')"/>
    </xsl:apply-templates>

The following would utilize the preserved value in the recreation of attribute nodes:

<xsl:template match="@id | @scope | @types">
  <xsl:param name="prefix"/>
  <xsl:attribute name="{name(.)}">
    <xsl:value-of select="$prefix"/>
    <xsl:value-of select="."/>
  </xsl:attribute>
</xsl:template>

Note that built-in template rules cannot be relied upon because they do not support the passing of any parameters. The following was included in the stylesheet to ensure the built-in rules would not be engaged:

           <!--copy all elements in the tree, possibly with descendent nodes-->
<xsl:template match="*">
  <xsl:param name="prefix"/>
  <xsl:copy>
    <xsl:apply-templates
                       select="text()|comment()|processing-instruction()|*|@*">
      <xsl:with-param name="prefix" select="$prefix"/>
    </xsl:apply-templates>
  </xsl:copy>
</xsl:template>

            <!--copy all leaves of the tree; no descendent nodes of any kind-->
<xsl:template match="text()|comment()|processing-instruction()|@*">
  <xsl:copy/>
</xsl:template>

4.8: Protecting stylesheets from being used in the wrong context

The topic rendering stylesheets are expected to be imported by the synthesized stylesheets generated by the system. To prevent the authored stylesheets from being used out of context as a standalone stylesheet, the following template rule for the root node was defined (overridden by the importing stylesheet's template rule for the root node):

<xsl:template match="/">
  <xsl:message terminate="yes">
    <xsl:text>This stylesheet is not designed to process an entire </xsl:text>
    <xsl:text>source tree.  Rather, it acts on a single topic in a </xsl:text>
    <xsl:text>context defined by an importing stylesheet.</xsl:text>
  </xsl:message>
</xsl:template>

To ensure the transformation tasks in the environment did not act on unexpected input, the following was used in each of the merging stylesheets:

<xsl:template match="/"> <!--prune epilogue and prologue from source-->
  <xsl:apply-templates select="*"/>
</xsl:template>

<xsl:template match="/*">
  <xsl:message terminate="yes">
    <xsl:text>Unexpected document for merge combination request</xsl:text>
  </xsl:message>
</xsl:template>

<xsl:template match="/xtm:topicmap" priority="1">
  <xsl:copy>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

Note that the use of priority above is mandatory because the latter two match patterns both have an implicit priority of 0.5 and therefore conflict.

5: Conclusions

5.1: Desired changes to the topic map document model

The anchrole= attribute of the <assocrl> element does not provide a lot of utility and it would be better if the user were constrained to characterize the role of an anchor by using a topic.

5.2: Desired changes to XSLT

The key() function currently works with simple string values. If it were possible to tokenize the lookup string into individual name tokens, then the key() function could implement the equivalent of IDREFS without resorting to recursively called named templates. Although doable, the effort involved exceeded the available time for proving the concept of the experimental environment, so the scripts developed do not support multiple inheritance of class types.
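
For illustration only, the recursive workaround could be sketched as follows. This is not taken from the environment's scripts; the names topics-by-id and each-type are hypothetical, and the sketch merely lists the identifier of every topic named in a space-delimited types= attribute:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xtm="http://www.topicmap.org/xtm/1.0"
                version="1.0">

<xsl:output method="text"/>

<xsl:key name="topics-by-id" match="xtm:topic" use="@id"/>

<xsl:template match="/">
  <xsl:for-each select="//xtm:topic[@types]">
    <xsl:value-of select="@id"/>
    <xsl:text>:</xsl:text>
    <xsl:call-template name="each-type">
            <!--normalize so that every token is followed by one space-->
      <xsl:with-param name="ids"
                      select="concat(normalize-space(@types),' ')"/>
    </xsl:call-template>
    <xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
</xsl:template>

           <!--look up the first token, then recurse on the remainder-->
<xsl:template name="each-type">
  <xsl:param name="ids"/>
  <xsl:if test="contains($ids,' ')">
    <xsl:for-each select="key('topics-by-id',substring-before($ids,' '))">
      <xsl:text> </xsl:text>
      <xsl:value-of select="@id"/>
    </xsl:for-each>
    <xsl:call-template name="each-type">
      <xsl:with-param name="ids" select="substring-after($ids,' ')"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>

</xsl:stylesheet>

The recursion ends when no space remains in the string, that is, once the last token has been consumed.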

The lack of a conditional expression in XPath could be satisfied in a few critical areas of XSLT if certain constructs could use template evaluation in place of attribute evaluation. The <xsl:key> instruction could use a template in place of the use= attribute. The <xsl:sort> instruction could use a template in place of the select= attribute.

The author has already forwarded the above suggestions to the XSL working group at W3C.

There are already changes planned for XSLT that will improve the portability of the scripts written for this experimental environment. In particular, multiple file emission is currently hardwired for Saxon and could be replaced with the anticipated equivalent facility in XSLT 1.1.

5.3: Performance

Reducing the number of invocations of the XSLT processor from the number of topics to the number of stylesheets used by the rendering topic map reduced the time a great deal, but no other strategies could be found at the time to lessen the burden of loading the entire topic map for each process.

Given that topic maps can be very large instances, the impact of loading the entire source tree into an XPath data model is significant enough to perhaps eliminate consideration of XSLT altogether for large topic maps. The model embodied by these experiments, however, is not limited just to the use of XSLT.

One possibility is to load the entire topic map once, write out the subset of the map needed for each stylesheet, and then run each stylesheet process on the topics in its subset. The drawback of this is that should the rendering involve traversing some of the links emanating from a given topic, those links might be broken in split maps.

5.4: Summary

The concept works and XSLT can be used everywhere, but perhaps XSLT isn't the best choice for navigating the flat web of hyperlinks in a topic map as required by topic map merging. The inherent facilities in XSLT are hierarchically oriented and the extensive requirement for using recursively called templates may make a programming language a more appropriate choice. Stylesheet writers familiar with functional languages such as LISP could more easily maintain some of the internal scripts in the experimental environment than writers unfamiliar with the techniques.

However, with the environment's scripts working and debugged, the external stylesheets to render individual topics are as simple or as complex as desired by the writer. The flexibility inherent in the topic map merging of the map to be rendered with an external specification of the rendering desired gives users of topic maps the ability to easily produce alternative projections of the topic map content.

6: Continuing Effort

These experiments are continuing past the deadline for publication of this paper. Updates to this paper from this continuing effort can be found through the "Recommended Reading" section of the author's company's web site http://www.CraneSoftwrights.com/#reading.

All of the scripts and samples are made available through the free resources section of the company web site at http://www.CraneSoftwrights.com/links/res-xsltexp.htm. With this freely available environment, anyone can create their own static renderings of an XTM Topic Map with their own stylesheets.

To use these scripts yourself, create stylesheets to use for rendering topics from your topic map, create a rendering topic map associating these stylesheets with the topics of your topic map, associate your rendering topic map with your topic map in a control topic map, then run the control topic map in this environment.

And keep your eye on updates to this resource that will support the final Version 1.0 flavor of the XTM Topic Map document type definition.
