Approach: 2007
Last year's redesign continues to pay off: the 2007 Proceedings go online with minimal changes to layout or functionality. We made only a couple of small enhancements and fixed some bugs.
Credits (2007)
Tonya Gaylord was again responsible for editorial preparation of manuscripts submitted by the authors in XML. Again, Wendell Piez oversaw, maintained and ran the production pipeline. Tommie Usdin provided topic indexing for the papers, following (with minor modifications) the scheme designed by Kim Tryka and implemented in 2006.
The tools used this year include Saxonica's Saxon processor (versions 8.9 and 6.5.5), SyncROsoft's oXygen XML editor/IDE, XSL Formatter from Antenna House, Helios Software's TextPad text editor, and various shell scripts and command-line utilities.
As this page grows longer, it is perhaps worth reiterating that production of this site also depends on our authors, who submit their papers in an XML format. They also deserve credit!
Approach: 2006
If you have looked at these pages before you may notice this year a top-to-bottom redesign of the Proceedings site. The look and feel has changed, as well as some of the basic functionality. Our aim has been to improve both the usability and the aesthetics of the Proceedings site, while bringing its code more into line with current web standards, browser capabilities and conformance, and best encoding practices. This is the first major redesign since we originally published these proceedings in 2001; the ease with which this large task was executed, with no resources but time and expertise, continues to testify to the power of XML-based technology.
The largest functional change has to do with the Topic index. We have tried both to rationalize the keywords assigned to papers, and to cluster those keywords in a reasonable, simple, and intuitive taxonomy. This task is far from over and we expect to be making modifications in years to come.
Credits (2006)
We owe special thanks to Kim Tryka for the all-new Proceedings makeover. She designed the new HTML/CSS pages, index pages and topics framework, and wrote XSLT transformations (revised from our existing code base) to generate them. Index pages are now being generated using XSLT 2.0 (by means of Saxon 8.7.3b from Saxonica), which has significantly reduced the size and complexity of the code. HTML versions of the papers are still being generated using XSLT 1.0, which remains a viable technology well-suited to this task. PDF versions of the papers use the same XSL stylesheets as last year, run through the Antenna House formatting engine.
Editorial work was largely accomplished by Tonya Gaylord, with slight assistance from Kim Tryka and Wendell Piez. Overall production of the Proceedings, including assisting with, integrating and running the new stylesheets, was performed by Wendell Piez under the direction of Tommie Usdin.
Approach: 2005
We removed the keywords index this year, replacing it with a Google site search tool. The keywords had been getting more and more scraggly; author-provided keywords in an emerging field seem to be like that. Lacking the resources necessary for serious investment in ontology design, authority control, keyword normalization and QA, and with the Aggregated Proceedings continuing to grow, we decided to take the better part of valor and rely on Google's excellent indexing.
In other respects we continue to run on an infrastructure with minimal changes from last year.
Credits (2005)
Again this year Tonya Gaylord did most of the editorial work, Wendell Piez provided infrastructure engineering and support during final production, and Tommie Usdin gave a bit of spot assistance. Thanks again to our authors and to the folks at Antenna House.
Approach: 2004
The 2004 Proceedings have again been produced, on a thinner shoestring than ever, at Mulberry Technologies Inc., using technologies (stylesheets and XML processors) only slightly enhanced from last year's. With limited resources available, we have striven for an editorial and production process as close to "lights out" (no manual intervention) as possible.
Special thanks are due to our authors, without whose work to provide us with valid XML versions of their papers these Proceedings could not appear.
In keeping with our role as canaries (testing bleeding-edge technologies), this year for the first time we allowed authors to incorporate MathML in their papers. Results were mixed: although MathML itself appears sufficiently expressive for their purposes, our authors reported difficulty using it, and current tools for production (formatters and browsers) seem inconsistent in their expectations.
Largely due to these complications, the XML source files offered on the site are now normalized stand-alone versions of the originals, with entity references expanded and no references to a DTD. Accordingly they will now parse in a wider range of tools.
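The kind of normalization described above can be sketched in a few lines of Python. This is only an illustration, not the process actually used for the Proceedings: the sample document, its element names and the `normalize` function are all hypothetical, and the standard-library parser used here expands only entities declared in the internal DTD subset (it does not fetch an external DTD).

```python
import xml.etree.ElementTree as ET

# Hypothetical paper whose internal DTD subset declares one entity.
SOURCE = """<?xml version="1.0"?>
<!DOCTYPE paper [
  <!ENTITY conf "Extreme Markup Languages">
]>
<paper><title>Notes from &conf;</title></paper>"""


def normalize(xml_text: str) -> str:
    """Return a stand-alone serialization of the document.

    Parsing expands the entity references declared in the internal
    subset; ElementTree does not write a DOCTYPE declaration when it
    serializes, so the result carries no reference to a DTD.
    """
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


normalized = normalize(SOURCE)
print(normalized)
```

Because the output has no DOCTYPE and no unresolved entity references, a parser that cannot locate the original DTD can still read it.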
Credits (2004)
Validation and markup correction (where necessary) were provided again this year by Tonya Gaylord; again, production was the responsibility of Wendell Piez. These Proceedings continue to testify to the power of XML, as the heavy lifting was done by our authors and by the developers of the software applications we have used.
Approach: 2003
- As in previous years, submissions to the Extreme 2003 program were collected in XML, tagged according to the extremepaper DTD. (See the Mulberry Extreme web page.)
- We produced the PDF versions of the papers using Antenna House's XSL Formatter. Formatter is a professional solution used worldwide that conforms to the XSL-FO V1.0 W3C Recommendation.
- Files, including graphics files, were renamed according to a consistent format, and tagging errors in the XML were corrected. Index terms (keywords) were added to many papers.
- After light copy editing, with special attention to graphics, the site was generated through batch processing running Saxon and XSL Formatter.
Software we used:
- Plain-text editor
- SoftQuad XMetaL 3.0 XML editor
- Saxon XSLT engine
- Antenna House XSL Formatter
- Various parsers (freely available XML tools), text editors, etc.
Credits (2003)
The HTML web site and PDF versions of the papers were designed and implemented by Wendell Piez, based on last year's work (see below). The papers were edited by Tonya Gaylord, and Kate Hamilton assigned additional keywords to enhance access through the keyword index.
Antenna House provided us with the use of their excellent tool, XSL Formatter. In addition to supporting the W3C Recommendation, Formatter also provides numerous proprietary extension properties to enhance formatting functions beyond those currently provided for by XSL-FO. To download an evaluation copy of Formatter or for more information about Antenna House, please visit Antenna House.
Approach: 2002
- As in previous years, submissions to the Extreme 2002 program were collected in XML, tagged according to the extremepaper DTD. (See the Mulberry Extreme web page.)
- We revised and enhanced the stylesheets we used for processing the papers last year, including the XSLT-based indexing infrastructure. Some design decisions were revisited in an effort to improve the general look and functionality of the site.
- Because we did not have the services of a professional typesetter in 2002, we determined we needed a production-quality XSL formatting engine. We liked XEP from RenderX, not only because of its standards-compliance and its coverage of the specification, but also because it's in Java. (We're a mixed shop, and like to use software, when we can, that does not lock us into a particular hardware platform.)
- RenderX kindly provided us with an installation of their software for the conference Proceedings -- thanks!
- Files, including graphics files, were renamed according to a consistent format, and graphics were converted into GIF or JPEG format for web display. Index terms (keywords) were added to many papers.
- Having improved the logic of our directory structure, we migrated last year's Proceedings into the new site structure. Since indexes are created by polling the XML data for authors' names, keywords, and other metadata, we were able to integrate the two years with little fuss.
- After light copy editing, with special attention to graphics, the site was generated through batch processing running Saxon and XEP.
Software we used:
Credits (2002)
The HTML web site and PDF versions of the papers were designed and implemented by Wendell Piez, based on work in 2001 by himself and Paul Rosenberg. Thanks to B. Tommie Usdin, Debbie Lapeyre, and Kate Hamilton for invaluable feedback and assistance! Assignment of keywords to the papers was made by Kate Hamilton. The papers were edited by B. Tommie Usdin with the assistance of Debbie Lapeyre, Tonya Gaylord, Wendell Piez, and Kate Hamilton.
RenderX Inc. provided us with the use of their excellent XSL-FO formatter, XEP. The conference CD with search engine was provided by NewBook.
Approach: 2001
- Paper submissions to the Extreme 2001 conference were received in XML, using the extremepaper DTD. (See the Mulberry Extreme web page.)
- After acceptance and editorial review, the papers were provided in electronic format to a professional typesetter, Impressions, for composition of print Proceedings for the conference venue.
- Back at Mulberry, a unified naming convention for the files and graphics was imposed. Graphics were converted to GIF (for screen display in HTML) and JPEG (for the XSL-FO-generated PDF).
- An overall plan for the papers site was outlined.
- Two people, working largely independently, developed the XSLT stylesheet which generates the HTML version of the papers, and the XSL-FO stylesheet which generates the PDF version of the papers. There was some consultation in order to emphasize the design features characteristic of each tool and to resolve content-handling issues which affected both outputs. For example:
  - HTML handles graphics as links, allowing the user to decide whether to download them; PDF displays graphics on the page.
  - In the HTML version, a separate page provides for the abstract and table of contents. This allows hyperlinking from the table of contents into the main paper without screen jumping (and without frames). PDF papers provide the paper abstract, but no table of contents (as the papers are generally not long enough to warrant a ToC in print).
- A directory structure for the site was determined and batch processes run to populate it with the generated files.
- Following the conference, many of the authors gave us presentation versions of their papers, including presentation formats such as HTML or PowerPoint, sometimes along with demos they had presented. We dubbed these submissions "author packages".
- Where we had an author package but no paper had been submitted (these presentations had been accepted as "late breaking"), we created "stub" documents in place of papers. This allowed us to give these presentations titles, authors, and keywords that the indexing routines could pick up.
- Indexing was provided through a two-step process. First, a list of files containing papers and stubs was submitted to a stylesheet that compiled a synopsis: a list of "profiles" of the papers. Each profile contains a unique label or code for the presentation, along with its title and all its authors and keywords. Then a stylesheet was run over this synopsis to create the actual indexes with links.
- The compiled synopsis also provided useful information to the stylesheet generating HTML versions, since by consulting it, a stylesheet could determine next and previous papers in the sequence (thus allowing them to be hyperlinked directly).
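The two-step indexing described in the last two items was done with XSLT stylesheets; the Python sketch below only illustrates the same idea. The sample papers, the element names (`paper`, `title`, `author`, `keyword`, `synopsis`, `profile`) and the function names are assumptions made for the example, not the actual extremepaper tagging or the production code.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stand-ins for the paper and stub files, in program order.
# The keys play the role of each presentation's unique code.
PAPERS = {
    "p01": "<paper><title>Overlap</title><author>A. Smith</author>"
           "<keyword>markup</keyword><keyword>overlap</keyword></paper>",
    "p02": "<paper><title>Topic Maps</title><author>B. Jones</author>"
           "<keyword>markup</keyword></paper>",
    "p03": "<paper><title>Late Breaking</title><author>A. Smith</author>"
           "<keyword>overlap</keyword></paper>",
}


def build_synopsis(papers):
    """Step 1: compile one profile (code, title, authors, keywords) per paper."""
    synopsis = ET.Element("synopsis")
    for code, text in papers.items():
        doc = ET.fromstring(text)
        profile = ET.SubElement(synopsis, "profile", code=code)
        for name in ("title", "author", "keyword"):
            for el in doc.iter(name):
                ET.SubElement(profile, name).text = el.text
    return synopsis


def build_index(synopsis, field):
    """Step 2: map each value of `field` to the codes of the papers carrying it."""
    index = defaultdict(list)
    for profile in synopsis.iter("profile"):
        for el in profile.findall(field):
            index[el.text].append(profile.get("code"))
    return dict(index)


def neighbours(synopsis, code):
    """Look up the previous and next paper codes in synopsis order."""
    codes = [p.get("code") for p in synopsis.iter("profile")]
    i = codes.index(code)
    previous = codes[i - 1] if i > 0 else None
    following = codes[i + 1] if i + 1 < len(codes) else None
    return previous, following


synopsis = build_synopsis(PAPERS)
keyword_index = build_index(synopsis, "keyword")
author_index = build_index(synopsis, "author")
print(keyword_index)
print(neighbours(synopsis, "p02"))
```

Keeping the synopsis as a separate intermediate document means the index builder and the next/previous lookup both read one small file rather than reopening every paper.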
Content of the papers
It was necessary to improve the XML file content in a few places to deal with factors which the typesetter had taken care of in the page-layout program, or which had not been relevant to page design. These include:
- Wherever possible, <highlight> (a generic emphasis element) was retagged as <url>, <xref>, etc. as appropriate.
- URIs were tagged to generate hypertext links.
- Figure and table references were normalized, and the word "Figure" was removed from figure titles and captions (since it would be generated); we also provided figure titles where the author had not done so, or had put the title in a caption.
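Clean-up of this kind can be partly automated. The Python sketch below is a minimal illustration under stated assumptions: `highlight` and `url` are the element names mentioned above, but the `figure`/`title` structure, the sample document, the patterns and the function names are hypothetical, and the original corrections may well have been made by hand.

```python
import re
import xml.etree.ElementTree as ET

# Assumed patterns: a highlight whose whole text is a URL, and a
# "Figure N:" style prefix at the start of a figure title.
URL_PATTERN = re.compile(r"^(https?|ftp)://\S+$")
FIGURE_PREFIX = re.compile(r"^\s*Figure\s+\d+\s*[:.]?\s*", re.IGNORECASE)

SAMPLE = (
    "<paper><para>See <highlight>http://example.org/paper</highlight> and "
    "<highlight>this term</highlight>.</para>"
    "<figure><title>Figure 3: Pipeline overview</title></figure></paper>"
)


def retag_urls(root):
    """Rename <highlight> to <url> when its only content is a URL."""
    count = 0
    for el in list(root.iter("highlight")):
        if len(el) == 0 and el.text and URL_PATTERN.match(el.text.strip()):
            el.tag = "url"
            count += 1
    return count


def strip_figure_prefix(root):
    """Drop a leading 'Figure N' label from figure titles (it is generated)."""
    for title in root.findall(".//figure/title"):
        if title.text:
            title.text = FIGURE_PREFIX.sub("", title.text)


root = ET.fromstring(SAMPLE)
retagged = retag_urls(root)
strip_figure_prefix(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

A highlight that is ordinary emphasis ("this term" in the sample) is left alone, so anything the pattern does not recognize still needs an editor's judgment.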
Converting to HTML via XSLT
- In order to demonstrate a transformation that would reach the widest range of browsers, we deliberately targeted a "dumbed-down" HTML. Although the use of such means as <font> tags instead of CSS makes for document code that is less clean and less easy to maintain, these requirements were judged to be less important, on balance, than a relatively simple utilitarian design that would be broadly accessible.
- The HTML design also makes good use of HTML's hypertext capabilities (for example, in the linked table of contents and the linking of cross-references and figures). Since we were getting print papers by other means, we did not have to worry about creating a web page design that would print.
Converting to PDF via XSL
- As an XSL formatter, we opted for Apache's open-source processor FOP. Due to an obscure bug in Xerces (the default parser) regarding the handling of external unparsed entities, we switched to handling the parsing and transformation through Michael Kay's Saxon processor. This worked well. It is encouraging to see that the potential of standards-based interoperability is being realized (in this case, because of the quasi-standard TrAX API, supported by both FOP and Saxon).
- In version 0.19, FOP is not, however, "industrial grade": in particular it has shortcomings in its handling of keeps (assuring that, for example, a page break does not occur between a section's header and first paragraph), floats (for example, the placement of a graphic, with its caption, in an optimal position on a page), and inline formatting. But we did not expect publication-quality output from this beta package, and present the results in any case as a demonstration of what the tools are capable of at this point in the technology's evolution (mid-2001).
Software we used (2001):
- Plain-text editor
- SoftQuad XMetaL editor (the only commercial product we used)
- Saxon XSLT engine
- Apache FOP XSL formatter
- Various parsers, etc. (freely available XML tools)
Credits (2001)
The HTML web site was designed and built by Paul Rosenberg with assistance from Wendell Piez, based on a standalone Extreme Paper HTML stylesheet by Wendell Piez. The XSL stylesheet to create PDF output via XSL-FO was designed and built by Wendell Piez. Assignment of keywords to the papers was made by Kate Hamilton. The papers were edited by B. Tommie Usdin with the assistance of Tonya Gaylord, Wendell Piez, and Kate Hamilton.
Impressions, Inc. provided typesetting of the print proceedings.