Abstract
The abstract was not available at the time the proceedings were created. Please check an updated version of the paper abstracts at the conference proceedings web site.
Although in macro-economic analysis publishing may be considered one industry sector, we are all aware that the many different types of publishing applications differ significantly in business models, marketing, and, most relevant to today's discussion, production techniques. Additionally, publishing environments themselves impose unique requirements on the workflow. These processes differ between large and small organizations as well as according to the type of work they produce.
Large publishers are more likely to have an in-house production staff. Larger volume means a wider diversity of incoming manuscripts, produced in different software applications including MS Word, WordPerfect, LaTeX, and others. The wide variety of books they offer, in subjects such as math, chemistry, geography, and psychology, necessitates special techniques.
In a large publisher a distinct division of work exists between the design, production, and editorial departments, whereas a small publisher will have more blurred work responsibilities among staff and will rely more heavily upon outsourcing.
And finally, consider for a moment the production cycles involved in each of these endeavors - they vary enormously. For newspapers, the workflow is centered on a daily production cycle; for academic textbooks, the cycle may exceed a year.
In this paper we'll discuss different ways XML-centric workflows can be integrated into these processes with minimal disruption and training. We'll consider the software requirements and take a look at some new workflow possibilities enabled by XML, along with the real-world experience of our clients in implementing them.
These publishing applications call for many different types of software solutions, with very different technical requirements in each sector. Medical publishers require cross-referencing, citations, formatting, and bookmarking, which are not at all germane to the jump-story and classified requirements of daily newspapers. Very likely neither organization will have to concern itself with the subtlety of integrating math and chemistry XML as required by college textbook publishers.
Anyone considering the adoption of new technology must be clear about his or her goals for the project. Publishers must be clear and specific about why XML is being used or considered. What are you trying to accomplish: an increase in print production productivity, or a decrease in the cost of outsourced XML? Are you seeking faster transformation of content to alternative media channels, or archiving longevity and vendor independence?
These goals directly relate to the publisher's product and organization. Does the published product lend itself to rules-based layout driven by XML? Where does this publisher's work fit across the spectrum of complexity (seen in the accompanying slide illustration)? Do you work with complex, highly designed consumer graphic material? Are you creating textbooks or newspapers, or are Web pages the desired result? ROI depends on the answers.
Clearly, Web pages lend themselves most easily to rules-based pagination, with novels, newspapers, academic journals, academic textbooks, and consumer magazines toward the other end of the spectrum, where layout is most complex. In each of these cases the nature of the content strongly influences whether the adoption of XML tools succeeds.
Another important issue is the organizational structure of the publisher. Does the publisher have in-house control of its content generation and production staff, as a newspaper producer does? In the past, the answer determined whether the imposed authoring structure and the editorial retraining it required could be undertaken at all. With an in-house staff it was possible, albeit at significant cost. For publishers relying on outsourced help, external authors, or freelance editorial staff, it was not a possibility they could consider.
Among the critical factors in choosing XML workflows are the most important revenue sources, the capacity of the editorial group to adapt to change, the intended reuse of the content, and whether automatic layout offers potential gains in production.
In our view, the publisher will almost always be served well by implementing XML somewhere in their process, even if only implemented as a final archive medium. In this case, for example, the publisher benefits from having easy access to the content.
However, to gain the most benefit from deploying XML systems, the publisher will want to consider how XML can positively affect its workflow processes. How well the new XML tools blend with current tools and practices will determine the success (measured in ROI) of the project.
For several years we have heard much of the vaunted XML paradigm of "write once, publish many" and seen the following diagram. In this scenario XML is the authoring tool from which different outputs flow (automatically!). While this is theoretically possible, it is practically difficult to deploy in most publishing environments.
The problems encountered were several.
First, XML authoring tools were for the most part seen as difficult to use and unfriendly to authors. They imposed a rigid structure to which authors were forced to conform during content creation. This was in direct opposition to the way in which creative authors approached their work. The result was widespread non-adoption of these tools in creative or freelance publishing environments.
Secondly, the print edition remained the primary revenue source for most publishers. This meant that most workflow processes correctly focused on getting the content "right" for the print edition. This implied a key factor: the final edit was always prepared in the print edition software application, i.e. the page layout tool (most often Quark). And despite software developers' efforts, XML extraction from Quark to complex DTDs has proven to be painful for large-volume publishers.
Finally, publishers had to make significant investments to acquire the new XML authoring technology. Combined with the factors above, this made it difficult if not impossible for many publishers to achieve a return on their investment in XML.
As a result, the workflow we most frequently encounter with publishers is similar to the one you see on the slide - a linear approach. Initial content creation and subsequent early edit are made using a number of different systems such as Word, QPS, InCopy and others. Early templates with dummy text are prepared on the layout side for design approval.
After several rounds of cleanup and revision, text is sent to the layout software, typically Quark or InDesign. At this point the copy is married to the design, and final edits are performed directly in the layout software. Today we frequently see processes where paper proof distribution is the norm, with corrections marked up on the proof and hand-delivered back for entry in the layout software.
When the job is ready for print, the production of the XML starts. The layout software becomes the source document for the XML. The XML is extracted either programmatically, or with manual cut and paste operations typically sourced from overseas production plants.
Today new XML technologies permit publishers both to incorporate XML into their existing workflows and to benefit from new opportunities. Our view of the most effective general configuration for XML publishing is what we call the XML-centric workflow, or the XML star. You see this diagrammed in the accompanying slide.
In this model, each of the diverse applications involved in content creation, revision, or layout should be served by software capable of importing or extracting XML and passing it seamlessly to the other applications. The XML technology must provide cross-application hosting of the XML ruleset, be managed by a single and hopefully simple administration console, and provide roundtrip XML between the applications involved.
Implementing this XML-centric process star does not require authors or designers to learn new software applications, nor does it impose rigid structures on their creative process. The new XML technologies keep the import and extraction of XML under the hood, out of view of the average user.
In addition to these general requirements, the following specific issues must be addressed:
Precise graphics handling
Parallel processing
Metadata handling
Table handling, including CALS compliance
Resolution of styling inconsistencies
ISO character entity handling
DTD-aware decision making
Valid XML output
Batch processing
MathML capability
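Two of the requirements above, batch processing and valid XML output, can be illustrated with a minimal sketch. It assumes a directory of XML files and uses only Python's standard library, which checks well-formedness but does not validate against a DTD; a validating parser would be needed for that. The function name check_directory is hypothetical and is not taken from any tool discussed here.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_directory(directory):
    """Parse every .xml file in a directory and report well-formedness.

    Returns a dict mapping each file name to None (parsed cleanly) or to
    the parser's error message. This checks well-formedness only; it does
    NOT validate the files against a DTD.
    """
    results = {}
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            ET.parse(str(path))
            results[path.name] = None
        except ET.ParseError as exc:
            results[path.name] = str(exc)
    return results
```

Run over a whole job folder, a report of this kind lets a production group spot broken files before they reach layout, without opening each file by hand.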
And while the new tools arriving on the market facilitate the adoption and integration of XML into current processes, at the same time they enable new more cost effective processes, and provide faster turnaround than current methods.
The new technology used must be capable of dealing with the issues of parallel processing. While manuscripts are being prepared and edited, the design templates need to be prepared. All of these processes should be structured around a single DTD and administered by a single console for style mapping across the different applications.
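The idea of a single style mapping shared across applications can be sketched as a small lookup table. This is only an illustration under stated assumptions: the element names, the style names, and the helper functions are all hypothetical, and a real administration console would store and edit such a mapping rather than hard-code it.

```python
# Hypothetical mapping: one entry per DTD element, naming the style that
# represents it in each application. All names are illustrative only.
STYLE_MAP = {
    "title": {"word": "Heading 1", "quark": "ChapterTitle"},
    "para": {"word": "Body Text", "quark": "Body"},
    "note": {"word": "Footnote Text", "quark": "Note"},
}


def style_for(element, application):
    """Return the style name used for a DTD element in one application."""
    return STYLE_MAP[element][application]


def element_for(style, application):
    """Return the DTD element that a given application style maps back to."""
    for element, styles in STYLE_MAP.items():
        if styles[application] == style:
            return element
    raise KeyError(f"no element mapped to {application} style {style!r}")
```

Because both directions read from the same table, manuscript preparation and template design can proceed in parallel and still agree on what each element means.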
When the initial copy is ready to be flowed to the page the XML tools should be able to use the XML to drive the pagination. This process should be capable of automatic, semi-manual, or interactive pagination.
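A minimal sketch of the automatic case follows, assuming each element type occupies a fixed number of lines on the page. The rule table, the element names, and the paginate function are hypothetical; real pagination tools apply far richer rules, and the semi-manual and interactive modes are not modeled here.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout rules: lines of page depth each element type occupies.
LAYOUT_RULES = {"title": 4, "para": 10, "table": 18}


def paginate(xml_text, lines_per_page):
    """Greedily flow the child elements of an XML document onto pages.

    An element that does not fit in the space left on the current page
    starts a new page. An element taller than a page is placed alone on
    its own page (a real tool would split it).
    """
    root = ET.fromstring(xml_text)
    pages = [[]]
    used = 0
    for child in root:
        lines = LAYOUT_RULES[child.tag]
        if pages[-1] and used + lines > lines_per_page:
            pages.append([])
            used = 0
        pages[-1].append(child.tag)
        used += lines
    return pages
```

The point of the sketch is that the element names in the XML, not manual placement, decide where each block lands; changing a rule in the table changes every page that uses that element.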
In our experience, this new XML star publishing configuration has resulted in a significant reduction of the total publishing effort in several publishing environments in which we have been involved.
We have seen that authors in fact can become more productive as they use their familiar tools assisted by user-friendly templates, which permit them to accomplish the styling they would have accomplished without the tools, but in a more simplified and direct manner. We have seen that revision cycles are shorter and that accuracy improves.
We have seen that page productivity increases as a result of enhanced parallel processing. Multiple spec designs are more easily generated. This leads to a faster approval cycle and a more generally enhanced design.
We have been working for the last year with a large legal publisher. Previously, the print files were obtained from an outside service bureau using a high-end proprietary composition system. A high cost was involved each and every year when the legal textbook required updating. As the several contributing authors all worked in MS Word, the publisher was required to either pay for conversion to Word from the proprietary composition language, or pay for a re-keying of all the text from the printed page.
The solution to this recurring annual cost was to obtain the composition files from the vendor in XML. Once the files are in XML, available XML technology makes it possible to import them into MS Word in a matter of minutes, using a standard filter for all the texts, and at a fraction of the cost. Delivery time to the authors was drastically reduced, resulting in a very significant time-to-market benefit.
Additionally, the legal text can now be transformed into XML directly from the Word manuscript very quickly, providing time-to-Web performance significantly faster than waiting for the print edition to be completed and transformed thereafter.
Using the new XML technology we were able to provide the publisher with a method that average users understand. No XSLT experience or skills are required to obtain the transformations - it is simple enough to point to an XML file directory and import the XML into a Word file. And these are not simple text transformations. Legal texts involve significant amounts of complex formatting, including tables, nested lists, and cross-references. Using the new XML technology, turnaround time for files has gone from several days to about 30 minutes for an average 1,200-page legal text.
Another client, a large publisher of K-12 textbooks, sought to accomplish two goals in a specific project. The publisher wanted to:
Increase productivity in Word to Quark workflows
Obtain XML from Quark for repurposing needs
The publisher particularly did not want to impose any drastic changes on the current workflows. The project had a tight schedule and involved external freelance authors, an internal editorial group, in-house design, and an outsourced composition service. No members of any of the groups involved had XML knowledge.
Our analysis of the products being produced showed that significant production benefits could be obtained using available XML tools that integrated easily with their current software, Word and QuarkXPress.
The layout, although differing significantly from product to product, had enough internal consistency to benefit from automated rules-based pagination techniques using XML.
We developed a DTD appropriate to the product, and prepared a Word template, which behind the scenes linked styles to the DTD. Authors found it easy to use and it imposed no special training requirements. Extraction of XML from Word occurred with an Export XML command from within Word.
Another tool permitted the design group to prepare standard Quark templates and associate them with DTD element names. This same tool permitted highly automated page makeup when the time came to import the manuscripts. As a result, production times improved dramatically, both in design and at the composition bureau.
As a single set of tools was used to administer the style mapping for both Word and Quark, the correction cycle permitted roundtrip of XML between both applications. Revision times decreased, and the entire correction cycle was shortened. The final XML was extracted from the final edited Quark files.
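The roundtrip described above can be sketched in a few lines, assuming each application's content is reduced to (style name, text) pairs and that both applications map their styles to the same DTD element names. The style names, element names, and functions are hypothetical and do not represent the tools used in the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical style mappings, both keyed by the same DTD element names.
ELEMENT_TO_WORD = {"title": "Heading 1", "para": "Body Text"}
ELEMENT_TO_QUARK = {"title": "ChapterTitle", "para": "Body"}


def to_xml(paragraphs, element_to_style, root_name="chapter"):
    """Turn (style name, text) pairs from one application into XML."""
    style_to_element = {s: e for e, s in element_to_style.items()}
    root = ET.Element(root_name)
    for style, text in paragraphs:
        ET.SubElement(root, style_to_element[style]).text = text
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text, element_to_style):
    """Turn XML back into (style name, text) pairs for an application."""
    root = ET.fromstring(xml_text)
    return [(element_to_style[child.tag], child.text or "") for child in root]


# Word manuscript -> XML -> Quark styles, then back again after corrections.
word_copy = [("Heading 1", "Plants"), ("Body Text", "Plants need light.")]
quark_copy = from_xml(to_xml(word_copy, ELEMENT_TO_WORD), ELEMENT_TO_QUARK)
back_in_word = from_xml(to_xml(quark_copy, ELEMENT_TO_QUARK), ELEMENT_TO_WORD)
```

Since the XML element names are the common ground, text corrected in either application can return to the other without restyling by hand.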
We thank you for your attention and open the forum to questions.