Abstract
This case study outlines the business drivers that led LexisNexis UK to invest in XML-enabled content management software as the first step in a long-range plan to migrate to an XML-based workflow. The initial ROI is shown and the positioning for the next phase of evolution is described.
Imagine, if you will, that you manage a group of editors and production personnel who edit and produce the information at the core of your company's business. You are a legal publisher with a long history in the field. You produce a variety of publications for legal and business professionals, in the form of books, looseleaf works, encyclopedias, reports, periodicals and journals. In addition, you offer an online resource for law and tax professionals and consolidated news feeds.
The name of your company is widely known within the UK and, indeed, around the world. You have an established brand that sets high standards for editing, presentation, and ease of access to the information you publish. You must meet and ensure these standards in your workflow because your firm has a reputation to maintain. Your team of editors prepares ongoing updates to publications, incorporating changes to regulations, new legislation, and recent case decisions. They also perform data cleanup and maintain a consistent style throughout the publications.
You have several important business requirements:
Your information must be complete, up to date, and authoritative.
Your information must be delivered in both electronic and printed versions.
You understand the importance of XML markup in processing information for multichannel delivery and see that over time you must adopt an XML workflow.
You must, in the meantime, maintain a current workflow based on Word.
You must justify any changes or investments based on immediate, measurable ROI: improved efficiency, reduced cost or time in the workflow.
Given these constraints, what can you do today?
You know that considerable investment has already been made in the existing workflow. For some time now, all your publications have been managed and maintained digitally. Contributors to the information you edit and publish have, over time, adopted Word as a generally available tool. Realizing the challenges inherent in manipulating and processing information in an unstructured form, your team has invested heavily in developing tools within the Word environment to add structure during the process and to automate many tasks that were once done by hand. These tools have real value now and cannot be cavalierly discarded.
At the same time you know that soon — perhaps sooner than anyone expects — you will be asked to implement an XML-based workflow.
The first step was to examine the current workflow. A key characteristic of the environment was that long documents, some as long as 1500 pages, were worked on by collaborative teams. This activity needed to be coordinated, but it was hard to track the status of each document and its pieces and to maintain consistency of style among members of the team. Both managers and team members needed to know who had made which changes, when, and what the state of the document was at any given moment.
The first decision taken was to address process efficiency and control of the existing workflow, and to do this in a way that laid the foundation for evolution into the XML world. Given this priority and direction, a team of technical and editorial staff began to look at content management systems that would meet their needs. They evaluated the available applications against the following requirements for a candidate system:
It had to coordinate the workflow of the current processes, but be adaptable to the new XML-based workflow.
It had to integrate well with the Word-based tools already developed and be able to store and manipulate Word files and their associated metadata.
It had to be XML-ready — providing out-of-the-box integration with structured editors.
It had to be object-oriented, able to handle multiple object types and to manage documents as collections of related objects.
The team found a content management system that enabled them to define a hierarchy of content objects within each publication that mapped to the current business process and workflow of the organization. Each publication contains its textual content, associated graphics, list of annotations, and frontmatter. This already familiar way of organizing the content adapted well to the existing writing and update cycle of the editors. They now have the capability to track each publication's status and unify processes across multiple publications.
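As an illustration only, the hierarchy of content objects described above can be sketched as a small tree in which each publication owns its text, graphics, annotations, and front matter. The class name, field names, status values, and sample publication are all assumptions made for this sketch; they are not taken from the system the team selected.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ContentObject:
    """One managed object in a hypothetical repository."""
    name: str
    kind: str              # e.g. "publication", "text", "graphic"
    status: str = "draft"  # hypothetical status values
    children: list[ContentObject] = field(default_factory=list)

    def add(self, child: ContentObject) -> ContentObject:
        """Attach a child object and return it for further nesting."""
        self.children.append(child)
        return child

    def walk(self):
        """Yield this object and every object beneath it."""
        yield self
        for child in self.children:
            yield from child.walk()


def status_report(publication: ContentObject) -> dict[str, int]:
    """Count the objects in a publication by status."""
    counts: dict[str, int] = {}
    for obj in publication.walk():
        counts[obj.status] = counts.get(obj.status, 0) + 1
    return counts


def build_example() -> ContentObject:
    """Build an invented publication with the parts named in the text."""
    pub = ContentObject("Example Looseleaf", "publication")
    pub.add(ContentObject("Front matter", "frontmatter", status="approved"))
    text = pub.add(ContentObject("Text", "text"))
    text.add(ContentObject("Chapter 1", "text", status="in_review"))
    text.add(ContentObject("Chapter 2", "text"))
    pub.add(ContentObject("Figure 1", "graphic", status="approved"))
    pub.add(ContentObject("Annotations", "annotations"))
    return pub
```

Calling `status_report(build_example())` gives a per-status count for the whole publication, which is the kind of tracking across pieces that the text describes.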
The system also enabled them to use more robust "metadata" or intelligent information about each of the content objects and documents in the system. The metadata is used to link each object in the repository with an appropriate application and editing environment, tracking relevant information about the content such as ownership, linked templates, taxonomy, status, or output formats. This means that when an editor checks out an object, he or she is automatically presented with the content and the tools needed for the task at hand. Metadata also enhances the searchability of content for editing and reuse in other documents.
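The following is a minimal sketch of how metadata might drive the check-out behavior described above: the record attached to each object determines which application and template are presented to the editor, and the same record supports searching. The field names, the `EDITING_ENVIRONMENTS` mapping, and the `Repository` class are hypothetical and do not reflect the actual product's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metadata:
    """Hypothetical metadata record attached to one content object."""
    owner: str
    content_format: str  # "word" or "xml" in this sketch
    template: str
    status: str = "checked_in"
    checked_out_by: Optional[str] = None


# Assumed mapping from content format to editing environment.
EDITING_ENVIRONMENTS = {
    "word": "Word with in-house editorial tools",
    "xml": "structured XML editor",
}


class Repository:
    """Toy in-memory repository keyed by object id."""

    def __init__(self) -> None:
        self._metadata: dict[str, Metadata] = {}

    def register(self, object_id: str, metadata: Metadata) -> None:
        self._metadata[object_id] = metadata

    def check_out(self, object_id: str, user: str) -> dict[str, str]:
        """Lock the object and return the tools the editor should get."""
        meta = self._metadata[object_id]
        if meta.status == "checked_out":
            raise RuntimeError(
                f"{object_id} is already checked out by {meta.checked_out_by}"
            )
        meta.status = "checked_out"
        meta.checked_out_by = user
        return {
            "application": EDITING_ENVIRONMENTS[meta.content_format],
            "template": meta.template,
        }

    def find(self, **criteria: str) -> list[str]:
        """Return the ids of objects whose metadata matches every criterion."""
        return sorted(
            object_id
            for object_id, meta in self._metadata.items()
            if all(getattr(meta, key) == value for key, value in criteria.items())
        )
```

In this sketch the editor never chooses a tool directly; the choice follows from the object's metadata, so Word-based and XML-based objects can sit side by side in one repository.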
The system supports the team in managing each publication through the entire editorial and publishing process. For example, the system tracks the editing and revision history of each piece of content. That historical information is available when the document is prepared for delivery. This helps editors to collate updated files for publication.
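One way to picture how revision history helps editors collate updated files is a simple log that can be queried for everything changed since the last issue. This is a sketch under assumed names (`Revision`, `History`, `changed_since`); how the real system stores its history is not described in this case study.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Revision:
    """One recorded change to a content object (hypothetical record)."""
    object_id: str
    editor: str
    timestamp: datetime
    summary: str


class History:
    """Append-only log of revisions across a publication."""

    def __init__(self) -> None:
        self._revisions: list[Revision] = []

    def record(self, object_id: str, editor: str,
               timestamp: datetime, summary: str) -> None:
        self._revisions.append(Revision(object_id, editor, timestamp, summary))

    def changed_since(self, cutoff: datetime) -> list[str]:
        """Return the ids of objects revised after the cutoff, sorted."""
        return sorted(
            {r.object_id for r in self._revisions if r.timestamp > cutoff}
        )
```

An editor preparing an update would pass the date of the previous issue as `cutoff` and receive the list of objects to collate.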
Phase one of the project has been completed. The team is now using the system in production on several publications, and over the next year approximately 200 publications will be managed in it. More publications will follow, and all new publications will be loaded and managed in the system.
Several key benefits became apparent as the system was piloted and moved into production:
User acceptance: much of the familiar tool environment was preserved, and the system's GUI suited users accustomed to a Microsoft desktop environment. Users adapted easily to the new environment.
Preservation of investment: the existing investment in editorial tools has been preserved.
Normalization of processes: all members of the team, using a common workflow, are now "on the same page". This reduces training costs, makes it easier for editors to move between projects, and increases accuracy and consistency.
Ease of maintenance and support: technical and administrative support personnel find it easier to do their jobs because the standardization of process and practices across publications reduces the number of special cases they must handle.
Enhanced capability: the new content management system brings new support for editors that will enable them to focus more on content and accuracy. For example, the ability to search on both metadata and full content will soon allow editors to find things more rapidly and easily than before.
Over the past year, the team has made a serious investment in improved processes and content management. It is now well positioned to address the larger question of migration to an XML-based workflow, and it can do so within the managed environment it now has.
As the team begins to move to XML, the managed environment will continue to work for current publications. New publications can be introduced selectively to an XML-based workflow. Both old Word-based publications and new XML-based publications can be managed in the same environment with the same user interface. Users are already familiar with it, and the ability to track, manage, and report on status will be the same. The system already integrates with best-of-breed structured editing tools such as Corel's XMetaL, ArborText's Epic, and Adobe's FrameMaker.
There is, of course, one big hurdle: moving from the Word-based environment to the XML world. While crossing that boundary is not easy, and there is no single bullet-proof solution to moving from unstructured to structured data, there are two things that will contribute to the success of the next phase of the evolution:
The Word files they have been creating are highly structured.
Tools for moving from Word to XML are becoming more widely available.
Let us consider each of these in turn.
First, as a natural outgrowth of the copy editing, cleanup, and consistency checking that they do, there is much value added in the normal course of their workflow. The existing Word files, once they have been through the editorial process, contain a lot of implicit and explicit structure. This aspect of their workflow is not likely to change: the input they receive from the outside world, for the foreseeable future, will most likely continue to be Word files and frequently highly unstructured ones at that. They have, however, a strong foundation from which to make a conversion into an XML-based world. It's now less a question of whether it can be done and more a question of where in the workflow to do it.
Second, a number of tools have recently appeared on the market whose purpose is to reduce the barrier between Word and the XML world. These include:
WorX from Hypervision: an XML editor plug-in to Word that also enables conversion to XML based on style information
Tagless editor from i4i: another adjunct to Word that enables users to automatically maintain a synchronized XML document
Upcast from Infinity Loop: an RTF to XML converter
XML support in Office 11: at the moment, a future offering in early beta, with a more general beta promised later this year
Each of these technologies provides options concerning how and where and by whom the bridge is crossed from Word to XML in the workflow.
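To make the idea of converting to XML "based on style information" concrete, here is a generic sketch that maps paragraph style names to XML element names. It does not represent how any of the products listed above works. The style names, the element names, and the representation of a Word document as `(style, text)` pairs are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word paragraph styles to XML element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "para",
    "Case Citation": "citation",
}


def paragraphs_to_xml(paragraphs, root_tag="section"):
    """Convert (style, text) pairs into a flat XML fragment.

    Paragraphs whose style has no mapping are kept as <para> elements
    and flagged with an attribute so an editor can review them.
    """
    root = ET.Element(root_tag)
    for style, text in paragraphs:
        tag = STYLE_TO_ELEMENT.get(style)
        if tag is None:
            element = ET.SubElement(root, "para", {"unmapped-style": style})
        else:
            element = ET.SubElement(root, tag)
        element.text = text
    return ET.tostring(root, encoding="unicode")
```

The more consistently styles have been applied during editing, the fewer paragraphs fall into the flagged case, which is why well-structured Word files make a strong starting point for conversion.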
Office 11 offers the hope that more reliable structure will be introduced upstream by the content creators who supply LexisNexis UK with much of the "raw" content that they ultimately publish.
For the foreseeable future, the point at which content moves from unstructured to structured format will be in house, during the work of the content editors who currently add value to the content in the editing and publication cycle. Once they have received the content and have it under their control, they can choose the appropriate point in the workflow to convert it to XML and, from that point on, manage the XML as the canonical structured form. They have already built the infrastructure that enables them to manage their workflow while working on both structured and unstructured content.
Looking two to four years ahead, the team at our legal publisher is strategically poised to continue meeting its ongoing requirements, ensuring uninterrupted quality and a steady revenue stream, while introducing XML into the workflow where and as needed.