Abstract
Content Management is a proven application of XML. The practice benefits from 20 years of collective experience in using SGML in structured document applications. During the last four years, the World Wide Web Consortium (W3C) has released a number of XML-related specifications to address some of the deficiencies of XML 1.0 and to make new functionality available to developers. At the same time, many developers, open source communities, and industry groups have been contributing new ideas to the XML movement. These new ideas, sparked by the convergence of the "document-centric" and "data-centric" views of XML, are reshaping the way we manage enterprise content. They will result in significant cost savings and more robust and scalable solutions.
Business objectives drive the launch of an enterprise XML content management system. These objectives fall under the following perspectives: financial, customer, internal and external business processes, and employee. Financial and non-financial performance measures are created to manage and evaluate the project. It is important to involve all stakeholders to create strategic alignment and determine the common requirements for achieving the business objectives.
From the financial perspective, the measures could be the Return on Investment (ROI), or a broadened revenue mix made possible by the flexibility of XML.
From the customer perspective, the new system could result in an increase in customer satisfaction and retention. For example, a new XML content management system can help a pharmaceutical publisher deliver drug information to physicians through a variety of channels including print, Personal Digital Assistants (PDAs), and e-learning formats. The XML data can also be integrated into hospital systems by taking advantage of the vendor- and platform-neutral characteristics of XML.
From the internal business processes perspective, the measures could be quality, costs, response time, and the rate of new product introduction. XML improves the quality of enterprise content by structuring the content, and by imposing and validating constraints on its structure. XML also enables the quick re-purposing of enterprise content to respond to new demands or the emergence of new delivery platforms.
Today, emerging Enterprise Application Integration (EAI) solutions are focused on service-oriented architectures. Web Services technologies like UDDI (Universal Description, Discovery and Integration), WSDL (Web Services Description Language), and SOAP (Simple Object Access Protocol) are all based on XML. By migrating its content to XML, the enterprise will be prepared for seamless integration with outside partners.
Finally, the new system could result in better employee satisfaction, and increased productivity.
The use of object-oriented modeling techniques with UML (Unified Modeling Language) is emerging as a method for optimizing new XML applications [Carlson]. UML use cases and activity diagrams are created to model system and process requirements. They help visualize and document the interaction among the stakeholders, and the role each of them plays in the process. The initial project implementation plan should list the project deliverables and a timeline.
XML is a new technology to many enterprise developers. Sometimes, migrating to XML is also a paradigm shift for developers. The shift occurs, for example, from relational to hierarchical data structures, or from procedural to functional programming with XSLT. At the beginning of the project, it is important to provide the appropriate level of training for each group of stakeholders. XML skills can make the difference between an XML project's failure and success. The lack of XML training is usually a source of frustration in development teams. The development team needs hands-on training on XML syntax, data modeling, and XML Schema design. In addition, depending on your data processing requirements, developers need to master the essentials of SAX, DOM, and XSLT. The training program should be designed to close specific skills gaps, and it should be delivered just in time. Today, there are multiple delivery options including instructor-led classroom, e-learning, on-the-job mentoring, and a blended approach.
The authoring group is a key stakeholder community. However, at this stage, they only need a basic understanding of structured data principles, and a high-level overview of how XML authoring differs from traditional word processing or data entry applications. Their full participation and input are required for the data modeling process. The members of the authoring group are often subject matter experts (SMEs). Examples include pharmacists in pharmaceutical publishing or aircraft maintenance engineers in aviation publishing.
This step of the project requires a lot of attention and careful analysis. You need to model not only the data, but also the context or business processes that move the data. Data modeling clarifies the meaning, the constraints, and the structural relationships between the information items. The model should satisfy the functional and technical requirements, including media-independent publishing, reuse, and re-purposing. The data model is documented in the form of UML class diagrams, and the business processes as collaboration and sequence diagrams [Carlson]. Stakeholders review the diagrams and their feedback is incorporated in the design.
You can map UML class diagrams to XML Schemas or DTDs. The XML Schema specification provides better facilities for specifying data types and using namespaces. If you plan to expose the functionality of your content management system as Web Services, keep in mind that the WSDL (Web Services Description Language) specification supports the XML Schema specification (XSD) as its preferred canonical type system. In modeling the relationships among information items, you should also take advantage of the facilities provided by new XML-related specifications like XLink, XPointer, XBase, and XInclude.
When evaluating the use of industry standards, the first thing to remember is that achieving your business objectives is the main driver of the project. You should use industry vocabularies if they support those objectives. You can design an enterprise DTD or XML Schema for creating and managing your content internally, and map the data to industry schemas before exchanging the data with outside partners. Industry schemas tend to be very large and complex. Consider creating a subset of the industry schema that satisfies your requirements. Ambiguity and complexity cause confusion, particularly for the authors of the document.
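As one illustration of the XInclude facility, the following minimal sketch assembles a master document from a reusable chunk using Python's standard `xml.etree.ElementInclude` module. The element names (`monograph`, `warning`), the chunk file name, and the in-memory `CHUNKS` store are hypothetical; a real system would resolve the `href` against its own repository.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical chunk store: href -> XML text of a reusable chunk.
CHUNKS = {
    "warning.xml": "<warning>Keep out of reach of children.</warning>",
}

# Hypothetical master document that pulls the chunk in by reference.
MASTER = (
    '<monograph xmlns:xi="http://www.w3.org/2001/XInclude">'
    "<title>Aspirin</title>"
    '<xi:include href="warning.xml"/>'
    "</monograph>"
)


def load_chunk(href, parse, encoding=None):
    """Loader used by ElementInclude: return the parsed chunk for an href."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(CHUNKS[href])


def assemble(master_xml):
    """Expand every xi:include in the master document and serialize the result."""
    root = ET.fromstring(master_xml)
    ElementInclude.include(root, loader=load_chunk)
    return ET.tostring(root, encoding="unicode")
```

Calling `assemble(MASTER)` returns the master document with the `xi:include` element replaced by the `warning` chunk, so the same warning text can be maintained once and referenced from many documents.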
There are a variety of techniques for performing legacy data conversion. You should choose the most effective method, depending on the format of the data and the existing legacy systems. Converting legacy data is traditionally accomplished with an up-translation language like Omnimark or through outsourcing. Up-translation and outsourcing work well if the data is available only in print or in a proprietary word processing format.
If your data currently resides in a relational database management system (RDBMS), it should be possible to easily output the data as XML and transform the XML into the target schema. Database vendors including Microsoft, Oracle, and FileMaker have added XML output capabilities to their respective products.
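Where a database has no built-in XML output, the same result can be approximated in application code. The sketch below is an assumption-laden illustration rather than any vendor's feature: it uses Python's `sqlite3` module as a stand-in RDBMS, a hypothetical `drugs` table, and assumes that table and column names are trusted and are valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(connection, table, row_name="row"):
    """Serialize every row of a table as an element with one child per column."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = connection.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table)
    for values in cursor:
        row = ET.SubElement(root, row_name)
        for column, value in zip(columns, values):
            child = ET.SubElement(row, column)
            child.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample data held in an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE drugs (name TEXT, strength_mg INTEGER)")
    conn.execute("INSERT INTO drugs VALUES ('Aspirin', 100)")
    print(table_to_xml(conn, "drugs"))
```

The generic row-and-column XML produced here would still need to be transformed into the target schema in a second step.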
Some legacy applications output their data as flat files. There are three general types of flat files: delimited, fixed-width and tagged record. Flat files can be mapped to XML using a variety of techniques including the DOM (Document Object Model). This is done by first transforming the content of the flat file into name-value pairs, and then creating and serializing a DOM tree.
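The two steps described above can be sketched as follows for the delimited case, using Python's standard `csv` and `xml.dom.minidom` modules. The root and record element names are hypothetical defaults, and the sketch assumes the first row of the file holds field names that are valid XML element names.

```python
import csv
import io
from xml.dom.minidom import getDOMImplementation


def delimited_to_xml(text, root_name="records", record_name="record", delimiter=","):
    """Convert delimited flat-file text (first row = field names) to an XML string."""
    # Step 1: turn each line of the flat file into name-value pairs.
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)

    # Step 2: build a DOM tree with one element per record and per field.
    document = getDOMImplementation().createDocument(None, root_name, None)
    root = document.documentElement
    for row in reader:
        record = document.createElement(record_name)
        for name, value in row.items():
            field = document.createElement(name)
            field.appendChild(document.createTextNode(value or ""))
            record.appendChild(field)
        root.appendChild(record)

    # Step 3: serialize the DOM tree.
    return document.toxml()
```

Fixed-width and tagged-record files would need a different first step to produce the name-value pairs, but the DOM construction and serialization stay the same.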
Using an XML editor is not the only way to capture XML content. XML editors can significantly increase the cost of the project because of the customization requirements through API programming, and the need to train end users on structured authoring. If the data is text-intensive in nature, an XML editor is a good choice for authoring the content.
Otherwise, you should consider other options, including the use of an HTML form with client-side or server-side scripts that generate XML data based on user input. The W3C is working on a new specification called XForms that will provide powerful interfaces to XML applications. The XML Encryption and XML Signature specifications can be used to add security to the authored content. Another W3C working group is developing guidelines for device independence principles in authoring content.
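As a rough sketch of the server-side option, the function below turns the URL-encoded body of a submitted HTML form into XML. The root element name and the field names in the usage note are hypothetical, and the sketch assumes every form field name is a valid XML element name.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs


def form_to_xml(form_body, root_name="submission"):
    """Turn URL-encoded HTML form data into an XML string."""
    # parse_qs decodes the body into {field name: [values]}.
    fields = parse_qs(form_body, keep_blank_values=True)
    root = ET.Element(root_name)
    for name, values in fields.items():
        for value in values:
            # One child element per submitted value; ElementTree escapes the text.
            ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")
```

For example, `form_to_xml("title=Dosage+update&author=J.+Doe")` yields a `submission` element with `title` and `author` children, which can then be validated against the enterprise schema before it is stored.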
Your Content Management System should support the following minimal requirements:
Chunking
Check-in and check-out
Storage
Versioning
Metadata
Workflow
Security
Reuse.
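To make the check-in, check-out, and versioning requirements concrete, here is a deliberately small in-memory sketch. The `Repository` class and its method names are hypothetical and do not describe any particular product; a real system would add persistent storage, metadata, workflow, and security.

```python
import xml.etree.ElementTree as ET


class Repository:
    """Toy store for XML chunks with exclusive check-out and linear versions."""

    def __init__(self):
        self._versions = {}  # chunk id -> list of XML strings, oldest first
        self._locks = {}     # chunk id -> user currently holding the check-out

    def check_out(self, chunk_id, user):
        """Lock a chunk for one user and return its latest version (or None)."""
        holder = self._locks.get(chunk_id)
        if holder is not None and holder != user:
            raise PermissionError(f"{chunk_id} is checked out by {holder}")
        self._locks[chunk_id] = user
        versions = self._versions.get(chunk_id)
        return versions[-1] if versions else None

    def check_in(self, chunk_id, user, xml_text):
        """Store a new version, release the lock, and return the version number."""
        if self._locks.get(chunk_id) != user:
            raise PermissionError(f"{chunk_id} is not checked out by {user}")
        ET.fromstring(xml_text)  # raises ParseError if the chunk is not well-formed
        self._versions.setdefault(chunk_id, []).append(xml_text)
        del self._locks[chunk_id]
        return len(self._versions[chunk_id])
```

Keeping every checked-in version, rather than overwriting the chunk, is what makes it possible to roll back or to publish an earlier revision.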
The XML content management system can serve as a demilitarized zone where content is aggregated from existing enterprise applications such as Enterprise Resource Planning (ERP) and Customer Relationship Management (CRM) systems before being delivered to customers and business partners. From that perspective, the content management system should support an open portal framework and existing development platforms including Java 2 Enterprise Edition (J2EE) and .NET.
When shopping for a system, try to evaluate the vendor's past record of success, financial strength, and future plans to support a service-oriented architecture. How long the vendor has been in business is an important factor because you will need technical support down the road.
You should plan for re-purposing the content as well. E-learning is a good way to re-purpose the content, and can drive additional revenue. You can generate e-learning content by mapping your data to the Sharable Content Object Reference Model (SCORM) format. SCORM is an XML vocabulary created by the US Department of Defense (DoD) and a group of e-learning vendors to enable interoperability among learning management systems (LMS). For example, if you produce aircraft documentation, the content could be re-purposed to create e-learning courseware for aircraft technicians.
By separating content and presentation, XML allows the same XML data to be published on different devices. The process of media-independent publishing is largely driven by XSLT transformations. XSLT transforms XML data into XML and non-XML data. The transformation can be performed on the client side or the server side, depending on specific design considerations. For print publications, commercial tools that support XSL Formatting Objects (FOs) are becoming more robust.
Today, new applications are being designed with a service-oriented architecture. This new architecture is made possible by an XML-based protocol stack that includes the Simple Object Access Protocol (SOAP) [SOAP], the Web Services Description Language (WSDL) [WSDL], and the Universal Description, Discovery, and Integration (UDDI) [UDDI] specification.
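The idea of one source feeding several output formats can be illustrated without an XSLT processor. The sketch below is not XSLT: Python's standard library does not include one, so two hand-written ElementTree functions stand in for two stylesheets. The `monograph`, `title`, and `para` element names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document, stored once and published twice.
SOURCE = "<monograph><title>Aspirin</title><para>Take with food.</para></monograph>"


def to_html(root):
    """Render the source for a web browser."""
    parts = [f"<h1>{escape(root.findtext('title', ''))}</h1>"]
    parts += [f"<p>{escape(para.text or '')}</p>" for para in root.iter("para")]
    return "\n".join(parts)


def to_text(root):
    """Render the same source as plain text, for example for a small device."""
    title = root.findtext("title", "")
    lines = [title.upper(), "=" * len(title)]
    lines += [para.text or "" for para in root.iter("para")]
    return "\n".join(lines)
```

Both functions read the same parsed tree, for example `ET.fromstring(SOURCE)`, and neither changes the stored content. In an XSLT-based system each function would be replaced by a stylesheet applied to the same source.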
To support this new paradigm, the first step is to break down the content management application into reusable software components that can be exposed and invoked as Web Services. Enterprise content management functions can be exposed as Web Services to drive additional revenue and recoup some of the development costs. Deploying a new XML content management application by aggregating existing third-party Web Services can accelerate development and reduce costs. The following are examples of content management functions that can be exposed or invoked as Web Services:
Content syndication
Data transformation and formatting
Language translation
Distributed content authoring.
The life cycle of a Web Service includes:
The development and testing of the Web Service
The description of the service using WSDL
The publication of the WSDL document to a service registry (UDDI node) or directly to a service requestor
The deployment of a Web Service allowing "find" and "bind" operations to be performed by service requestors.
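Once a service requestor has found and bound to a service, it invokes the service by sending a SOAP message. The following sketch builds such a request with Python's standard ElementTree module. The envelope namespace is the one defined by the final SOAP 1.2 Recommendation, which differs from the namespace in the 2001 Working Draft cited here; the `urn:example:cms` namespace, the operation name, and the parameter names are hypothetical and would normally come from the service's WSDL description.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 Recommendation
CMS_NS = "urn:example:cms"  # hypothetical namespace of the content service

# Ask the serializer to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("env", SOAP_NS)
ET.register_namespace("cms", CMS_NS)


def build_request(operation, parameters):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{CMS_NS}}}{operation}")
    for name, value in parameters.items():
        ET.SubElement(call, f"{{{CMS_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")
```

A data transformation service, for instance, might be called with `build_request("transformDocument", {"documentId": "doc-42", "format": "html"})`; the resulting string would then be posted to the endpoint address given in the WSDL document.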
[Carlson] David Carlson, Modeling XML Applications with UML: Practical e-Business Applications, Addison-Wesley, 2001.
[SOAP] Martin Gudgin, Marc Hadley, Jean-Jacques Moreau, Henrik Frystyk Nielsen, SOAP Version 1.2, W3C Working Draft, 9 July 2001. http://www.w3.org/TR/soap12/