Abstract
How much does it cost to convert a document into XML?
Although the answer could be straightforward, it's not. Though content conversion is fundamental to almost every document-centric XML implementation, ask a number of XML experts to define what constitutes "the content conversion process", and you're likely to get a number of different answers. Each will probably segment the tasks differently, into phases like preparation, development, configuration, conversion, post-processing and editing.
If the phases are combined in different ways for different infrastructure requirements and content characteristics, how can the real cost of "conversion" be commodified?
One approach would be to treat the entire process, from source material to ready-to-load XML, as "conversion", classifying no part of the process as "pre-conversion" and no resultant editing task as "post-conversion".
When organizations outsource what they deem to be the "conversion" phase, the XML they receive from their supplier may require varying degrees of QA, retagging, or editing. Though they may consider the conversion phase complete at content delivery, any subsequent work must be included in order to determine the true cost of using that conversion method.
Alternatively, organizations that use in-house scripting technologies may not rely heavily on manual QA, but the resources required to develop and maintain their conversion scripts must be factored into the cost of this approach.
The content lifecycle therefore always includes some kind of human interaction, whether during the preparation phase or just prior to publishing, with varying degrees of effort and skill required.
Extending the definition of "conversion" to include such work highlights human labor as the most expensive part of the overall conversion process. That is not to say that human intervention should be eradicated entirely; it is, for many reasons, generally desirable somewhere along the way.
But a narrow view of "conversion" that only considers the procedure that takes input and creates output, whether manually or machine-assisted, will never reveal the true costs of converting materials. Without taking into account any preparatory steps, verification, or remedial editing work, the comparative costs of different conversion methodologies can be wildly misleading.
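The arithmetic behind this point can be sketched briefly. The Python below is only an illustration: the stage names come from the stage list in this abstract, but every figure is a hypothetical placeholder (cost units per page), the two example methods are invented, and treating the "automation" stage as the narrowly defined conversion step is an assumption made for the example.

```python
# Stage names are taken from the stage list in this abstract.
STAGES = (
    "document preparation",
    "configuration",
    "automation",
    "content editing",
    "post-processing",
    "verification/QA",
)


def end_to_end_cost(costs):
    """Sum the per-page cost of every stage, from source to ready-to-load XML."""
    return sum(costs.get(stage, 0) for stage in STAGES)


def narrow_cost(costs, stage="automation"):
    """Cost of the single stage a narrow view would call "conversion".

    Which stage counts as "conversion" is an assumption of this sketch.
    """
    return costs.get(stage, 0)


# Hypothetical figures, invented for illustration only (cost units per page).
outsourced = {
    "document preparation": 20,
    "automation": 40,        # price quoted for the "conversion" phase
    "content editing": 70,   # retagging and editing after delivery
    "verification/QA": 90,
}
in_house = {
    "document preparation": 20,
    "configuration": 60,
    "automation": 80,        # script development and maintenance, amortized
    "verification/QA": 20,
}

for name, costs in (("outsourced", outsourced), ("in-house scripts", in_house)):
    print(f"{name}: narrow={narrow_cost(costs)} end-to-end={end_to_end_cost(costs)}")
```

With these invented figures, the outsourced method looks cheaper under the narrow view but more expensive end to end, which is the kind of reversal a narrow comparison can hide.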
This presentation will outline a generalized "end-to-end" process for conversion, starting from source input materials, and concluding with ready-to-load XML content, taking into consideration:
-multiple input formats, including paper
-mixed input media, including text, images, tables, document fragments, aggregate source material
-output structure, as whole documents or fragments
-the intended use of the XML output
-ongoing maintenance and publishing of content
The benefits and drawbacks of different conversion methodologies will be discussed in this context, along with the costs associated with each stage of conversion:
-document preparation
-configuration
-automation
-content editing
-post-processing
-verification/QA
leaving attendees with a much stronger understanding of the price they are really paying for XML conversion.