The Impact of XML on the Processes and Efficiencies of the Federal Government

Keywords: Authoring, CALS+, GPO Table Tool, Data Conversion, Content Management, Content Repurposing, Data Interchange, Documentation, E-Government, Electronic Publishing

Mr Edward Schulke
Chief Information Engineer
DataStream Content Solutions, LLC
College Park
Maryland
United States of America
eschulke@dscs.com

Biography

Ed Schulke is the Chief Information Engineer at DSCS and manages a staff of project managers, programmers and data analysts on a variety of data conversion and content management projects. He has developed XML feeds for a number of Federal Government data sources and has led conversion projects for the United States House of Representatives and Senate, Congressional Quarterly and LexisNexis. His resume is available at: Ed's Resume. Ed has been working on the creation of a generic XML markup language that he has named UEML, for Universal Exchange Markup Language. It was developed to provide the ability to create and exchange XML information in a neutral format and is a core component of DataStream's conversion methodology.


Abstract


This paper and the accompanying presentation discuss how XML has changed and improved the legislative and regulatory document creation and management processes for agencies of the federal government. During the presentation, we will briefly describe the evolution of XML adoption in the Legislative Branch agencies. A more in-depth discussion can be found at xml.house.gov.


Table of Contents


1. Legislation in XML
2. Tabular Data Creation Tool
3. GPO Table Tool
4. HOLC Outline Tool - Amendment Instructions
5. Conclusion

1. Legislation in XML

DSCS has worked extensively with the US Senate and House of Representatives on a variety of legislative data projects. The company provided support for the development of XML-based systems for authoring and updating legislative data generated throughout the entire legislative process. DSCS worked with both houses of Congress to fine-tune the Legislative Exchange DTD, developed streamlined processes for generating complex tabular data, and created an XML legacy dataset by converting several sessions of Congress from GPO locator code.

Working in collaboration with the Office of the Clerk of the House, DSCS supported the migration from a DOS-based environment to an XML document creation and publishing system. DSCS will demonstrate applications of this work that have led to increased efficiency and productivity for the users. The move to XML for legislative document creation has not only reduced the costs associated with these processes but, perhaps more importantly, has allowed congressional staff to focus on their real mission of drafting and editing legislation rather than acting as typesetters for an antiquated publishing system.

DSCS worked with the US House and Senate on what many believe is one of the most complex DTDs ever developed (see xml.house.gov/#dtd). The many business rules and requirements stemming from the way legislative data is created and managed, as well as complex legal and regulatory requirements, had to be observed in the design of the data structure and corresponding DTDs. To provide a baseline of properly tagged XML data, the entire proceedings of the 106th and 107th Congresses were converted from a legacy publishing format (GPO locator code, also known as bellcode) to XML. Over 99% of the data was successfully converted to conform to the Legislative Exchange DTD, a testament to DSCS's programmatic conversion methods.

In contrast to traditional methods, DSCS's systematic approach, which employs a layered, iterative process, was able to minimize human intervention. The initial development costs were slightly higher but were quickly amortized over the complete project. The knowledge base created during this project was captured by the system rather than by individual analysts, allowing experience from the initial projects to be institutionalized and brought to bear on subsequent efforts. The knowledge base provided dramatic cost savings on the follow-on conversion of data from the 108th Congress.
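One way to picture the layered, iterative process described above is as an ordered list of conversion rules that is applied pass by pass, with any line that no rule handles reported back so that an analyst can add a rule once and have every later pass reuse it. The Python sketch below is only an illustration under stated assumptions: the `<<H1>>`-style input codes are invented stand-ins rather than real GPO locator codes, the output element names are placeholders rather than Legislative Exchange DTD elements, and `RULES`, `convert` and `conversion_rate` are hypothetical names that do not describe DSCS's actual software.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical knowledge base: ordered (pattern, output element) rules.
# The "<<H1>>" style codes are invented for this sketch; they are NOT
# real GPO locator codes, and the element names are placeholders.
RULES = [
    (re.compile(r"^<<H1>>(.*)$"), "section-header"),
    (re.compile(r"^<<P>>(.*)$"), "text"),
]


def convert(lines, rules):
    """Apply the first matching rule to each line.

    Returns (converted, unconverted): XML strings for lines that a rule
    handled, and the raw lines that still need a rule.
    """
    converted, unconverted = [], []
    for line in lines:
        for pattern, tag in rules:
            match = pattern.match(line)
            if match:
                converted.append(f"<{tag}>{escape(match.group(1))}</{tag}>")
                break
        else:
            # No rule matched: leave the line for an analyst to review.
            unconverted.append(line)
    return converted, unconverted


def conversion_rate(converted, unconverted):
    """Fraction of lines handled automatically in one pass."""
    total = len(converted) + len(unconverted)
    return len(converted) / total if total else 1.0


if __name__ == "__main__":
    legacy = ["<<H1>>Short title", "<<P>>Sample text", "<<T>>Sample row"]

    # First pass: the "<<T>>" line is not covered by any rule yet.
    done, todo = convert(legacy, RULES)
    print(conversion_rate(done, todo), todo)

    # The new rule is stored in the rule list, not in an analyst's head,
    # so the next pass (and the next project) picks it up automatically.
    rules_v2 = RULES + [(re.compile(r"^<<T>>(.*)$"), "table-row")]
    done, todo = convert(legacy, rules_v2)
    print(conversion_rate(done, todo), todo)
```

The design point of the sketch is that the rule list, not the analyst, accumulates the experience: each pass shrinks the set of unconverted lines, and the grown rule list carries over to later datasets.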

DSCS's approach, though unique in the data conversion sector, is a methodology that industry has applied to almost every other facet of the computer world. This type of systematized process has been shown to provide results that are uniform and predictable. DSCS's methodology supports continuous improvement both within and across projects.

2. Tabular Data Creation Tool

The US House moved quickly to use its new XML publishing system and the converted legacy data to generate nearly 90% of the legislation introduced in 2003. The staff, however, was challenged to find an XML solution that would allow them to create tabular data that could be published using GPO's proprietary print engine, MicroComp. DSCS was contracted by the US House to build the tools necessary for creating this tabular data. A multiphase approach was undertaken to accomplish the task. DSCS content experts interviewed several key stakeholders and documented the current process for creating specific tables, such as tax rates and military pay scales. The tool had to be easy to use for those new to XML but also provide advanced features for more experienced users. It also had to create a table that was valid in XML and would successfully convert to GPO locator code. The solution DSCS created is based on templates. It calls a DLL from the XML application to create a table in XML, fix any problems that would result in an incorrectly tagged table in locator code, convert the XML to locator code, and render the result in PDF using MicroComp.

A document must still be converted to locator code so that it can be published by the Government Printing Office. The CALS (OASIS) table model was used for the creation of tables in XML; however, it was enhanced so that additional information used by MicroComp, such as leadering, could be captured. This new table model is referred to as the CALS+ model.
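To make the idea of an extended CALS table concrete, the following Python sketch builds a small table from the standard CALS elements (`table`, `tgroup`, `colspec`, `thead`, `tbody`, `row`, `entry`) and marks selected cells with one extra attribute for leadering. The paper does not list the actual CALS+ extensions, so the `leader` attribute and its `dots` value are assumptions made purely for illustration, as are the function name `build_table` and the sample cell values.

```python
import xml.etree.ElementTree as ET


def build_table(headers, rows, leader_columns=()):
    """Build a CALS-style table as an ElementTree element.

    Cells in the 1-based column numbers listed in leader_columns get a
    HYPOTHETICAL 'leader' attribute, standing in for the kind of extra
    print information a CALS+ style extension might carry.
    """
    table = ET.Element("table")
    tgroup = ET.SubElement(table, "tgroup", cols=str(len(headers)))
    for i in range(1, len(headers) + 1):
        ET.SubElement(tgroup, "colspec", colname=f"col{i}")

    # Header row
    head_row = ET.SubElement(ET.SubElement(tgroup, "thead"), "row")
    for i, text in enumerate(headers, start=1):
        ET.SubElement(head_row, "entry", colname=f"col{i}").text = text

    # Body rows
    tbody = ET.SubElement(tgroup, "tbody")
    for values in rows:
        if len(values) != len(headers):
            raise ValueError("row width does not match header width")
        row = ET.SubElement(tbody, "row")
        for i, text in enumerate(values, start=1):
            entry = ET.SubElement(row, "entry", colname=f"col{i}")
            entry.text = text
            if i in leader_columns:
                entry.set("leader", "dots")  # assumed extension attribute
    return table


if __name__ == "__main__":
    # Placeholder values, not real pay or tax data.
    demo = build_table(
        ["Grade", "Monthly rate"],
        [["Grade A", "100"], ["Grade B", "200"]],
        leader_columns={1},
    )
    print(ET.tostring(demo, encoding="unicode"))
```

Keeping the extension as an attribute on otherwise standard CALS elements means that tools which understand only plain CALS can still read the table structure and ignore the extra print information.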

A GUI was built that allows a user with no knowledge of XML to create a valid table. Advanced features allow the experienced XML user to view tags and create new templates. We will demonstrate the tool as it is currently being used by the Legislative Branch and discuss how it was deployed in phases to accommodate the evolving education of the users. Recently, the House wanted to display bills and resolutions on the Library of Congress web site in a searchable format. Utilizing the power of the XML behind the documents, DSCS created an XSLT stylesheet for the display of the tables; see http://thomas.loc.gov.
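The stylesheet itself is not reproduced in this paper. As a rough illustration of the kind of mapping such a transformation performs, the sketch below walks a CALS-style table and emits an HTML table. It is written in Python rather than XSLT, it ignores CALS features such as column and row spans, and the function name `cals_to_html` and the sample content are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def cals_to_html(cals_table):
    """Map a simple CALS table (single tgroup, no spans) to an HTML table."""
    html = ET.Element("table")
    # CALS thead/tbody map onto HTML thead/tbody; entries become th or td.
    for section, cell_tag in (("thead", "th"), ("tbody", "td")):
        source = cals_table.find(f"./tgroup/{section}")
        if source is None:
            continue
        target = ET.SubElement(html, section)
        for row in source.findall("row"):
            tr = ET.SubElement(target, "tr")
            for entry in row.findall("entry"):
                ET.SubElement(tr, cell_tag).text = entry.text or ""
    return html


# Placeholder table content, not taken from any real bill.
SAMPLE = (
    '<table><tgroup cols="2">'
    "<thead><row><entry>Bracket</entry><entry>Rate</entry></row></thead>"
    "<tbody><row><entry>First</entry><entry>10%</entry></row></tbody>"
    "</tgroup></table>"
)

if __name__ == "__main__":
    result = cals_to_html(ET.fromstring(SAMPLE))
    print(ET.tostring(result, encoding="unicode"))
```

Because the table structure is explicit in the XML, the same source can feed both the print path through locator code and a web display path like this one.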

3. GPO Table Tool

Following the success of the Tabular Data Creation Tool for the Legislative Branch, the Office of Prevention, Pesticides and Toxic Substances of the Environmental Protection Agency recognized the need for a similar tool that regulatory agencies could use to publish documents in GPO's SGML/XML format for the Federal Register. DSCS was contracted to develop an extremely powerful but generic table tool that could be integrated into almost any XML authoring application. The detailed design requirements and a demonstration can be found at our web site, www.dscs.com/govt_services/XML_2004_conference.

One of the most exciting aspects of the DSCS Table Tool is its applicability to multiple source documents, including the Federal Register and other legislative and regulatory content. The tool allows data to be converted back and forth between CALS+, SGML/XML and locator code. The tool has been integrated into a customized XMetaL application and was also used to convert CALS+ tables that exist in an XML version of the United States Code used by the House Office of Legislative Counsel and the Senate Legislative Counsel. A demo of this tool will be available through the links at our web site, www.dscs.com/govt_services/XML_2004_conference, closer to the start of the XML 2004 conference.

4. HOLC Outline Tool - Amendment Instructions

Working with the House Office of Legislative Counsel (HOLC) on a separate project, DSCS documented the manual processes used by HOLC staff to create a document known as an Outline, which details the legislative changes resulting from a proposed bill. Historically, depending on the size of the bill, it could take a couple of people a few days to generate one Outline. DSCS developed software replicating this process, resulting in a solution that creates an Outline, in XML, within seconds. This type of process improvement returns savings to the client year after year. DSCS was conscious of the need to migrate the users from their legacy process to full XML authoring at a non-threatening pace while ensuring uninterrupted workflow from this critical group. The success of the initial HOLC project has led to several follow-on efforts, including development of 'cut and paste' functionality across multiple source documents within the XML environment as well as additional specialized document creation in XML.

5. Conclusion

DSCS has been successful in providing service-based solutions to some of the Federal Government's most difficult data challenges. In this presentation we have demonstrated some of the work that has led to process improvement as well as time and money savings for several government organizations. Thoughtful application of XML data structures and tools, which took into account both the technical and human aspects of a project, has led to significant improvements in process efficiency. DSCS's systematic approach to data structure design and legacy data conversion has provided predictable results and institutionalized the knowledge base. Application of the knowledge base over multiple projects spanning several years has demonstrated reductions in total life-cycle cost.
