How the US Federal Government is Using XML: One Year Later

Keywords: E-Government, Web Services, Registry, Repository, Data interchange, Enterprise applications, W3C XML Schema, ebXML, UBL, Metadata, Interoperability, Legacy Data Conversion, Business process, Change management, Semantic Web, Core Components, Naming and Design Rules

Kenneth Sall
XML Specialist
SiloSmashers
Vienna
Virginia
United States of America
KSall@SiloSmashers.com

Biography

Kenneth Sall of SiloSmashers, Inc. served as XML consultant on the GSA Integrated Acquisition Environment E-Government Initiative. Ken has been closely tracking XML developments since July 1997 and has developed several XML training courses and numerous presentations for the XML Working Group, NASA and other government agencies. He created the XML section of Web Developers Virtual Library (WDVL.Internet.com). Addison-Wesley published Ken's book, XML Family of Specifications: A Practical Guide in June 2002. His personal Web site contains several useful XML resources including his unique Big Picture of the XML Family of Specifications, an imagemap gateway to all major XML technical specifications which also indicates their maturity and depicts interrelationships. Ken has been an active participant in the Federal CIO Council's XML Working Group (XML.gov), the Semantic (XML Web Services) Interoperability Community of Practice (http://web-services.gov), and other US Federal XML efforts.


Abstract


XML is integral to US Federal Enterprise Architecture, according to OMB and the E-Government Act. This paper provides an updated, high-level view of over a dozen mature XML efforts in the US Federal Government, including three in-depth case studies of significant XML initiatives by the Internal Revenue Service, Global Justice, and the General Services Administration. Topics include legislative motivators for the use of XML, the Federal Enterprise Architecture and the forthcoming Data and Information Reference Model, working groups, communities of practice, three E-Government Initiatives, registries and repositories, thirteen specialized applications, including efforts from NASA, GSA, IRS, the Department of Navy, and the Department of Education. A major portion of one case study concerns a variety of sources of XML guidance at both the federal and international level.


Table of Contents


1. Introduction
2. Legislative Motivators for XML
3. Federal Enterprise Architecture and the DRM
4. Communities of Practice, Working Groups and Federal XML Guidance
     4.1 XML Working Group / XML Community of Practice
     4.2 Web Services Working Group / Semantic Interoperability Community of Practice
     4.3 Department of Navy XML Work Group and the XML Naming and Design Rules
     4.4 NASA XML Working Group
     4.5 Intelligence Community Metadata Working Group
     4.6 U.S. Federal and International XML Guidance Sources
5. E-Government Initiatives
     5.1 Business Gateway: SBA and GSA
     5.2 ETravel Service: GSA
     5.3 Integrated Acquisitions Environment: GSA
6. Registries and Repositories
     6.1 CORE.gov - GSA's Component Organization Registry Environment
     6.2 DoD Metadata Registry and DoD XML Gallery
7. Specialized Applications: Tables of XML Efforts
     7.1 Component Organization and Registration Environment
     7.2 Emerging Technology Life-Cycle Management Process
     7.3 Department of Education's Federal Student Aid XML Framework
     7.4 GSA Integrated Acquisition Environment
     7.5 GSA Project and Program Information Exchange
     7.6 IRS Modernized e-File
     7.7 IRS XML Bulk Load
     7.8 IRS SCORM Learning Registry Solution
     7.9 Global Justice XML Data Model and Global Justice Data Dictionary
     7.10 NASA XML Project and NASA XML Working Group
     7.11 NASA Launch Vehicle Language
     7.12 NASA Materials and Processes Technical Information System II
     7.13 DON Automated Manuals and Interactive Electronic Technical Manuals
8. Case Studies
     8.1 Applying Federal XML Guidance to GSA's Integrated Acquisition Environment
          8.1.1 Background
          8.1.2 High Level Conceptual (To-Be) Architecture
          8.1.3 Standard Transactions Vocabulary and Information Exchanges Development
          8.1.4 Guidance Influencing IAE Vocabulary and XML Schema Development
          8.1.5 IAE Summary XML Guidance
          8.1.6 ISO 11179 and CCTS Terminology
          8.1.7 Data Modeling Process
          8.1.8 Status and Lessons Learned
     8.2 IRS Modernized e-File for Corporate and Exempt Organization Returns
          8.2.1 Background
          8.2.2 Requirements and Releases
          8.2.3 Transmission File Structure, Return Data and Binary Attachments
          8.2.4 XML Schema and XML-Related Rules
          8.2.5 Business Rules and Version Control
          8.2.6 Data Validation and Error Structure
          8.2.7 Sample MeF Code
          8.2.8 IRS MeF-Related References
     8.3 Global Justice XML Data Model and Global Justice Data Dictionary
          8.3.1 Background
          8.3.2 Developer's Workshop
          8.3.3 GJXDM 3.0 Model and Content
          8.3.4 Types and Properties
          8.3.5 SuperType and Inheritance
          8.3.6 Code Lists and Namespaces
          8.3.7 Relationships and Referencing
          8.3.8 Customized Schema Subsets
          8.3.9 Tools for Use with GJXDM
          8.3.10 GJXDM XML Schema Example: Amber Alert
          8.3.11 Guidelines and Lessons Learned
          8.3.12 GJXDM References
9. Summary and Conclusions
Acknowledgements
Bibliography
Footnotes

1. Introduction

The US Federal government has embraced XML as a key technology component of its emerging Federal Enterprise Architecture (FEA). XML and Web services play an important role in the Service Component Reference Model, the Technology Reference Model, and the Data and Information Reference Model of the FEA. XML technology is a central focus of the Emerging Technology group of the Federal CIO Council Architecture and Infrastructure Committee. H.R. 2458, the E-Government Act of 2002, recommends the use of "standards and guidelines for interconnectivity and interoperability"; XML is explicitly called for in the act. The Office of Management and Budget is actively encouraging agencies to incorporate XML into their e-Gov solutions, especially for many of the Presidential e-Gov Initiatives. These Initiatives are being encouraged to develop and deposit their XML Schema in a Federal registry such as CORE.gov or the DoD XML Registry. When new development or re-development is pursued, XML is expected to be considered for use as the default format for highly structured data as well as for semi-structured information. For legacy repositories that do not directly support XML, legacy-to-XML mapping and data transformation are called for to foster interoperability.

This paper provides an update to the XML 2003 paper of a similar title; see How the US Federal Government is Using XML: An Overview of Selected US Federal Agency Efforts, [HUSFGUX]. In reports published in April and August 2002, the US Government Accountability Office (GAO) [1] made a number of recommendations concerning XML, such as to:

NOTE: Since this paper is (almost) exclusively about the US Federal Government's use of XML, any unqualified references to the word "federal" or "government" should be assumed to mean US Federal Government. However, the views expressed in this paper are those of the author, as was the selection of topics, and should not be construed to be official positions of the US Federal Government or any of its agencies. Information was gathered from agency Web sites, public presentations, phone interviews, and solicited emailed input. Agency representatives or contractors who wish to correct or update details about efforts presented in this paper are encouraged to contact the author directly at KSall@SiloSmashers.com with XML 2004 Update in the subject line. Agencies whose XML efforts have not been represented in this paper who wish to be included in subsequent reports should likewise contact the author.

2. Legislative Motivators for XML

In the previous paper, several legislative motivators for using XML in government were identified; they are summarized here.

3. Federal Enterprise Architecture and the DRM

The Federal Enterprise Architecture (FEA) was discussed in the previous paper. Five reference models of the FEA were described:

The first eight months of 2004 did not see any significant (public) change in these reference models, although a new Office of Management and Budget (OMB) Enterprise Architecture Assessment Framework was posted. In terms of new XML documents, the FEA Downloads page contains a Revised Exhibit 300 Schema, Version 2.95 (for FY06). This XML Schema describes and defines the type of content used for the submission of OMB Exhibit 300s for the FY06 budget cycle. It is possible that the XML file FEA Federal Reference Models (BRM,SRM,TRM) Version 1.2: XML Document is also a new version in 2004, although the lack of date stamps for the items on this valuable Downloads page makes it difficult to be certain. According to the description of the file, it is an XML rendition of three of the Reference Models: BRM 2.0, SRM 1.0, and TRM 1.1. However, the XML itself does not refer to the minor version number for the TRM.

At the time of this writing (late August/early September 2004), the DRM had not yet been published because the FEA Program Management Office (FEAPMO) was awaiting final approval from OMB. Draft versions that have been circulated indicated that the DRM may appear as four separate volumes at different points in time. In one draft, Volume 1 was essentially an overview targeted for managers. According to Karen Evans, OMB Administrator, Office of E-Gov & IT and Director of the CIO Council, DRM Volume 1 is expected to be released in mid-September. The Information Exchange Package, to be described in the DRM, should eventually facilitate inter-agency data mapping and exchange. In the opinion of this writer, it is highly likely that XML and Web services will play a major role in whatever data exchange standards emerge. Also likely is the influence of [ISO 11179] in defining data elements in terms of Object Class, Property Term, and Representation Term. However, since the DRM is a reference model, rather than a mandated government-wide data model, its primary purpose will likely be categorization and classification of information in ways that facilitate exchange between agencies. For the Karen Evans interview about the forthcoming DRM, see the Federal Computer Week article, Reference model on deck. See also [FEA Reference Models].
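The ISO 11179 style of naming mentioned above builds a data element name from three parts (object class, property, and representation). The following is a minimal Python sketch of that idea, not code from the DRM or from any agency. The class name, method names, and the example values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementName:
    """A data element name split into the three ISO 11179 style parts."""
    object_class: str         # the thing being described (hypothetical example: "Transportation")
    property_term: str        # the characteristic of interest (hypothetical example: "Base Price")
    representation_term: str  # the form of the value (hypothetical example: "Amount")

    def dictionary_name(self) -> str:
        """Human-readable form, with the three parts separated by periods."""
        return f"{self.object_class}. {self.property_term}. {self.representation_term}"

    def xml_name(self) -> str:
        """UpperCamelCase form with spaces removed, usable as an XML element name."""
        parts = (self.object_class, self.property_term, self.representation_term)
        words = " ".join(parts).split()
        return "".join(word[:1].upper() + word[1:] for word in words)


if __name__ == "__main__":
    name = DataElementName("Transportation", "Base Price", "Amount")
    print(name.dictionary_name())
    print(name.xml_name())
```

Keeping the three parts separate, rather than storing only the concatenated name, makes it possible to group or compare elements by object class or by representation when mapping data between agencies.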

UPDATE: The DRM was released for agency comment on or about October 20, 2004. Agency comments are due December 3rd. The DRM is available from the FEAPMO downloads page or directly from http://feapmo.gov/resources/DRM_Volume_1_Version_1_101404_FINAL.pdf.

4. Communities of Practice, Working Groups and Federal XML Guidance

4.1 XML Working Group / XML Community of Practice

Last year, we discussed the Federal CIO Council's XML Working Group, first chartered in July 2000. At the time of this writing, the charter was nearing expiration in September 2004. Therefore, in August, the co-chairs, Owen Ambur (Department of the Interior) and Lee Ellis (GSA, Office of Governmentwide Policy), submitted a Charter Amendment and Renewal: XML Community of Practice, Discussion Draft to the CIO Council, Architecture and Infrastructure Committee, Emerging Technology Subcommittee. The intent was to establish the XML Community of Practice (xmlCoP), to be chartered by the Emerging Technology Subcommittee (ET S/C) of the Architecture and Infrastructure Committee (AIC) under the authority granted in the CIO Council's charter "to establish standing committees and working groups as necessary to consider items of concern to the Council." At the time of this writing, it seemed likely that the Emerging Technology Subcommittee would grant the re-casting of the working group as a community of practice.

But what exactly is a community of practice (CoP)? According to Colleen O'Hara in her Federal Computer Week article (August 9, 2004), "A community of practice is a place where a group of people, bound together by a common passion and interest, meet to ask questions, respond to others, and exchange ideas and information." An alternate definition is offered by FederalConnections.org; the site quotes Etienne Wenger, co-originator of the term: "A community of practice is a group of people who share an interest in a domain of human endeavor and engage in a process of collective learning that creates bonds between them." Another useful definition of CoP appears in the FederalConnections.org glossary: "A group of individuals with a common working practice who do not, however, constitute a formal work team. Communities of practice generally cut across traditional organizational boundaries and enable individuals to acquire new knowledge otherwise unavailable or at a faster rate." [2]

Regardless of the name of the group or how you define community of practice, this writer is confident the xmlCoP will continue in its four-year tradition of bringing monthly technical presentations of considerable interest to the federal XML community, as well as serving as a key site for locating XML resources, including links to federal XML guidance.

4.2 Web Services Working Group / Semantic Interoperability Community of Practice

Well before the XML Working Group was re-cast as a CoP, the Web Services Working Group of the CIO Council, founded by Brand Niemann (Environmental Protection Agency), had already gone that route. His Web Services site is extremely information rich, containing over 1,200 links to presentations, reports, pilot studies, workshops, special events, and other related resources.

The Semantic Interoperability Community of Practice (SICoP) [Wiki page], formerly the Web Services Working Group, is organized around the purpose of achieving "semantic interoperability" and "semantic data integration" in the government sector. SICoP's goal of enabling semantic interoperability specifically includes the "operationalizing" of these technologies.

According to FederalConnections.org, SICoP is a special interest group that operates under the umbrella of the Knowledge Management Working Group (KMWG) sponsored by the Best Practices Committee of the Federal Chief Information Officers Council (CIOC). It collaborates interdependently with other forums including the XML Working Group and the Open International ONTOLOG Forum.

The SICoP Web site lists a number of Networked Communities of Practice and Dynamic Knowledge Repositories, providing a gateway to overlapping and related efforts such as the Emerging Technology Components Marketplace for eGovernment Web site by Maurice Swinton (SBA), Brand Niemann (EPA), Susan Turnbull (GSA), Tony Stanco (George Washington University), et al., and the Collaboration Expedition Workshops organized and hosted for over three years by Susan Turnbull (GSA), with assistance from Brand Niemann and others. The latter is sponsored by the Emerging Technology Subcommittee, AIC and the National Coordination Office for Information Technology Research & Development; there is also an associated Expedition Workshops Wiki site.

4.3 Department of Navy XML Work Group and the XML Naming and Design Rules

According to an article in Government Computer News, [3] the federal government has based its draft XML Developers Guide on a similar 2002 document from the Department of the Navy (DON) XML Work Group. These two guides were useful, but they left many design and implementation choices up to developers. A new set of far more prescriptive Navy guidelines is due for release sometime in the near future; most likely, it will be called the DON XML Naming and Design Rules (NDR).[4]

Voluntary Consensus Standards from the W3C, ISO, UN/CEFACT, and OASIS UBL were influential in developing the DON NDR. In an attempt to clarify exactly which standards are supported by the DON XML Work Group, the following list of DON-approved documents was posted as an attachment to a message on the XML Working Group listserv (4 August 2004). This list (or one much like it) is expected to appear as an appendix to the DON document. The poster also indicated that OWL, RDF, and DOM Level 3 were under final review by the DON XML Work Group and probably would be added to the list shortly.

W3C Specifications

OASIS Specifications

ISO Standards

UN/CEFACT Standards

In addition, Simple API for XML (SAX) v1.0 & v2.0 is listed in a category of its own.

It is certainly no coincidence that the DON XML Work Group, an OASIS UBL subcommittee on Naming and Design Rules (no longer operational), and the UN/CEFACT Applied Technologies Group (ATG) are each authoring a document that has "Naming and Design Rules" in its title. The UN/CEFACT XML Naming and Design Rules, Draft 1.0 published in August 2004, is the first of these NDRs to be made publicly available. According to the acknowledgements in that document, "The UN/CEFACT - XML Naming and Design Rules were developed in close coordination with other XML standards efforts. In particular, the OASIS Universal Business Language Technical Committee Naming and Design Rules were instrumental in developing this document." Contributions from various other organizations are listed, including the U.S. Department of the Navy, U.S. Environmental Protection Agency, U.S. Federal CIO Council XML Working Group, among others.

NOTE: At the time of this writing, the agenda for the 22 September 2004 XML Working Group (or Community of Practice) meeting indicated that there would be a discussion of the DON XML Naming and Design Rules (NDR) and Related Standards, and tentatively lessons learned in applying the DON NDR, and, perhaps very significantly, establishment of DON NDR as Governmentwide good practices through the CIOC/AIC. Also, a UBL version of Naming and Design Rules is rumored to be in circulation. See XML.gov presentations for 22 September 2004 to follow up on this topic. A Google search for NDR may also prove helpful. There is also an interesting Danish approach presented at XML 2003.

4.4 NASA XML Working Group

The NASA XML Project home page contains a business case and project plan, as well as NASA's motivation for providing guidance concerning the use of XML for its many research and flight centers. The author of the business plan clearly states motivation and recommendation:

NASA is primarily an information-centric agency. The end result of virtually all the activities that NASA performs in support of its missions is information that is shared with the American public, with our university and private industry partners, and with the international community. In order to be most successful, NASA must use management approaches, processes, and technologies that allow us to generate, transmit, and utilize information effectively and efficiently. This business case details why NASA should invest in a set of activities that will advance Agency-wide the appropriate use of a key information handling technology known as the eXtensible Markup Language (XML)....[Recommendation:] NASA should commit to XML as a strategic technology and make the investments recommended in this business case to ensure that the full benefits of XML are realized by the Agency. The Agency's XML efforts should be led by the NASA Office of the CIO to ensure implementation of XML capabilities that are consistent with both the NASA Enterprise Architecture and the Federal Enterprise Architecture. By taking these steps, NASA will improve the interoperability of its information systems and increase information reuse, thereby lowering costs and reducing the time needed to deliver new capabilities to its programs. NASA will also comply with the OMB requirement to align with the Federal Enterprise Architecture and will be directly responsive to the requirements of the President's Management Agenda.

— Bob Benedict, NASA XML Business Case, Version 1.0, June 11, 2003

Although the NASA XML Working Group wasn't formed until 2003, XML was actively used at various NASA centers soon after it became a W3C Recommendation. This author knows from first-hand experience that NASA began XML development at least as early as 1998; we used XML for a Code 588 project called Instrument Remote Control at NASA Goddard Space Flight Center. We developed an early DTD-based language called Instrument Control Markup Language (ICML), also known as Astronomical Instrument Markup Language [Cover pages] (AIML). [5]

NASA XML efforts include, but are not limited to, NASA Image eXchange (NIX), the NASA Portal (MyNASA), Space Research (Code U), HQeD x500 Directory Service, Global Change Master Directory Colon to XML Converter, NASA Launch Vehicle Language and the NASA Materials and Processes Technical Information System II. See especially the NASA XML Project and NASA XML Working Group table in this paper.

4.5 Intelligence Community Metadata Working Group

The U.S. Intelligence Community Metadata Working Group hosts a site that collects information about the metadata standards activities within the Intelligence Community (IC) for the purposes of data interoperability and information discovery.

The IC Metadata Working Group (ICMWG) sponsored a tool investigation effort, executed by the Defense Intelligence Agency (DIA) Requirements, Research & Analysis (RRA) Division. This project is known by several names, such as Retrospective Metadata Tagging (RTM), the IC XML Tools & Technologies Consumer's Report or simply the IC XML Consumer's Report. Goals of the project were to conduct market research in the XML and metadata software tools space and to report findings to the ICMWG.

Among other things, the research team collected information concerning more than 170 XML-capable tools and evaluated 12 software tools using unique, cooperative approaches with software vendors. Tool demonstration videos and benefits videos are available to the intelligence community. The former illustrate actual IC metadata and XML standards solving real problems and are a "powerful testimony of where the commercial market is going regarding support of XML." In the benefits videos, vendors provide their opinions about the state of the market, describe approaches to solving critical problems, discuss the value of metadata and XML, and provide examples of approaches that work and don't work, as well as recommendations for the IC.

NOTE: You must be either a government employee or an IC approved contractor to access nearly all material on the ICMWG site, including the tools report.

4.6 U.S. Federal and International XML Guidance Sources

There is no U.S. government-wide XML policy as of this writing. For the major relevant sources of XML guidance known to this author, see Section 8.1.4. Note that the majority of the guidance documents presented in Table 14 are the result of international collaboration; they are not US-specific.

For those interested in specific guidance from other countries, see the XML.org Focus Area on e-Government (which lists efforts from the United Kingdom, Hong Kong [including Common Schema such as ISO-3166 Country Codes], Germany, New Zealand, and possibly others) and the OASIS e-Government Technical Committee. The work by Denmark including their XML Naming and Design Rules is particularly interesting, as is the extremely well-documented Hong Kong effort.

5. E-Government Initiatives

Last year, we discussed how 24 (or 25) E-Government Initiatives were proposed in late 2001 and early 2002 as a key component of the President's Management Agenda to make the government more results-oriented, efficient and citizen-centered. These initiatives are grouped into 4 portfolios and one cross-cutting security initiative:

In this section, we present updates on three of the initiatives: Integrated Acquisitions Environment and ETravel, in the Internal Efficiency and Effectiveness portfolio, and the Business Gateway, in the Government to Business portfolio.

5.1 Business Gateway: SBA and GSA

Last year, we reported that the Business Gateway Initiative was preparing to focus on E-Forms. The Small Business Administration (SBA), in partnership with the General Services Administration (GSA), plans to create an "E-Forms gateway" for federal forms systems. The initiative has already created a Business Portal (www.business.gov) to provide information and services to businesses, including a link to a consolidated Government-wide forms catalog (www.forms.gov), presently containing over 3,000 forms from 18 agencies and government corporations. Perhaps as many as 4,000 forms are expected to be cataloged by the end of 2004. However, the vast majority of the forms in the current catalog cannot be electronically signed or electronically submitted, and many cannot even be filled out electronically.

A recent Request For Information (RFI) posted on FedBizOpps entitled "Seeking Information About E-Forms Processing Systems And Services" indicated that SBA and GSA were seeking information regarding industry's capability and interest to implement government-to-citizen and government-to-business electronic, online, XML-enabled forms processing solutions on a large scale. The RFI stated:

Our vision is to reduce the amount of effort required to find, fill out and submit forms, to reduce the burden on the nation's businesses by simplifying, and improving the submission of information to Federal Government for programs, and services. [The Business Gateway] will provide businesses and citizens with a one-stop means to find, download, fill, sign (e-authentication), and submit forms electronically to the Government....By providing a cost-effective, customer-centered way to access business-related Government information, services, and forms, the Government will reduce the regulatory paperwork burden on citizens and business owners; reduce agency costs; and increase value by simplifying and unifying data, forms, and streamlining processes and optimizing infrastructure.

Although an RFI does not necessarily indicate the actual requirements that will eventually appear in the Request For Proposal (RFP), the 26 capabilities and scenarios include: cross-platform deployment, error checking, mandatory fields, contextual help, form completion in stages, electronic signature, encryption, saving user profile information on the user's PC, pre-populating fields based on the profile, and much more. It is not surprising, therefore, that the RFI calls for a solution that complies with the emerging XForms standard, the so-called Next Generation of Web Forms from the W3C. See [Biz Gtwy].

5.2 ETravel Service: GSA

In fiscal year 2001, federal agencies processed roughly 2.4 million employee travel vouchers totaling over $9 billion. GSA's eTravel Initiative "[p]rovides a government-wide web-based service that applies world-class travel management practices to consolidate federal travel, minimize cost and produce superior customer satisfaction." The commercially hosted eTravel Service (eTS) offers functionality from travel planning and authorization to reimbursement, and leverages "administrative, financial and information technology best practices to realize significant cost savings and improved employee productivity." Three 10-year contracts worth $450 million combined are expected to cut federal travel management costs by as much as 50 percent. On 23 December 2003, an amendment to the Federal Travel Regulations (FTR) required agencies to complete adoption of eTS by September 2006. There are three Web sites as the result of a multiple vendor award:

The data modeling effort for ETravel [6] began in January 2003 with 40 minimally defined elements in the Federal Travel Regulations (see "Appendix C to Chapter 301: Standard Data Elements for Federal Travel"). As we were writing the RFP, the decision was made to invest a few months in developing a better picture of the data elements currently used across agencies, since to construct a government-wide service, data normalization and harmonization would be necessary. Our focus was on U.S. Federal travel, not on international harmonization, however. The eTravel PMO conducted an informal data call in February 2003 to which nine of ten agencies responded. Elements identified by the various regulations and by agency subject matter experts were grouped into nine major categories such as traveler profile, travel authorization and planning, reservation and ticketing, and travel vouchers and claims. Each category was further divided into groups. For example, the reservation and ticketing category was divided into four groups: reservations itinerary-transportation, transportation cost, reservations-lodging, and reservations-vehicle. Each group was further subdivided into specific data elements. For example, the data elements in the transportation cost group were: transportation base price, transportation taxes, transportation fee, travel agency fee, transportation cancellation charges, total transportation amount, transport payment method indicator, and transport payment identification number.
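The three-level breakdown described above (category, then group, then data element) can be pictured as a nested mapping. The sketch below is only an illustration in Python: the variable and function names are assumptions, and only the group and element names quoted in the text are filled in. Groups whose elements the text does not list are left empty rather than guessed.

```python
from typing import Dict, List, Optional

# One category ("reservation and ticketing") with its four groups.
# Keys are group names; values are the data elements in that group.
RESERVATION_AND_TICKETING: Dict[str, List[str]] = {
    "reservations itinerary-transportation": [],  # elements not listed in the text
    "transportation cost": [
        "transportation base price",
        "transportation taxes",
        "transportation fee",
        "travel agency fee",
        "transportation cancellation charges",
        "total transportation amount",
        "transport payment method indicator",
        "transport payment identification number",
    ],
    "reservations-lodging": [],                   # elements not listed in the text
    "reservations-vehicle": [],                   # elements not listed in the text
}


def find_group(category: Dict[str, List[str]], element: str) -> Optional[str]:
    """Return the name of the group that contains the element, or None."""
    for group, elements in category.items():
        if element in elements:
            return group
    return None


if __name__ == "__main__":
    print(find_group(RESERVATION_AND_TICKETING, "travel agency fee"))
```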

A second data call and workshop drew participation from 17 of the 24 BRM agencies. All data elements were recorded in a spreadsheet with columns for typical data dictionary information such as description, data type, optional or required or dependent, cardinality, and constraints. The complete set of 385 data elements (279 for exchange, 106 for data warehouse) was provided as Government Furnished Information (GFI) to offerors bidding on the RFP in May 2003. This was accompanied by instructions indicating that standard data elements for eTS data exchange should adhere to the names and characteristics identified in the spreadsheet. Data element naming should follow the guidelines of the CIO Council's XML Working Group draft XML Developer's Guide, which strongly recommends the use of either business terms or names that comply with ISO/IEC 11179. Elements were placed into "groups" (e.g., Official Duty Station) and larger functional "categories" (e.g., Traveler Profile), analogous to XML Schema complex types and root schema, respectively. We then worked with agencies to normalize names. Results were published as GFI via an attachment to the eTravel Service RFP on FedBizOpps: "to provide eTravel Service offerors with details concerning the standard data elements identified to date by the Government as needed across agencies for exchange between eTS and agency business systems. The goal is to create a standard data set for input and output that is common across agencies, making it easier and less costly for agencies to integrate with the eTS."
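To make the analogy between a spreadsheet "group" and an XML Schema complex type concrete, here is a minimal Python sketch that turns a few data dictionary rows into a complexType definition. It is not the eTravel GFI or any vendor's schema. The row layout, the element names, the XSD types, and the function name are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)  # serialize schema elements with the "xs" prefix

# Hypothetical data dictionary rows: (element name, XSD type, required, repeatable)
ROWS = [
    ("TransportationBasePriceAmount", "xs:decimal", True, False),
    ("TransportationTaxesAmount", "xs:decimal", False, False),
    ("TravelAgencyFeeAmount", "xs:decimal", False, False),
]


def group_to_complex_type(group_name, rows):
    """Build an xs:complexType element for one group of data elements."""
    complex_type = ET.Element(f"{{{XS}}}complexType", {"name": group_name + "Type"})
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for name, xsd_type, required, repeatable in rows:
        attributes = {"name": name, "type": xsd_type}
        if not required:
            attributes["minOccurs"] = "0"          # optional element
        if repeatable:
            attributes["maxOccurs"] = "unbounded"  # cardinality greater than one
        ET.SubElement(sequence, f"{{{XS}}}element", attributes)
    return complex_type


if __name__ == "__main__":
    complex_type = group_to_complex_type("TransportationCost", ROWS)
    print(ET.tostring(complex_type, encoding="unicode"))
```

In this sketch the "optional or required" and "cardinality" columns map to minOccurs and maxOccurs. A fuller treatment would also carry the description column into xs:annotation and the constraints column into restriction facets.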

Eventually, the three vendors who were awarded the contract delivered XML Schema based on the GFI. An evaluation form was created to assess XML Schema submissions from each vendor against criteria that were based on federal guidance. Roughly 50 criteria were divided into eight categories:

For each of the 50 criteria, each vendor's submission was graded as either pass or fail with an explanation. In one case, a vendor's second schema submission improved from failing half of the criteria to passing all but a few.
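A pass/fail evaluation of this kind is easy to tally per category. The following Python sketch shows one way such results might be recorded and summarized. The category names, criteria, and explanations in the sample data are hypothetical placeholders, not the actual eTravel evaluation criteria.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Result(NamedTuple):
    category: str      # one of the evaluation categories
    criterion: str     # the specific check within that category
    passed: bool       # pass or fail
    explanation: str   # reviewer's note explaining the grade


def summarize(results: List[Result]) -> Dict[str, Counter]:
    """Count passes and failures for each category."""
    summary: Dict[str, Counter] = {}
    for result in results:
        outcome = "pass" if result.passed else "fail"
        summary.setdefault(result.category, Counter())[outcome] += 1
    return summary


def failures(results: List[Result]) -> List[Result]:
    """Return only the failed criteria, which a vendor would need to address."""
    return [result for result in results if not result.passed]


# Hypothetical sample data for illustration only.
SAMPLE = [
    Result("Naming", "Element names use UpperCamelCase", True, "All names conform"),
    Result("Naming", "No abbreviations in names", False, "Found 'Amt' in two names"),
    Result("Documentation", "Every element is annotated", True, "Annotations present"),
]

if __name__ == "__main__":
    for category, counts in summarize(SAMPLE).items():
        print(category, dict(counts))
    for result in failures(SAMPLE):
        print("FAIL:", result.criterion, "-", result.explanation)
```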

At the time of this writing, there was renewed interest in possibly aligning with the travel XML Schemas developed by the non-profit OpenTravel Alliance. See [eTravel Service].

5.3 Integrated Acquisitions Environment: GSA

The Integrated Acquisitions Environment initiative was briefly covered last year. This year, we have provided a more in-depth case study. See the Integrated Acquisition Environment Case Study.

6. Registries and Repositories

6.1 CORE.gov - GSA's Component Organization Registry Environment

The Component Organization Registry Environment (CORE) located at CORE.gov provides a collaborative environment that encourages consistent use and reuse of business processes and FEA components within and across Federal Agencies. In this context, CORE.gov fosters both the program management as well as the technical detail-oriented work of software or component development by establishing a collaborative environment to work in while allowing program managers the ability to select what they wish to share and determine how much collaboration they need. Additionally, CORE.gov facilitates the integration of business patterns and processes across organizational structures while offering a third-level Internet domain to work in at no cost to the agency or program. CORE.gov is the only repository supporting the government-wide scope of the FEA with a cross-agency perspective and approach to the management and reuse of common components amongst all Federal Agencies and their partners.

CORE.gov was established in December 2003. The goals of CORE.gov include:

CORE.gov permits eGov applications to flourish and provide data that is easily shared and reusable. This registry environment also reduces operating costs, and implements and sustains internal business processes that optimize reliable, timely, and quality service. As of August 2004, there were 1,500 registered users of the site, with 670 of these accessing the site at least twice per month. A partial list of the roughly 70 registered components as of August 2004 follows. In some cases, the component name is the same as the agency that registered it. At this time, it does not appear possible to determine how many of these registered components use XML, although it is likely that many do.

Over time and based on the crystallization of evolving requirements, it is anticipated that CORE.gov will be enhanced to better meet the needs of its customers by federating to other registries and adding automated tools that can be shared government-wide, pending availability of funds. GSA's current plans for CORE.gov support are to:

The CORE.gov project team of Annie Barr, Lee Ellis and Marion Royal (GSA Office of Governmentwide Policy) received the CIO Council's first group award at IRMCO in September 2004 for establishing an enterprise architecture component registry to increase the adoption of common business practices.

For contact information, see Component Organization and Registration Environment.

6.2 DoD Metadata Registry and DoD XML Gallery

The DoD Metadata Registry has been operational since May 1999 and had over 3,500 registered users as of August 2004. This is very much a bottom-up approach to registries. The intention is to foster visibility and re-use, rather than imposing mandates for standardization.

The chief architectural components of the current version (4.1 in 4Q2004) are:

Version 5.0 is in development for 2005 to consolidate COTS products, make better use of BEA for run-time services, and to improve reliability by clustering, among other things. New features will include OWL encoded format for the Taxonomy Gallery, granular access rights, end user customization of their view of the portal, web services for retrieval across all information sources, and more.

NASA has been actively registering their resources with the DoD XML Registry. While simple projects were easy to register, more complex projects presented obstacles. For example, multiple complex and interdependent XML Schemas caused problems that are being addressed. However, NASA is committed to working with DISA to enhance the registry.

NOTE: Due to DoD Operations Security changes, a prospective user must first establish a government account or government-sponsored account to access the majority of the features of the Metadata Registry (which includes the XML Registry). After registration, a user may browse, search, substring search, and retrieve data. See [DoD Registry].

7. Specialized Applications: Tables of XML Efforts

This section includes a separate table with descriptions and contact information for each of the 13 XML efforts that were submitted to this writer, based upon an informal (unofficial) call for input announced on the GSA-hosted listserv of the XML Working Group of the CIO Council in July and August 2004. Although contributions were received from a limited number of federal agencies, readers should not assume this reflects the only agencies using XML. Other agencies wishing to announce their efforts in subsequent reports may contact the author, providing information that is analogous to that shown in these tables. Any errors introduced by this writer in re-working submitted information are not the fault of the contributors, listed in the Acknowledgements.

  1. Component Organization and Registration Environment
  2. Emerging Technology Life-Cycle Management Process
  3. Federal Student Aid XML Framework
  4. GSA Integrated Acquisition Environment
  5. GSA Project and Program Information Exchange
  6. IRS Modernized e-File
  7. IRS XML Bulk Load
  8. IRS SCORM Learning Registry Solution
  9. Global Justice XML Data Model and Global Justice Data Dictionary
  10. NASA XML Project and NASA XML Working Group
  11. NASA Launch Vehicle Language
  12. NASA Materials and Processes Technical Information System II
  13. DON Automated Manuals and Interactive Electronic Technical Manuals

7.1 Component Organization and Registration Environment

Component Organization and Registration Environment
Acronym:CORE
URL:https://www.core.gov/
Year Effort Began:2003
Point of Contact:Lee Ellis
Contact E-mail:Lee.Ellis@gsa.gov
Major Participants:GSA, CollabNet, SAIC
Community of Interest:Government Project Managers and Program Managers; US Federal agencies and also state and local agencies.
How to get involved:Follow the Contacting CORE.GOV link
Submitted by:Lee Ellis, GSA

Table 1

Description:

CORE.gov, the Component Organization and Registration Environment, is the government source for business process and technical components. The site provides a collaborative environment that encourages consistent use and reuse of business processes and Federal Enterprise Architecture (FEA) components within and across Federal Agencies. CORE.gov is the place to search for and locate a specific component, or to find components that can be customized to meet unique requirements. Components can also be recommended for inclusion in CORE.gov. See Section 6.1 for more details.

7.2 Emerging Technology Life-Cycle Management Process

Emerging Technology Life-Cycle Management Process
Acronym:ET.gov
URL:http://xml.gov/cop.asp#et
Year Effort Began:2003
Point of Contact:Owen Ambur
Contact E-mail:Owen_Ambur@ios.doi.gov
Major Participants:CIO Council, Emerging Technology Subcommittee, XML Working Group, Industry Advisory Council (IAC), Booz Allen Hamilton
Community of Interest:Any community of practice (CoP) that chooses to form around any emerging technology component for potential use by any government agency.
How to get involved:When the ET.gov site is operational, register proposed ET components and/or subscribe to components proposed by others, thereby identifying yourself as part of the CoP to move the proposed component to higher levels of maturity in the ET process.
Submitted by:Owen Ambur, Co-Chair, XML Working Group

Table 2

Description:

The U.S. federal CIO Council's Emerging Technologies Subcommittee has been charged with developing a process whereby the information technology innovation life-cycle can be managed on a Governmentwide basis. The driving force is the inability of agency chief information officers (CIOs) to respond effectively to myriad vendors and other proponents of technology components, particularly those that are new, innovative, and perhaps untested and unproven in practical application. The expectation is that the process will help to structure such input for better coordinated and more productive response, in support of the eGov initiatives and within the framework of the Federal Enterprise Architecture (FEA). The desired outcome is the well-coordinated acquisition of logically separable technology components for potential Governmentwide usage.

7.3 Department of Education's Federal Student Aid XML Framework

Department of Education's Federal Student Aid XML Framework
Acronym:FSA XML Framework
URL:http://fsaxmlregistry.ed.gov
Year Effort Began:2003
Point of Contact:Holly Hyland
Contact E-mail:holly.hyland@ed.gov
Major Participants:Education's Office of Federal Student Aid (FSA), Postsecondary Electronic Standards Council (PESC). PESC has members across the education community, including schools, servicers, and vendors.
Community of Interest:n/a
How to get involved:Members of the education community can get involved with PESC by contacting them; information is on their website, http://www.pesc.org.
Submitted by:Andrew Smalera, Accenture, working with Federal Student Aid (FSA) U.S. Department of Education

Table 3

Description:

The FSA XML Framework has 2 major areas: the XML Core Component Dictionaries and the XML Registry and Repository. The XML Core Components are reusable pieces of business information that provide a standard definition for key data entities across FSA's enterprise and the Education Standards Community. These standard definitions will enable FSA to better access and compare data between its systems for mapping and analysis. Additionally, by referencing and exchanging data in a common format, the common set of data definitions and XML modeling will help FSA improve data quality and integration services between systems. These definitions will also provide the starting point for defining a common Education Community data dictionary that can be used to help ensure consistent definitions and simplify data exchange between trading partners. The FSA XML Registry and Repository provides a central access point for FSA's XML Core Components, XML Schemas, and supporting documentation. The XML Registry and Repository provides FSA and the Education Community with a common set of enterprise data definitions that can be used to exchange data between FSA and other systems. Users can access the Registry and Repository to search, view, upload, and download the XML Core Component definitions and documentation. Note: The FSA XML Registry and Repository site has a planned public launch date of September 2004. It will become a central access point for XML Core Components, XML Schemas, and supporting documentation for the Education Standards Community.

7.4 GSA Integrated Acquisition Environment

GSA Integrated Acquisition Environment
Acronym:IAE
URL: Visit egov.gsa.gov and click "Integrated Acquisition Environment"
Year Effort Began:2002
Point of Contact:Teresa Sorrenti (Program Manager) or Earl Warrington (Deputy Program Manager)
Contact E-mail:teresa.sorrenti@gsa.gov or earl.warrington@gsa.gov or integrated.Aquisition@gsa.gov
Major Participants:All agencies subject to the Chief Financial Officers Act of 1990 [7]
Community of Interest:Federal acquisition community
How to get involved:Phone Teresa Sorrenti (703-872-8610) or Earl Warrington (703-872-8609)
Submitted by:Lisa Cliff, SiloSmashers

Table 4

Description:

The Integrated Acquisition Environment creates a secure business environment that will facilitate and support cost-effective acquisition of goods and services by agencies, while eliminating inefficiencies in the current acquisition environment. For in-depth coverage, see the Integrated Acquisition Environment Case Study.

7.5 GSA Project and Program Information Exchange

GSA Project and Program Information Exchange
Acronym:xPPRL and OPX (extensible Project and Program Reporting Language and Open Project Exchange file format)
URL:n/a
Year Effort Began:2001
Point of Contact:Stephen R Hagan FAIA, Thomas Graves
Contact E-mail:stephen.hagan@gsa.gov, thomas.graves@gsa.gov
Major Participants:Currently GSA PBS only, but interest expressed by National Institute of Building Sciences (NIBS), International Alliance for Interoperability (IAI), FIATECH, Federal Facilities Council (FFC), and 26 member agencies responsible for capital construction and capital asset management for both DOD and Civilian portions of the Public Sector Construction Users Roundtable, Stanford CIFE.
Community of Interest:The U.S. Design and Construction Industry, owners focused on capital construction and life cycle real asset management, in both the public sector and the federal marketplace.
How to get involved:Email stephen.hagan@gsa.gov
Submitted by:Stephen R Hagan, GSA Public Buildings Service (PBS)

Table 5

Description:

The Public Buildings Service (PBS) Project Information Portal (PIP) has been the enterprise system for communicating PBS capital construction (totaling over $10 billion and nearly 200 projects) to GSA executives, project managers, and customers since 2001. In addition to being an information portal, it is also an integration platform, accessing information from numerous internal sources and integrating with other data and applications. XML data exchange is currently underway between the PIP and the US Army Corps of Engineers (USACE) Construction Engineering Research Laboratory and its application DrChecks. Other efforts for XML interchange are planned, in development, or actively underway. Most important to this XML effort, however, is that the information architecture and schema developed in the PIP user interface and navigation represent what could be considered the default XML for an owner's view of capital construction. This includes the major focus areas of Basics, Scope, Schedule, Financial, Condition, Team Partners, Gallery, Documents and Customers. The navigation also provides rollups of information for program-wide and geography-wide views (in this case GSA regions, but in other cases could be departmental, topical, or other geographic). Consensus building about the major focus areas, portions of this schema, nomenclature, and taxonomy will dramatically improve the opportunity for applications, individuals, organizations, agencies, and even outside team members to communicate and exchange needed, relevant and accurate information. The recently released NIST study on the cost of inadequate interoperability in the US construction industry (estimated at over $13 billion annually) provides further stimulus for this standards development and consensus building effort.

7.6 IRS Modernized e-File

IRS Modernized e-File
Acronym:MeF
URL:http://www.irs.gov/efile/article/0,,id=103797,00.html
Year Effort Began:2002
Point of Contact:Joan Barr
Contact E-mail:joan.barr@irs.gov
Major Participants:IRS, States and commercial partners
Community of Interest:Enterprise wide (corporations, small businesses, tax-exempt organizations, and political organizations)
How to get involved:Send request to Sol Safran, sol.safran@irs.gov
Submitted by:Sol Safran

Table 6

Description:

Modernized e-File (MeF), a scalable new system for processing corporate and exempt organization returns, was developed through a unique partnership with software developers and tax practitioners. The XML format for returns and SOAP transmissions utilize industry standards rather than proprietary formats. MeF provides real-time transaction processing, filing of returns via a secure Internet connection, and the ability to attach third party documents to the return, thereby eliminating all barriers voiced by industry that previously inhibited the growth of electronic filing. For in-depth coverage, see the IRS Modernized e-File Case Study.

7.7 IRS XML Bulk Load

IRS XML Bulk Load
Acronym:n/a
URL:http://www.irs.gov/irb, http://www.irs.gov/instructions, http://www.irs.gov/publications, and http://www.irs.gov/irm
Year Effort Began:2003
Point of Contact:Gary B. Snyder and David W. Heiser
Contact E-mail:Gary.B.Snyder@irs.gov and David.W.Heiser@irs.gov
Major Participants:IRS Electronic Tax Administration, IRS Media and Publications, Accenture, Management System Designs, Inc., Mitre, Inc.
Community of Interest:U.S. taxpayers
How to get involved:n/a
Submitted by:David W. Heiser, Internal Revenue Service Media and Publications

Table 7

Description:

The goals of the Internal Revenue Service (IRS) XML Bulk Load project are to standardize the annual delivery of more than 20,000 pages of static content (authored using a suite of SGML/XML DTDs) to the public IRS website and to reduce the HTML transformation time from several weeks to several hours. The challenge was to significantly increase the delivery of structured content (i.e., tax form instructions, publications, the Internal Revenue Manual, and the weekly Internal Revenue Bulletin) to the public website while simultaneously streamlining processing, standardizing presentation, and implementing Section 508 accessibility compliance. The stakeholders selected DocBookX (DocBook XML DTD) as the standard delivery format to the preproduction server hosted by Accenture. The various content owners use transformation standards (COM or XSLT tools) to transform the documents from their source SGML or XML format to DocBook 4.2. Accenture transforms the DocBook XML to HTML using the standard XSLT stylesheets available for DocBook. Using the tagging information, the vendor generates a comprehensive hyperlinked index to enhance navigation of these somewhat large documents. Government stakeholders perform a quality review of the resulting pages before they are published on the production website.
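The idea of deriving both HTML pages and navigation links from structural tagging can be illustrated in miniature. The production pipeline described above relies on the standard DocBook XSLT stylesheets; the sketch below deliberately substitutes Python's standard-library `xml.etree.ElementTree` for XSLT, and the sample fragment, its `id` values, and the `docbook_to_html` function are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny DocBook-like fragment (invented); real IRS documents are far larger.
DOCBOOK = """
<article>
  <title>Sample Instructions</title>
  <section id="s1"><title>Who Must File</title><para>Text one.</para></section>
  <section id="s2"><title>When To File</title><para>Text two.</para></section>
</article>
"""


def docbook_to_html(source):
    """Convert the fragment to HTML with a hyperlinked list of sections."""
    article = ET.fromstring(source)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = article.findtext("title")

    # Build the navigation list from the section tagging.
    toc = ET.SubElement(body, "ul")
    for section in article.findall("section"):
        item = ET.SubElement(toc, "li")
        link = ET.SubElement(item, "a", href="#" + section.get("id"))
        link.text = section.findtext("title")

    # Emit each section as a heading followed by its paragraphs.
    for section in article.findall("section"):
        heading = ET.SubElement(body, "h2", id=section.get("id"))
        heading.text = section.findtext("title")
        for para in section.findall("para"):
            ET.SubElement(body, "p").text = para.text

    return ET.tostring(html, encoding="unicode")


page = docbook_to_html(DOCBOOK)
```

Because the links are generated from the same tags that produce the headings, navigation stays in step with the content whenever a document is re-transformed.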

7.8 IRS SCORM Learning Registry Solution

IRS SCORM Learning Registry Solution
Acronym:n/a
URL:http://www.digitalconcepts.com/case_studies.html and http://www.digitalconcepts.com/metasoft_irs/
Year Effort Began:2002
Point of Contact:Claude Mathews
Contact E-mail:claude.a.mathews@irs.gov
Major Participants:Digital Concepts, Inc, IRS Web Services, IRS Human Capital Office, Advanced Distributed Learning (ADL) Co-Lab
Community of Interest:IRS E-Learning and IRS XML Communities, some response from Carnegie Mellon and other learning institutions. Major involvement from the ADL Co-Lab.
How to get involved:Call Claude Mathews at 512-460-7116, visit the Web site, or find the component on https://www.CORE.gov
Submitted by:Claude Mathews, IRS

Table 8

Description:

The IRS Sharable Content Object Reference Model (SCORM) Learning Registry Solution involves meta-data tagging of e-learning content and other knowledge resources to maximize their re-use. Metasoft uses an XML Schema to organize the information, making it highly searchable and discoverable. Objects are located and re-used using an XML manifest imported into the Enterprise Learning Management System. All content is controlled by an overriding SCORM schema. Efforts are currently underway to extend the model to a general data model to organize IRS knowledge content.
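Locating reusable objects through a manifest can be sketched as a keyword lookup over tagged resources. This is a simplified illustration, not the IRS or Metasoft implementation: the manifest below is namespace-free and invented, and the placement of `keyword` elements directly under `resource` is an assumption made to keep the example short. Real SCORM manifests declare XML namespaces and carry much richer metadata.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical manifest for illustration only.
MANIFEST = """
<manifest identifier="course-001">
  <resources>
    <resource identifier="res-1" href="intro/index.html">
      <keyword>ethics</keyword>
    </resource>
    <resource identifier="res-2" href="filing/index.html">
      <keyword>e-file</keyword>
      <keyword>ethics</keyword>
    </resource>
  </resources>
</manifest>
"""


def find_by_keyword(manifest_xml, keyword):
    """Return the href of every resource tagged with the given keyword."""
    root = ET.fromstring(manifest_xml)
    return [
        res.get("href")
        for res in root.iter("resource")
        if keyword in [k.text for k in res.findall("keyword")]
    ]


ethics = find_by_keyword(MANIFEST, "ethics")
efile = find_by_keyword(MANIFEST, "e-file")
```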

7.9 Global Justice XML Data Model and Global Justice Data Dictionary

Global Justice XML Data Model and Global Justice Data Dictionary
Acronym:GJXDM and GJXDD
URL:http://www.it.ojp.gov/topic.jsp?topic_id=43
Year Effort Began:2001
Point of Contact:John S. Morgan; J. Patrick McCreary; Paul Embley
Contact E-mail:john.morgan@usdoj.gov; james.p.mccreary@usdoj.gov; pembley@mstar.net
Major Participants:Joint Task Force on Rap Sheet Standardization, the Regional Information Sharing System (RISS), OASIS LegalXML, American Association of Motor Vehicle Administrators (AAMVA), SEARCH, the National Consortium for Justice Information and Statistics, National Conference of State Legislatures, Federal Bureau of Investigation (FBI), Criminal Justice Information Services (CJIS) Division, Integrated Justice Information Systems Working Group (representing 120 companies), Global Justice XML Structure Task Force, National Center for State Courts (NCSC); Law Enforcement Information Technology Standards Council (LEITSC) representing the International Association of Chiefs of Police (IACP), the National Sheriffs' Association (NSA), the Police Executive Research Forum (PERF) and the National Organization of Black Law Enforcement Executives (NOBLE); Corrections Technology Association (CTA)
Community of Interest:Local, state, federal and international justice practitioners seeking to implement Web services for secure information sharing
How to get involved:Visit the site and/or join the listserv
Submitted by:Patrick McCreary, Office of Justice Programs, U.S. Department of Justice

Table 9

Description:

What began in March 2001 as a reconciliation of data definitions evolved into a broad three-year endeavor to develop an XML-based framework that would enable the entire justice and public safety community to effectively share information at all levels - laying the foundation for local, state, and national justice interoperability. Developed by the Global Justice Information Sharing Advisory Committee and OJP, the GJXDM is an object-oriented data model composed of a well-defined vocabulary of approximately 2,500 stable data objects, or reusable components, that facilitate the exchange and reuse of information from multiple sources and multiple applications. From the first prerelease version in April 2003 to the current operational release in January 2004, the GJXDM has undergone an intensive review and validation process that included an open public comment period, pilot validation projects, an online feedback and error-reporting mechanism, and a listserv for sharing expertise and support. As a result, the current operational release incorporates more than 100 modifications. Today, more than 50 law enforcement and justice-related projects have been implemented utilizing the GJXDM, further demonstrating its flexibility and stability. For in-depth coverage, see the Global Justice XML Data Model Case Study.

7.10 NASA XML Project and NASA XML Working Group

NASA XML Project and NASA XML Working Group
Acronym:n/a
URL:http://xml.nasa.gov
Year Effort Began:2003
Point of Contact:Bob Benedict
Contact E-mail:rbenedic@hq.nasa.gov
Major Participants:All NASA centers and JPL.
Community of Interest:Includes managers, developers and users of mission, engineering, and administrative applications.
How to get involved:Contact Bob Benedict. Membership in the NASA XML Working Group is open to NASA employees and contractors.
Submitted by:Bruce Altner, SAIC

Table 10

Description:

The major goal of the NASA XML Project is to advance the productive use of XML technology within NASA. The scope of the Project includes all aspects of XML usage within NASA, including planning/policy, standards, security, and outreach activities. Specific goals are to:

See also Section 4.4 and Section 6.2.

7.11 NASA Launch Vehicle Language

NASA Launch Vehicle Language
Acronym:LVL
URL:http://xml.nasa.gov/xmlwg/presentations/2003/XMLinAEE_XMLWorkingGroup.ppt
Year Effort Began:2001
Point of Contact:Jeremy Vander Kam
Contact E-mail:Jeremy.C.VanderKam@nasa.gov
Major Participants:NASA, Air Force Research Laboratory (AFRL)
Community of Interest:Engineering analysis communities and engineering management communities.
How to get involved:The LVL is in the process of registration with the DISA.
Submitted by:Jeremy Vander Kam

Table 11

Description:

The LVL consists of a set of XML Schemas that define data structures for describing launch vehicle hardware, missions and lifecycle data. Instance documents capture software analysis tool inputs and outputs. Disciplinary analysis tools progressively populate an LVL instance document as an engineering assessment proceeds. Project goals include providing a common set of terms for engineering data, providing a common format for data sharing amongst software tools, and enabling automated data exploration, reporting and transfer. In addition to XML Schema, XSLT Stylesheets, Apache Xerces and Xalan Java modules are used. From fall of 2001 to the present, the major project milestone has been the use of LVL as the primary data transfer mechanism in the NASA Space Launch Initiative (SLI) and Next Generation Launch Technology (NGLT) engineering assessment activities.
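The notion of several analysis tools progressively populating one shared instance document can be sketched as follows. The element and attribute names (`LaunchVehicle`, `Analysis`, `Output`), the `run_tool` helper, and the numeric values are hypothetical and are not taken from the actual LVL schemas; the sketch also uses Python's standard library rather than the Xerces and Xalan Java modules mentioned above.

```python
import xml.etree.ElementTree as ET


def run_tool(instance, discipline, outputs):
    """Append one analysis tool's outputs to the shared instance document."""
    analysis = ET.SubElement(instance, "Analysis", discipline=discipline)
    for name, value in outputs.items():
        ET.SubElement(analysis, "Output", name=name).text = str(value)
    return instance


# Start with a nearly empty vehicle description (hypothetical element names).
vehicle = ET.Element("LaunchVehicle", name="ExampleVehicle")

# Each discipline's tool adds its results in turn (invented values).
run_tool(vehicle, "trajectory", {"burnout_velocity_mps": 7800})
run_tool(vehicle, "weights", {"dry_mass_kg": 25000})

# A later tool, or a report, can read what earlier tools wrote.
disciplines = [a.get("discipline") for a in vehicle.findall("Analysis")]
dry_mass = vehicle.findtext("Analysis[@discipline='weights']/Output")
```

Keeping every tool's results in one document is what allows later tools and reports to read earlier results without tool-to-tool converters.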

7.12 NASA Materials and Processes Technical Information System II

NASA Materials and Processes Technical Information System II
Acronym:MAPTIS-II
URL:http://maptis.nasa.gov
Year Effort Began:2002
Point of Contact:Richard T. Wegrich
Contact E-mail:richard.t.wegrich@nasa.gov
Major Participants:NASA Marshall Space Flight Center (MSFC): ED35, Morgan Research Corporation, Centor Software Corporation, Granta Design, LTD.
Community of Interest:Materials Engineers; Flight hardware designers
How to get involved:n/a
Submitted by:Peter Allison, Materials, Process and Manufacturing Department: Project Engineering Group (ED35)

Table 12

Description:

MAPTIS is a materials information database, originally implemented as an Oracle database hosted on a VAX that was not Web accessible. The goal of the new system, therefore, is to provide a Web accessible, single repository for Materials Properties and Test Information via a COTS database engine (to facilitate migration). Technologies used on MAPTIS include a database stored in XML format, XSLT, HTML, Javascript, and ASP. XSLT stylesheets are used to render the XML data into HTML with Javascript targeted for Web browsers. The first major milestone was the release of the initial Production System on October 1, 2003. It is believed to be the first XML/XSLT based system deployed at MSFC.

The database engine is a COTS package called Materials X-Sight that includes a DTD. Data is stored in the database as XML conforming to the DTD. Interaction with the database is through the X-Sight server which is sent XML-based queries together with a reference to an XSLT stylesheet. The database server applies the stylesheet to the XML data resulting in HTML that is delivered to the Web browsers. XSLT stylesheets are used for data maintenance as well as for display.

7.13 DON Automated Manuals and Interactive Electronic Technical Manuals

DON Automated Manuals and Interactive Electronic Technical Manuals
Acronym:IETM
URL:http://www.dt.navy.mil/tot-shi-sys/tec-inf-sys/etm/ and http://acc.dau.mil/simplify/ev.php?ID=38276_201&ID2=DO_TOPIC and possibly http://navysgml.dt.navy.mil/ietm/ietm.html and http://navycals.dt.navy.mil/mid/ietms.html
Year Effort Began:2003
Point of Contact:Piper Conrad
Contact E-mail:piperc@wbworldwide.com
Major Participants:Dept. of Navy, Software AG, X-systems
Community of Interest:n/a
How to get involved:Contact Piper Conrad
Submitted by:Piper Conrad, Software AG

Table 13

Description:

Naval Air Systems Command (NAVAIR) recently finished a pilot program that XML-enabled thousands of pages of operation manuals that previously could not be shared while at sea. The Fleet Combat Systems Operational Sequencing System (CSOSS) Development and Implementation Team (FCDIT) needed a way to allow ships to share real-time information on "best practices" while supporting the considerable variations even among ships of the identical type and class. XML supports single source publishing, which facilitates content management by keeping all versions of a document in one file. Editors modify a single file to implement updates rather than individually updating scores of files. For a library of 100 technical manuals each with 5 versions and 10 changes per year, the Navy has realized savings totaling $370,000 per year in editing staff costs. Single Source Content Infrastructure for CSOSS is expected to allow a 30% reduction in product delivery effort and equates to a 25% increase in industrial capacity ($1.4M capital value per year for FCDIT alone). The increased industrial capacity will allow a product to be delivered earlier and with greater quality.
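A rough sense of why single sourcing reduces editing workload can be had from the library figures quoted above. The calculation below is an illustrative assumption, not the Navy's own accounting: it supposes that, without single sourcing, every change must be applied once per version, and that with single sourcing it is applied once per manual. It makes no attempt to reproduce the reported dollar savings.

```python
# Figures taken from the text.
manuals = 100
versions_per_manual = 5
changes_per_year = 10

# Assumption: without single sourcing, each change is repeated in every version.
edits_separate_files = manuals * versions_per_manual * changes_per_year

# Assumption: with single sourcing, each change is made once per manual.
edits_single_source = manuals * changes_per_year

edits_avoided = edits_separate_files - edits_single_source
reduction = edits_avoided / edits_separate_files

print(edits_separate_files, edits_single_source, edits_avoided)
```

Under these assumptions the number of individual edits falls from 5,000 to 1,000 per year, a reduction of four fifths.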

As a result of the success of the program detailed above, Naval Sea Systems Command (NAVSEA) began looking at XML as a way to more effectively send maintenance and operations information to ships directly from vendors. The result is Interactive Electronic Technical Manuals (IETMs), which contain updated information obtained directly from the vendors of the equipment. When the IETM CD is inserted into any computer, it displays the necessary information and once removed, there is no footprint left on the machine; no one else will know that information was accessed at that terminal. The IETMs are currently focusing on radar, weapons, and propulsion systems. Each CD is created specifically for a particular vessel. CDs will be shipped to approximately 400 vessels (every vessel in the supply ship fleet) and will arrive within three to five days of being burned.

See also the IETM presentation on XML.Gov and Section 4.3.

8. Case Studies

This section delves into considerable detail concerning three dramatically different US government applications of XML technology from the IRS, Global Justice Information Network, and GSA. While their only explicit commonality is the use of XML Schema, all three efforts are characterized by rigorous design and development methodologies and the goal of easily exchanging data across organizational boundaries.

8.1 Applying Federal XML Guidance to GSA's Integrated Acquisition Environment

8.1.1 Background

The US Federal government spent roughly $305.5 billion on the acquisition of goods and services in 2003, making it the largest purchaser in the world. There are over 338,000 vendors registered in the Central Contractor Registration (CCR) system, a system that all companies large or small must register with to do business with the US government. However, various aspects of procurement are distributed throughout many of the federal agencies and some functionality is duplicated across multiple agencies. While the Federal Acquisition Regulations (FAR) govern overall acquisition policies for all Agencies, each agency has implemented the FAR through its own set of systems and processes, often resulting in significant duplication.

The vision of the GSA-managed, President's Management Agenda E-Gov Initiative, Integrated Acquisition Environment (IAE), therefore, is to create a secure business environment that will facilitate and support cost-effective acquisition of goods and services by agencies, while eliminating inefficiencies in the current acquisition environment. [8] Specific goals of the initiative are to:

In order to support agency mission performance, IAE program objectives include the following:

Associated web sites include (but are not limited to):

See [IAE] for links to the main IAE initiative site as well as to the OMB IAE page.

8.1.2 High Level Conceptual (To-Be) Architecture

At a high level, IAE consists of four major business areas, depicted along the right side of IAE High Level Conceptual Architecture. Collectively, these business areas encompass roughly 20 fully functional, under-development, or planned systems, controlled by a number of federal agencies (DoD, NASA, GSA, Health and Human Services (HHS), Small Business Administration (SBA), etc.). These applications are considered IAE shared systems because one agency operates the application for the entire federal government. Agency back-office applications, vendor systems, and government or citizen users will eventually interface to the services provided by these business areas by means of the IAE Portal. Currently each agency maintains interfaces with many of the shared systems. One of the main tenets of IAE is that over time the Portal will provide a set of standard interfaces to which agencies can connect. This will make the government's acquisition information systems much more efficient and cost effective. In addition to authentication and authorization, the portal will be responsible for mapping requests from users, vendors and agency applications to formats that a specific shared system understands, routing the request to the appropriate system or systems, coalescing the results that are returned by the system, and finally displaying the merged results to the user (or, if the requester is an application, it will transmit the data to the application).

IAE-high-level-arch.jpg

Conceptually, IAE consists of 4 indicated Business Areas and a Portal which includes Standard Transactions functionality.

Figure 1: IAE High Level Conceptual Architecture

For example, if a contracting officer needs to determine whether a company responding to an RFP has a favorable track record with previous government contracts for similar goods or services, he presently needs to access several different procurement systems with different interfaces and accounts, such as Past Performance Information Retrieval System (PPIRS), Excluded Parties Listing System (EPLS), etc. Once the IAE Portal integrates the numerous procurement-related systems, a simplified, more efficient interface becomes possible. IAE will provide access for Internet-based software solutions, acquisition capabilities and value-added services required to support the entire acquisition lifecycle in a unified and fully integrated manner. Integration Services available through the IAE Portal will provide both public services that are open to everyone (e.g., retrieving publicly available information about current acquisition awards), and secure services whose use is restricted to authorized users/systems (e.g., posting technical data, drawings, and specifications.) These Integration Services will also support both modern near-real time integration and more traditional periodic batch integration for systems that cannot support near-real time.
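The Portal's responsibilities of mapping, routing, and coalescing can be sketched with the contracting officer example. This is a conceptual illustration under stated assumptions, not the IAE design: the `Request`, adapter, and `Portal` classes are hypothetical, the field names and returned data are invented, and authentication and authorization are omitted entirely.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """A portal-level query expressed in a (hypothetical) common vocabulary."""
    vendor_id: str


class PastPerformanceAdapter:
    """Stand-in for a shared system such as PPIRS; the data is invented."""
    name = "PPIRS"

    def query(self, request):
        # Map the common request into this system's own (made-up) format.
        native = {"contractor": request.vendor_id}
        return {"source": self.name, "rating": "satisfactory", **native}


class ExcludedPartiesAdapter:
    """Stand-in for a shared system such as EPLS; the data is invented."""
    name = "EPLS"

    def query(self, request):
        native = {"party": request.vendor_id}
        return {"source": self.name, "excluded": False, **native}


class Portal:
    """Routes one request to every registered system and merges the replies."""

    def __init__(self, adapters):
        self.adapters = list(adapters)

    def handle(self, request):
        merged = {"vendor_id": request.vendor_id, "results": {}}
        for adapter in self.adapters:
            merged["results"][adapter.name] = adapter.query(request)
        return merged


portal = Portal([PastPerformanceAdapter(), ExcludedPartiesAdapter()])
report = portal.handle(Request(vendor_id="VENDOR-001"))
```

With one adapter per shared system, the requester issues a single query in the common vocabulary instead of holding a separate account and interface for each system.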

8.1.3 Standard Transactions Vocabulary and Information Exchanges Development

One of the major precursors for the IAE Portal is the development of a common acquisition vocabulary that agencies, vendors, and IAE shared systems can use and understand. This standard vocabulary will facilitate the mapping to the various shared systems within the IAE environment. The "common language" provided by an IAE module called Standard Transactions is a key enabler for achieving this needed interoperability. Standard Transactions addresses both the kind of information that has to be exchanged among systems (i.e., identifying key standard information exchanges), and what the information means (i.e., defining standard data element semantics and constraints) so that shared systems can communicate more easily with each other, as well as with agency back-office systems and vendors.

The Standard Transactions Module takes a process-centric approach to identifying the information exchanges and data elements that support government acquisition. Our work conforms to current industry and government guidelines [see Section 8.1.4]. Each significant step in developing the acquisition vocabulary and information exchanges included an opportunity for all agencies and vendors that provide COTS systems to review and provide feedback. Once published, the standard vocabulary and information exchanges will facilitate the integration of current and future shared system development and maintenance. Data exchange will be facilitated both within and between agencies, and from procurement systems to back-office agency and vendor systems.

Initial meetings with acquisition subject matter experts from many agencies in late 2002 and early 2003 led to the identification of 16 Acquisition Scenarios, which describe various use cases that are important to the procurement community. [9] A so-called Backbone Scenario, which represents the most complicated type of government acquisition, was synthesized from these 16 scenarios to serve as the foundation for data modeling. Each scenario is defined from the perspective of the government acquisition professional, in terms of the process steps that professional goes through to conduct an acquisition. A given step may have zero or many associated information exchanges. In October 2003, the IAE Program Management Office (PMO) published an initial vocabulary based upon the interactions among five key systems identified in the Backbone Scenario.
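The relationship between scenarios, steps, and information exchanges can be pictured as a simple data structure. The following Python sketch is only an illustration of "a step may have zero or many exchanges"; the class names, step descriptions, and system names are hypothetical and are not taken from the published scenarios.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InformationExchange:
    sender: str     # system that sends the information
    receiver: str   # system that receives it
    payload: str    # what is exchanged, described informally


@dataclass
class Step:
    description: str
    # A step may have zero or many associated information exchanges.
    exchanges: List[InformationExchange] = field(default_factory=list)


@dataclass
class Scenario:
    name: str
    steps: List[Step]

    def exchange_count(self) -> int:
        """Total number of information exchanges across all steps."""
        return sum(len(step.exchanges) for step in self.steps)


# Hypothetical two-step scenario, for illustration only.
scenario = Scenario(
    name="Example scenario",
    steps=[
        Step("Publish solicitation",
             [InformationExchange("Agency system", "Shared system", "solicitation notice")]),
        Step("Evaluate offers"),  # a step with no exchanges
    ],
)
```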

Before discussing the data modeling process in more detail, the next several sections cover guidance sources that have influenced IAE Standard Transactions and that are also highly relevant to many US federal and international XML development efforts. Earlier in the modeling effort, the IAE PMO reached out to several individuals who were contributors to early versions of the DRM so that the initiative would be compliant with then-unpublished Federal Enterprise Architecture guidance. We believe that much of this XML-related guidance will also appear in the FEA DRM in due course; see Chapter 3.

8.1.4 Guidance Influencing IAE Vocabulary and XML Schema Development

A primary goal of IAE is to maximize interoperability with other agencies, other federal lines-of-business (financial, etc.), as well as with external emerging standards and vocabularies within industry, state and federal government, and on the international level. To achieve this goal, the IAE Standard Transactions effort is influenced by a number of sources of guidance, as detailed in this section. At the same time, because our initiative is an early adopter of many of these emerging standards, it also contributes its element definitions to the organizations that are responsible for defining the standards.

Based on the general approach outlined in the draft FEA DRM (from 2003) and refined in the cited specifications themselves, the IAE PMO has identified the sources of guidance shown in Table 14 that have influenced or will influence our Standard Transactions data modeling efforts.

Sources of Guidance for IAE XML Development and Data Modeling

Object Management Group
    - Unified Modeling Language (UML) for business modeling and data modeling

ISO Standards
    - ISO/IEC 11179, Part 5 (Naming and Identification Principles for Data Elements) [ISO 11179]
    - ISO 15000-5 (see CCTS below)

UN/CEFACT Specifications and Library
    - Core Components Technical Specification - Part 8 of the ebXML Framework, Version 2.01 [CCTS]
    - Trade and Business Process Group (TBG): TBG-17: Harmonisation, developing Core Components Library (aka Core Components Catalog)
    - Applied Technology Group (ATG): UN/CEFACT XML Naming and Design Rules [UN/CEFACT XML NDR]

OASIS Specifications
    - UBL Naming and Design Rules [unpublished as of this writing, but see [UBL NDR]]
    - UBL 1.0 [UBL] [to some to-be-determined degree]

W3C Specifications
    - XML Schema 1.0
    - Other XML Recommendations as listed in Section 4.3 and the Big Picture of the XML Family of Specifications

US Federal Guidelines
    - FEA Reference Models, especially the DRM; see Chapter 3
    - Draft Federal XML Developer's Guide, CIO Council XML Working Group [XML Developers Guide]
    - Dept. of Navy XML Developers Guide 1.1 and/or DON XML Naming and Design Rules; see Section 4.3
    - Environmental Protection Agency XML Design Rules [EPA Design Rules]
    - Recommended XML Namespace for Government Organizations [Namespace Rec for Gov]. Note: IAE has not yet embraced this recommendation.

Notes: ISO is the International Organization for Standardization. UN/CEFACT is the United Nations Centre for Trade Facilitation and Electronic Business. OASIS is the Organization for the Advancement of Structured Information Standards.

Table 14

With the exception of UML, the various sources of guidance for IAE Standard Transactions and XML Schema development are depicted in the figure, Guidance Influencing IAE. Circles represent organizations that develop data or XML standards, with smaller shaded circles representing committees of the larger body to which they are connected by an arrow. Rectangles represent work products such as specifications or vocabulary. The direction of the arrow indicates how the label should be read, with the arrow extending from the subject to the object of the sentence, and the label serving as a verb. For example, the arrow from "IAE Standard Transactions Vocabulary" to "Core Components (CCTS)" with the label "Is Guided by" means that the Standard Transactions Vocabulary is guided by CCTS. (If you prefer, you can reverse the "Is Guided by" arrows and mentally think of them as "Guides" as in CCTS guides Standard Transactions Vocabulary, much like a UML class diagram.)

Specifically, the diagram illustrates that the IAE Standard Transactions Vocabulary is guided by the methodology discussed in the UN/CEFACT Core Components Technical Specification (CCTS), which is itself based on pre-XML data element naming principles in ISO 11179, Part 5. The Core Components Library being collected and catalogued by TBG-17 (the harmonization effort of the Trade and Business Process Group of UN/CEFACT) contributes some elements to the IAE vocabulary (e.g., Address), although IAE is also contributing core components that we define in the acquisition space; TBG-17 is responsible for the harmonization of contributions at an international level. ISO 11179 and CCTS are further discussed in Section 8.1.6.
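The reading convention for the diagram (arrow from subject to object, label as verb) amounts to a set of subject-verb-object statements. The short Python sketch below records the relationships named in the preceding paragraphs in that form. Only "is guided by" is a label quoted from the figure; the other verb phrases paraphrase the prose and are assumptions about how the arrows are labeled.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    subject: str
    verb: str
    obj: str


# Relationships described in the text; verbs other than "is guided by"
# are paraphrases, not labels copied from the figure.
relations = [
    Relation("IAE Standard Transactions Vocabulary", "is guided by", "Core Components (CCTS)"),
    Relation("Core Components (CCTS)", "is based on", "ISO 11179, Part 5"),
    Relation("Core Components Library (TBG-17)", "contributes elements to",
             "IAE Standard Transactions Vocabulary"),
    Relation("IAE Standard Transactions Vocabulary", "contributes core components to",
             "Core Components Library (TBG-17)"),
]

# Reversing an arrow means swapping subject and object and inverting the verb.
INVERSE = {"is guided by": "guides"}


def reverse(rel: Relation) -> Relation:
    """Return the same relationship read in the opposite direction."""
    return Relation(rel.obj, INVERSE[rel.verb], rel.subject)


def read(rel: Relation) -> str:
    """Render a relation as a sentence."""
    return f"{rel.subject} {rel.verb} {rel.obj}"


forward = read(relations[0])
backward = read(reverse(relations[0]))
```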

IAE-Guidance.png

Development of IAE Standard Transactions Vocabulary and XML Schemas has been influenced by guidance from ISO 11179, OASIS UBL, ebXML Core Components Technical Specification, UN/CEFACT TBG-17, UN/CEFACT XML Naming and Design Rules, W3C XML Schema 1.0, and also Federal XML Guidelines, such as those from the CIO Council and the Dept. of Navy.

Figure 2: Guidance Influencing IAE XML Schema Development

With regard to XML Schema development, IAE is partially influenced by various federal guidelines which, prior to 2004, were rather limited in scope. That is, such guidelines addressed only the kinds of issues listed in Section 8.1.5 and left many design and implementation choices up to developers. In contrast, the very recently published UN/CEFACT XML Naming and Design Rules [UN/CEFACT XML NDR] are far more prescriptive, for better or worse. By the time this paper is published, there will probably be two other Naming and Design Rules documents, one from the Dept. of Navy and the other from the OASIS UBL Technical Committee. Exactly how they will differ and how such differences can be reconciled remains to be seen.

In this writer's opinion, while the NDR document offers much value, it also issues a number of dictums without fully explaining why certain choices are made. This is likely to rankle some XML developers. An earlier EPA document did an excellent job of listing several alternatives and then justifying why a particular alternative was chosen; see [EPA Design Rules]. This writer recommends that the authors of the NDR promptly publish their justifications for their particular choices, such as prohibiting empty elements, limiting the characters that may appear in an element name, and preferring URNs to URLs for namespaces. Fortunately, there is a 16-page UBL 1.0 Naming and Design Rules Checklist that summarizes the rules by category.

One could imagine another list with numbering matching that of the checklist that provides the justification or rationale for each of the rules. ("Appendix C. Naming & Design Rules List" of the UN/CEFACT XML NDR document is analogous to the UBL checklist.)

Regardless, IAE Standard Transactions XML Schema development is likely to be influenced by the NDR rules because they are on the fast track for international acceptance. IAE XML Schema will use the elements defined by the Standard Transactions Vocabulary (developed with ISO 11179 and CCTS principles in mind), tempered perhaps by some subset of the XML Naming and Design Rules. We are exploring the idea of creating both Exchange and Validation schemas, the former to permit loose information exchanges for which validation is not essential (at least not at the XSD level) and the latter for cases where rigorous validation is required (e.g., when transmitting data that will update a database). We are also considering another alternative, namely to require the sender to use the Validation schema and the receiver to use the Exchange schema, in keeping with the Robustness Principle of RFC 761. [10]
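The sender-strict, receiver-lenient alternative can be sketched in a few lines of Python. This is an illustration of the Robustness Principle only, under stated assumptions: hand-written checks stand in for the Validation and Exchange schemas (the Python standard library does not validate against XSD), and the Award message with its AwardID, VendorName, and Amount elements is hypothetical, not part of the IAE vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; not taken from the IAE vocabulary.
REQUIRED = ("AwardID", "VendorName", "Amount")


def validate_strict(xml_text: str) -> list:
    """Sender side (stand-in for a Validation schema): list every problem found."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "Award":
        problems.append(f"unexpected root element {root.tag!r}")
    tags = [child.tag for child in root]
    for name in REQUIRED:
        if tags.count(name) != 1:
            problems.append(f"{name} must appear exactly once")
    for tag in sorted(set(tags) - set(REQUIRED)):
        problems.append(f"unknown element {tag!r}")
    amount = root.findtext("Amount")
    if amount is not None:
        try:
            float(amount)
        except ValueError:
            problems.append("Amount is not numeric")
    return problems


def read_lenient(xml_text: str) -> dict:
    """Receiver side (stand-in for an Exchange schema): take what is understood,
    ignore unknown elements, and report missing ones as None."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(name) for name in REQUIRED}


good = ("<Award><AwardID>A-1</AwardID><VendorName>Example Corp</VendorName>"
        "<Amount>100.00</Amount></Award>")
loose = "<Award><AwardID>A-1</AwardID><Note>extra</Note></Award>"

sender_problems = validate_strict(loose)   # the sender would refuse to transmit this
received = read_lenient(loose)             # the receiver still extracts what it can
```

The design choice is that a conforming sender never emits the loose message, while a receiver that encounters one anyway degrades gracefully instead of rejecting the whole exchange.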

Another interesting source of information about XML Schema validation appeared on the xml-dev discussion list in September 2004; see Roger Costello's summary of Fallacies of Validation. Although this is not federal guidance per se, it does represent the collective wisdom of several expert XML developers. See also Costello's XML Schemas: Best Practices, also based on xml-dev input.

It should be noted that the NDR guidance is almost exclusively about XML Schema and XML Namespaces. There is also a significant need for guidance in the areas of XSLT and Web services. For details concerning the various Naming and Design Rules documents, see [UN/CEFACT XML NDR], [UBL NDR], [UBL NDR Checklist], the Danish OIOXML Naming and Design Rules 3.0 and the discussion in Section 4.3.

8.1.5 IAE Summary XML Guidance

In August 2003, the IAE PMO published a relatively short document entitled the IAE Summary XML Guidance; the document was subsequently updated in February 2004. With well over 500 pages of federal XML guidance available in 2003, the intention was to distill the guidance down to 15 or so key themes that appeared in the various federal guidelines to date. See [IAE XML Guidance] to obtain a copy.

The purpose of the document was to provide summary guidance concerning the use of XML technology for the IAE eGov initiative. The goal was to highlight the key points from the major references and Web sites that XML developers should consult for federal guidance and governance. Since this document is a high-level summary, the PMO suggested that, whenever necessary, developers should refer to the sources listed at the end of the document for more authoritative and definitive information and governance.

The outline of the summary document appears below to give readers an idea of the kinds of issues it addressed.

However, in light of the recent multiple XML Naming and Design Rules documents, it is clear that the IAE Summary XML Guidance will require updating. It is fortunate that these documents have associated checklists that serve as their own handy summary.

8.1.6 ISO 11179 and CCTS Terminology

Although this paper cannot fully explain the ISO 11179 data element naming and CCTS principles, a few brief examples and some terminology based on ISO 11179 and CCTS should serve as an overview. In addition to leveraging the ISO 11179 terminology of Object Class, Property Term, Representation Term, and Qualifier Term, the ebXML Core Components Technical Specification introduces new terms such as Core Component (CC), Basic Core Component (BCC), Aggregate Core Component (ACC), Association Core Component (ASCC), Business Information Entity (BIE), Basic Business Information Entity (BBIE), and Aggregate Business Information Entity (ABIE), some of which are defined below.
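As a rough illustration of how the ISO 11179 terms combine, the Python sketch below assembles a data element name from an Object Class, a Property Term, a Representation Term, and optional qualifiers. The ordering follows the terminology above, but the class itself, the assignment of "Last Name" to the Property Term, and the space-free XML name are assumptions made for the example rather than rules quoted from ISO 11179, CCTS, or any of the Naming and Design Rules documents.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataElementName:
    """Hypothetical holder for the parts of an ISO 11179-style name."""
    object_class: str
    property_term: str
    representation_term: str
    object_class_qualifier: str = ""
    property_qualifier: str = ""

    def parts(self) -> List[str]:
        """Non-empty parts, each qualifier placed before the term it qualifies."""
        ordered = [
            self.object_class_qualifier, self.object_class,
            self.property_qualifier, self.property_term,
            self.representation_term,
        ]
        return [p for p in ordered if p]

    def spoken_name(self) -> str:
        """Human-readable name with the parts separated by spaces."""
        return " ".join(self.parts())

    def xml_name(self) -> str:
        """Space-free name usable as an XML element name (assumed convention)."""
        return "".join(word for part in self.parts() for word in part.split())


employee = DataElementName("Employee", "Last Name", "Text")
tree = DataElementName("Tree", "Height", "Measure")
```

For instance, `employee.spoken_name()` yields "Employee Last Name Text" and `employee.xml_name()` yields "EmployeeLastNameText".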

First, we consider ISO 11179 terminology used in CCTS:

Core Components Examples
Object Class Qualifier | Object Class | Property Term Qualifier | Property Term | Representation Type Qualifier | Representation Type
 | Employee | | Last Name | | Text
 | Payment Card | | Expiration Date | | Date
 | Tree | | Height | | Measure
CostBudget PeriodT