Abstract
This session will discuss how the "big three" relational database vendors have implemented XML access to relational data. Each vendor's approach will be compared, and tips will be provided for securing and integrating XML data in RDBMS systems.
Keywords
Table of Contents
According to an August 2003 IDC report, the XML server market grew 163.8% between 2000 and 2003. IDC also estimates that the market for XML servers, which includes XML-enabled databases, native XML database servers, XML application servers and XML content servers, will exceed $3.5 billion in annual sales by 2006. The report attributes current and future growth trends to the steady demand for integration technologies and increases in XML functionality. The current crop of native XML database servers and application servers is currently driving the market. Good examples of native XML databases and application servers are Software AG’s Tamino and The Apache Group’s Xindice. Tamino is an XML server which stored data directly as XML, plus provides integration services for XML applications. Xindice is a reference implementation of several W3C standards which stores and serves XML in native format. There are many other products that serve XML database functionality, in fact too many to list here. A fairly complete listing of XML database and server products can be found at http://www.xmlsoftware.com/database.html. Another excellent resource is Ronald Bourret’s XML Database Products page, which lists products by type and category, with a brief description of each. It can be found at http://www.rpbourret.com/xml/XMLDatabaseProds.htm. While the market and demand for XML database products is currently strong, IDC predicts that XML-enabled versions of existing RDBMS servers will become the dominant market driver for XML servers by 2006. Current RDBMS servers are mature, reliable, and robust, and provide an excellent infrastructure on top of which to serve XML. As I outline in my book, the XML Programming Bible (ISBN 0764538292, John Wiley and Sons, 2003), the current crop of RDBMS XML add-ons make relational databases into secure and efficient XML database servers. In this presentation I outline the current XML offerings of the three top-selling RDBMS servers: Microsoft SQL Server, Oracle, and DB2.
I’ve knowingly provided much more detail in this presentation paper than I will deliver in the presentation itself, which is limited to 45 minutes. All of the RDBMS servers outlined in this presentation paper are available on a trial and/or developer license basis, so the examples can be reviewed and tested at your leisure. Also, each of the topics in this presentation paper is covered in much more detail in chapters 18,19, and 20 of my recently published book, the XML Programming Bible (ISBN 0764538292, John Wiley and Sons, 2003). I also provide handy tips on setting up and configuring XML support for each RDBMS environment in the book.
SQL Server 2000 provides XML support via Downloadable SQLXML (XML for SQL Server) add-ons. Support for XML queries and results are provided in the Enterprise Manager and to the Query Analyzer. XML access via HTTP is supported, via the same FOR XML clause, and a few other ways. OPENXML is a feature that permits developers to update a SQL Server with XML documents. This permits SQL Server and related applications to access XML documents like any other type of SQL Server data. XML documents can also be used to process queries. Schema-based XPath expressions can also be used to query data. XML Updategrams permit updating and insertion of XML document data into SQL Server tables. XML bulk-load facilities batch-type insertion of a large volume of XML data. A SQLXMLOLEDB data-access component supports client-side and server-side XML formatting against query results using XPath or SQL, to provide client-side OPENXML functionality. For .NET developers, SQLXML managed classes use the Microsoft .NET Framework class libraries to access .NET Framework classes. These classes can be used to insert XML data and retrieve XML results, including XPath queries and XML templates. We’ll cover XML Templates later in this presentation paper. DiffGrams can also be used from the SQLXML managed classes. These use the DataSet class of the .NET Framework object model. The latest version of SQLXML also supports Web Services via SOAP HTTP requests that can execute stored procedures, user-defined functions (UDFs), and XML templates.
Let’s get into the details of the FOR XML clause, and what data looks like when FOR XML modes are used. FOR XML includes three modes; RAW, AUTO, or EXPLICIit.
This SELECT statement retrieves information from Customers and Orders table in the Northwind database. This query specifies the RAW mode in the FOR XML clause:
SELECT TOP 1 * FROM AmazonListings FOR XML RAW
This query returns the first row in the AmazonListings table. Here’s what the results look like:
<row ProductID="1001" Ranking="1" Title="Hamlet/MacBeth" ASIN="8432040231" AuthorID="1001" Image="http://images.amazon.com/images/P/8432040231.01.MZZZZZZZ.jpg" Small_Image="http://images.amazon.com/images/P/8432040231.01.TZZZZZZZ.jpg" List_price="7.95" Release_date="1991-06-01T00:00:00" Binding="Paperback" Tagged_URL="http://www.amazon.com:80/exec/obidos/redirect? tag=associateid&benztechnonogies=9441&camp=1793 &link_code=xml&path=ASIN/8432040231"/>
As you can see from this example, RAW is a pretty good description of the format of data that comes back. It’s certainly not very readable by human eyes. The row that is retuned is defined by a single element called row, and all of the columns in that row are defined by an attribute with the format columnName=”value”. This result also points out an important problem with RAW mode - the returned value does not convert illegal XML characters to XML-formatted legal characters. For example, the Tagged_URL attribute at the end of the result element contains several ampersands (&), that are used to parse parameters when the URL is sent to the Amazon Web site. RAW mode does not encode them to the ampersand entity reference (&). If you need that functionality, you’ll have to use AUTO or EXPLICIT mode.
Using the XMLDATA option in RAW mode returns the row of data as an XML-Data schema. This includes all of the information in the previous example, plus a couple of XML-data schema names and the data type of each column associated with each attribute. This can be very handy for data sharing or producing schemas.
Let’s look at the same URL, but this time with the AUTO mode:
SELECT TOP 1 * FROM AmazonListings FOR XML AUTO
The only difference between AUTO and RAW in this example is that the row element name has been replaced by the name of the table, AmazonListings, and the ampersands (&) in the tagged_URL attribute have been converted to entity references (&).
<AmazonListings ProductID="1001" Ranking="1" Title="Hamlet/MacBeth" ASIN="8432040231" AuthorID="1001" Image="http://images.amazon.com/images/P/8432040231.01.MZZZZZZZ.jpg" Small_Image="http://images.amazon.com/images/P/8432040231.01.TZZZZZZZ.jpg" List_price="7.95" Release_date="1991-06-01T00:00:00" Binding="Paperback" Tagged_URL="http://www.amazon.com:80/exec/obidos/redirect? tag=associateid&benztechnonogies=9441&camp=1793 &link_code=xml&path=ASIN/8432040231">
AUTO mode is very handy for queries that use related tables, GROUP BY, or aliased tables in the SQL statement. If there were more tables in the query and they were related to AmazonListings, each row from the related table would be a child element of the AmazonListings element. Table alias names are returned in the results used and GROUP BY also has its own elements that are retuned with the result set.
One of the most interesting features of AUTO mode is the ELEMENTS option, which returns results from table columns as elements instead of attributes. A query like this:
SELECT TOP 1 * FROM AmazonListings FOR XML AUTO,ELEMENTS
Returns this format:
<AmazonListings> <ProductID>1001</ProductID> <Ranking>1</Ranking> <Title>Hamlet/MacBeth</Title> <ASIN>8432040231</ASIN> <AuthorID>1001</AuthorID> <Image> http://images.amazon.com/images/P/8432040231.01.MZZZZZZZ.jpg </Image> <Small_Image> http://images.amazon.com/images/P/8432040231.01.TZZZZZZZ.jpg </Small_Image> <List_price>7.95</List_price> <Release_date>1991-06-01T00:00:00</Release_date> <Binding>Paperback</Binding> <Tagged_URL> http://www.amazon.com:80/exec/obidos/redirect ?tag=associateid& benztechnonogies=9441& camp=1793&link_code=xml&path=ASIN/8432040231 </Tagged_URL> </AmazonListings>
The ELEMENTS option makes the data much easier to read, and may be more compatible with formats that use an element for each data column rather than an attribute.
The EXPLICIT mode offers the most control of XML document formatting returned from an SQL Server URL query. The cost of this flexibility is development and debugging time. EXPLICIT mode is so very explicit that it may even be considered its own XML document formatting language. Because of the uniqueness of the syntax, FOR XML EXPLICIT queries are one of the most difficult parts of SQLXML to master. It’s worth the work, however, because of the control that you have over XML output. FOR XML EXPLICIT queries always start with the following two column assignments:
SELECT 1 as Tag, 0 as Parent
Tag designates the column number of a tag for nesting purposes. Parent designates the nesting level of the column. A Parent value of 0 indicates that there are no parents for this select statement. Both are required values in a FOR XML EXPLICIT query, and both are integers.
The next line formats the first element in a FOR XML EXPLICIT query, and refers to a table column. This example line formats an explicitly named amazon column as an element, with the column value as the text value of the element:
ProductID AS [amazon!1!ProductID!element]
The ProductID AS expression is regular T-SQL, but the rest is part of the FOR XML EXPLICIT syntax. The Exclamations marks separate the FOR XML EXPLICIT arguments in the syntax. amazon is the name of the parent element. 1 denotes the nesting level (1 level below the amazon element). ProductID is the name of the new element. Element tells SQL Server to render the output as an element. Here’s how the output looks:
<amazon>
<ProductID>1001</ProductID>…….
A similar syntax is used to produce attribute output instead of element output:
ProductID AS [amazon!1!ProductID]
The only thing different between this attribute syntax and the previous element syntax is the removal of the !element directive from the end of the argument. This produces the following output format:
<amazon ProductID="1001">
This time, the ProductID becomes an attribute of the amazon element, instead of a nested element. There are several other arguments and directives that can be used with FOR XML EXPLICIT, which I cover in my recent book, the XML Programming Bible (ISBN 0764538292, John Wiley and Sons, 2003).
An SQL Server template is a valid XML document that is stored in an SQL server virtual directory. Templates can contain one or more formatted SQL statements that are returned as a single result XML document. Templates can also be used to return more than one query result as a single XML document and store default parameters. Templates also have security advantages, because the calling URL containing the SQL server query is not directly readable by an HTML page. Here’s an example of the same query, but this time the query contains a reference to a template called MultiQueryExample1.xml instead of a query:
http://iis.benztech.com/XMLProgrammingBible/template/MultiQueryExample1.xml
template is the name of a virtual directory. When you set up an IIS virtual directory, you can specify directories under the virtual directory that will contain templates and schemas. In this case, URL queries are sent to the XMLProgrammingBible directory via the virtual directory settings. URLs that use a template reference are sent to the /XMLProgrammingBible/template directory. URLs that use an XDR or W3C schema reference to process queries with XPath expressions are sent to the /XMLProgrammingBible/schema directory (I’ll cover schema references and XPath queries in more detail later). The template and schema directories were set up when I created the virtual directory for XMLProgrammingBible. They can be named anything, and you can have more than one of each. The virtual directory settings specify if a directory is a schema directory or a template directory. A directory can be a schema or template directory, but not both.
Here’s an example of a very simple template that illustrates the combination of two queries into one XML result document. In this case, the same query runs twice via two sql:query tags, returning the first row (TOP 1)of the XMLProgrammingBible.dbo.AmazonListings table twice in the same result XML document.
<QueryRoot xmlns:sql="urn:schemas-microsoft-com:xml-sql"> <sql:query> SELECT TOP 1 * FROM XMLProgrammingBible.dbo.AmazonListings FOR XML AUTO </sql:query> <sql:query> SELECT TOP 1 * FROM XMLProgrammingBible.dbo.AmazonListings FOR XML AUTO </sql:query> </QueryRoot>
When you save the template with a file name of MultiQueryExample1.xml in the template subdirectory of the virtual directory, you can call the template via this URL:
http://iis.benztech.com/XMLProgrammingBible/template/MultiQueryExample1.xml
An XML document is returned, which contains a result set of two SQL Server rows.
SQL Server also has a couple of ways to automatically transform XML results using an XSL stylesheet. A stylesheet reference can be contained in a URL that references a template with a parameter like this:
http://iis.benztech.com/XMLProgrammingBible/template/MultiQueryExample1.xml? xsl=ResultTransform.xsl
In this example, the ResultTransform.xsl stylesheet is stored in the stylesheets subdirectory of the virtual directory. This is not an official directory for stylesheets, just a directory that I chose to create and store stylesheets in. It could be contained anywhere under the virtual directory and be named anything. The relative path reference branches from the virtual directory root. You can also reference default stylesheets inside a template file using the sql:xsl attribute of a template’s root tag. This has the same effect as the URL parameter.
<root xmlns:sql='urn:schemas-microsoft-com:xml-sql' sql:xsl=’/stylesheets/ResultTransform.xsl’>
XPath queries can be used to access and update SQL server data. Combined with Updategrams and OPENXML, SQL server 2000 becomes a very flexible and robust XML document repository. XPath queries do require some setup, however. You have to create a schema subdirectory under your virtual root using the IIS Virtual Directory Management for SQL Server utility. You also have to create what Microsoft calls an “annotated schema” to map XML data elements and attributes to relational data tables and columns. Annotated schemas are simply regular W3C schemas that use the W3C annotation element (<xsd:annotation>) to contain information about relationships between tables. To facilitate the elements and attributes inside the annotation, an additional namespace must also be added to the schema, xmlns:sql="urn:schemas-microsoft-com:mapping-schema. Under normal circumstances, W3C annotations are used to contain documentation about a certain element or attribute in a W3C schema. For SQL Server annotated schemas, the annotation is located immediately after the root element and namespace declarations. You can also map schema attributes and elements to relational data tables and columns using the sql:relation and sql:field attributes. This is only required if the schema element and attribute names do not match the table and column names in a SQL Server table. An SQL Server table name automatically maps to a complex element type with the same name in the schema. An SQL Server column in a table automatically maps to a simple element or attribute with the same name in the schema. Here’s an example of a very simple annotated schema. This is actually just a regular W3C schema for a table with the urn:schemas-microsoft-com:mapping-schema namespace added. That was the only change I had to make, because the first complex element, <xs:element name="AmazonListings">, contains the same name as the table, AmazonListings, and the attributes in the schema all correspond to column names in the AmazonListings table. The schema represents a single table, so I didn’t need to add an annotation for database relationships either.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:sql="urn:schemas-microsoft-com:mapping-schema"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="AmazonListings">
<xs:complexType>
<xs:attribute name="ProductID" type="xs:short" use="required"/>
<xs:attribute name="Ranking" type="xs:boolean" use="required"/>
<xs:attribute name="Title" type="xs:string" use="required"/>
<xs:attribute name="ASIN" type="xs:long" use="required"/>
<xs:attribute name="AuthorID" type="xs:short" use="required"/>
<xs:attribute name="Image" type="xs:string" use="required"/>
<xs:attribute name="Small_Image" type="xs:string" use="required"/>
<xs:attribute name="List_price" type="xs:decimal" use="required"/>
<xs:attribute name="Release_date" type="xs:dateTime" use="required"/>
<xs:attribute name="Binding" type="xs:string" use="required"/>
<xs:attribute name="Tagged_URL" type="xs:anyURI" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
You can use the sql:relation attribute to explicitly specify the SQL server table name that the AmazonListings element maps to:
xsd:element name="Alist" sql:relation="AmazonListings"
You can also use the sql:field attribute to specify the SQL server column name that the AmazonListings attribute maps to:
<xs:attribute name="PID" sql:field="ProductID" type="xs:short" use="required"/>
There are many other SQL Server annotated schema elements and attributes. We’ll cover some of them later in this presentation paper. The others are well documented in the documentation that comes with the SQLXML download. Now that the schema is defined, it can be saved in the virtual directory that you set up for schemas. I named mine AmazonListings.xsd to match the table that the schema refers to, and saved it in the XMLProgrammingBible\schemas directory. Once the schema is saved, XPath expressions that refer to the schema can be passed from a URL like this:
http://iis.benztech.com/XMLProgrammingBible/schema/AmazonListing.xsd/
AmazonListings[@ProductID=1001]The XPath expression in the above URL is AmazonListings[@ProductID=1001]. The XPath refers to the ProductID attribute of the AmazonListings element with a value of 1001 which maps to rows in the ProductID column in the AmazonListings table that contains a value of 1001. XPath expressions and references to schemas can also be passed from template files. Here’s a template file that contains an external schema reference and the same XPath expression that I used in the previous URL example. The template file is saved in the virtual template directory with a file name of XPathExample1.xml, and the schema file is located in the /schema subdirectory of the root, hence the relative path for the schema file:
<XPathRoot xmlns:sql="urn:schemas-microsoft-com:xml-sql"> <sql:xpath-query mapping-schema=”.\schema\AmazonListings.xsd”> AmazonListings[@ProductID=1001] </sql:xpath-query> </XPathRoot>
The template-based XPath results are accessible via this URL: http://iis.benztech.com/XMLProgrammingBible/template/XPathExample1.xml You can also include a schema in a template file, and refer to the schema with an ID reference instead of a relative path to another file. However, I find this approach leads to very large and complicated template files that quickly become unwieldy. I recommend separating the schema files from the template files whenever possible. Please refer to the SQLXML documentation for more details on including inline schemas in template files if you want to follow this approach.
SQL Server 2000 data can be updated by XML documents using OPENXML, XML Updategrams, and XML Bulk Load. XML Bulk Load is designed to batch-load XML document data into SQL server tables. Behind the scenes, it used the BULK INSERT command to update tables from parsed XML document data. Bulk Load uses MSXML to process streams of XML, which enables it to handle large amounts of XML data relatively quickly and efficiently. While Bulk Load is for large-scale data insertions, XML Updategrams are more suited for small-scale XML data inserts, deletions, or updates. They can be used at runtime, and handle loading of smaller XML documents into tables via very simple SQL Server templates. OPENXML is a flexible option for manipulating XML document data using T-SQL commands.
Like the FOR XML EXPLICIT mode shown earlier in this presentation paper, OPENXML is a language unto itself. OPENXML is a keyword that can be added to T-SQL commands to map and manipulate XML data represented by SQL Server tables and columns. OPENXML is mostly used for updating, deleting or inserting SQL server data from an XML document data source. It can also be used to select and view XML documents in SQL Server tables. There are many options for mapping XML documents to SQL Server tables. You can also integrate ADO with OPENXML to provide a very flexible way for SQL server to communicate with applications using XML instead of regular rowsets. The techniques and syntax for this functionality is covered in great detail in the documentation that comes with SQL Server.
The following code inserts data into three tables from a single XML document:
DECLARE @iDoc int, @cDoc varchar (5000)
SET @cDoc =
'<XMLProgrammingBible>
<Authors AuthorID="1001" AuthorName="Shakespeare, William">
<Sources SourceID="1001" Source_Name="Macbeth">
<Quotations QuotationID="1001" Quotation="When the hurlyburlys done,
When the battles lost and won."/>
</Sources>
</Authors>
</XMLProgrammingBible>'
EXEC sp_xml_preparedocument @iDoc OUTPUT, @cDoc
INSERT INTO [XMLProgrammingBible].[dbo].[Authors]([AuthorID],[AuthorName])
(SELECT [AuthorID], [AuthorName]
FROM OPENXML (@iDoc, '/XMLProgrammingBible/Authors') WITH Authors)
INSERT INTO [XMLProgrammingBible].[dbo].[Sources]([SourceID],[Source Name])
(SELECT [SourceID], [Source Name]
FROM OPENXML (@iDoc, '/XMLProgrammingBible/Authors/Sources') WITH Sources)
INSERT INTO [XMLProgrammingBible].[dbo].[Quotations]([QuotationID],
[SourceID], [AuthorID], [Quotation])
(SELECT QuotationID, SourceID, AuthorID, Quotation
FROM OPENXML (@iDoc, '/XMLProgrammingBible')
WITH (QuotationID int './Authors/Sources/Quotations/@QuotationID',
SourceID int './Authors/Sources/@SourceID',
AuthorID int './Authors/@AuthorID',
Quotation char(300) './Authors/Sources/Quotations/@Quotation'))
EXEC sp_xml_removedocument @idocThe first line declares two variables. The iDoc variable is used by the sp_xml_preparedocument stored procedure to parse a provided XML document. The cDoc variable contains the XML document that is parsed.
DECLARE @iDoc int, @cDoc varchar (5000)
The XML document that is parsed by the sp_xml_preparedocument stored procedure is usually passed via a parameter to the OPENXML command. As with the first example, I’ve explicitly added the document in the code to make it easier to follow the flow.
SET @cDoc =
'<XMLProgrammingBible>
<Authors AuthorID="1001" AuthorName="Shakespeare, William">
<Sources SourceID="1001" Source_Name="Macbeth">
<Quotations QuotationID="1001" Quotation="When the hurlyburlys done,
When the battles lost and won."/>
</Sources>
</Authors>
</XMLProgrammingBible>'sp_xml_preparedocument accepts the integer variable iDoc and the XML document in the cDoc variable and returns a handle that is used to access the parsed document.
EXEC sp_xml_preparedocument @iDoc OUTPUT, @cDoc
Next, we have the first INSERT command. This insert uses a very simple OPENXML SELECT statement to retrieve the AuthorID and AuthorName attributes from the Authors element in the source XML document. The XPath expression locates the element in the XML document. The WITH Authors command at the end of the OPENXML expression tells OPENXML to use the Authors table as a guide when formatting the node tree.
INSERT INTO [XMLProgrammingBible].[dbo].[Authors]([AuthorID],[AuthorName]) (SELECT [AuthorID], [AuthorName] FROM OPENXML (@iDoc, '/XMLProgrammingBible/Authors') WITH Authors)
The Sources INSERT command is almost identical to the Authors INSERT command. This time, the SourceID and Source_Name attributes from the Sources element in the source XML document are retrieved. The WITH Sources command at the end of the OPENXML expression tells OPENXML to use the Sources table as a guide when formatting the node tree.
INSERT INTO [XMLProgrammingBible].[dbo].[Sources]([SourceID],[Source Name]) (SELECT [SourceID], [Source Name] FROM OPENXML (@iDoc, '/XMLProgrammingBible/Authors/Sources') WITH Sources)
The Quotations INSERT command has to gather attributes from several elements in the XML document, so unfortunately it can’t use the WITH (table) mapping like Sources and Authors did. Instead, the WITH command contains explicit data typing and mapping. OPENXML explicit mappings gather the Quotation, SourceID, AuthorID, and Quotation attributes from several elements in the XML document. XPath expressions point to positions relative to the root /XMLProgrammingBible element in the XML document.
INSERT INTO [XMLProgrammingBible].[dbo].[Quotations] ([QuotationID], [SourceID], [AuthorID], [Quotation]) (SELECT QuotationID, SourceID, AuthorID, Quotation FROM OPENXML (@iDoc, '/XMLProgrammingBible') WITH (QuotationID int './Authors/Sources/Quotations/@QuotationID', SourceID int './Authors/Sources/@SourceID', AuthorID int './Authors/@AuthorID', Quotation char(300) './Authors/Sources/Quotations/@Quotation'))
The last line in the code resets the connection and removes the handle created by sp_xml_preparedocument using the sp_xml_removedocument stored procedure with a passed parameter of the XML document handle (iDoc).
EXEC sp_xml_removedocument @idoc
Updating XML data with OPENXML is very similar to OPENXML data insertion. The code below updates a row of data in the Sources table with a value provided in an XML document:
DECLARE @iDoc int, @cDoc varchar (5000)
SET @cDoc =
'<XMLProgrammingBible>
<Sources SourceID="1001" Source_Name="McBeth">
</Sources>
</XMLProgrammingBible>'
EXEC sp_xml_preparedocument @iDoc OUTPUT, @cDoc
UPDATE Sources
SET [Source Name] = XS.Source_Name
FROM [XMLProgrammingBible].[dbo].[Sources] S,
(SELECT [SourceID], [Source_Name]
FROM OPENXML (@iDoc, '/XMLProgrammingBible/Authors/Sources')
WITH Sources) XS
WHERE S.[Source Name] = 'Macbeth'
EXEC sp_xml_removedocument @idocThis code starts with the same variable declarations as the previous two examples. The XML document that is passed for the update contains a single row of data to update the Sources table. The SourceID stays the sane, but the Source_Name changes.
DECLARE @iDoc int, @cDoc varchar (5000) SET @cDoc = '<XMLProgrammingBible> <Sources SourceID="1001" Source_Name="McBeth"> </Sources> </XMLProgrammingBible>' EXEC sp_xml_preparedocument @iDoc OUTPUT, @cDoc
The UPDATE command contains a nested SELECT statement that pulls the new value out of the attributes in the XML source document and parses them into nodes. The UPDATE command takes the value of the Source_Name attribute and updates the [Source Name] value in the SQL Server table row. The last line in the code resets the connection and removes the handle created by sp_xml_preparedocument using the sp_xml_removedocument stored procedure with a passed parameter of the XML document handle (iDoc).
UPDATE Sources
SET [Source Name] = XS.Source_Name
FROM [XMLProgrammingBible].[dbo].[Sources] S, (SELECT [SourceID],
[Source_Name] FROM OPENXML (@iDoc, '/XMLProgrammingBible/Authors/Sources')
WITH Sources) XS
WHERE S.[Source Name] = 'Macbeth'
EXEC sp_xml_removedocument @idocUpdategrams are very handy for inserting, updating, and deleting table data over the Web. Updategrams are stored in XML templates and XML document data. Updategrams can handle this functionality without using schemas. For example, here’s an example of an Updategram that adds an author authors table:
<ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
<updg:sync >
<updg:before>
</updg:before>
<updg:after>
<Authors AuthorID="9999" AuthorName="Miller, Henry">
</updg:after>
</updg:sync>
</ROOT>The value in the after element is added to the Authors table. The XML in the next example is saved as an XML template in the XMLProgrammingBible virtual directory. That way the database doesn’t have to be designated, as it’s already part of the virtual directory properties. All that has to be specified is the table name and the columns to add. Here’s one that changes the author name:
<ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
<updg:sync >
<updg:before>
<Authors AuthorID="9999" AuthorName="Miller, Henry">
</updg:before>
<updg:after>
<Authors AuthorID="9999" AuthorName="Mailer, Norman">
</updg:after>
</updg:sync>
</ROOT>This is very similar to the previous example, except that the Author name is replaced in an existing document instead of an insertion of a new row in the table. The row is located using the value in the before element. Now let’s delete the row:
<ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
<updg:sync >
<updg:before>
<Authors AuthorID="9999" AuthorName="Mailer, Norman">
</updg:before>
<updg:after>
</updg:after>
</updg:sync>
</ROOT>
Nothing in the after element means that a deletion command is generated by SQL server. Because these are templates, the real value is in specifying parameters from URL queries that are passed to the before and after elements. The SQLXML documentation that cones with the download covers this in detail. Let’s move on to using Updategrams with schemas, which is another good way to get relational data into SQL Server from XML documents. This example uses the same schema as the XML Bulk Load example, named XMLProgrammingBible.xsd. The schema is located in the schemas subdirectory of the XMLProgrammingBible virtual directory on the IIIS server. The schema enables table relationships to be maintained without having to be specified in the template. This time, instead of a raw insert of bulk data, I’m adding a Quotation in the quotations table. Note that the Authors and Sources hierarchy must be defined in the before and the after. The AuthorID attribute from the Authors element and the SourceID from the Sources element are automatically added to the new quotation in the Quotations table. This is because a foreign key relationship is established n the schema that specifies the Quotations table as a foreign key table for each relationship.
<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"
xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
<updg:sync mapping-schema="=”.\schema\XMLProgrammingBible.xsd">
<updg:before>
<Authors AuthorID="1001">
<Sources SourceID="1001">
</Sources>
</Authors>
</updg:before>
<updg:after>
<Authors AuthorID="1001">
<Sources SourceID="1001">
<Quotations QuotationID="1001" Quotation="Is this a dagger which
I see before me, the handle toward my hand? Come, let me clutch
thee: I have thee not, and yet I see thee still. Art thou not,
fatal vision, sensible to feeling as to sight? or art thou but a
dagger of the mind, a false creation, proceeding from the heat-
oppressed brain?"/>
</Sources>
</Authors>
</updg:after>
</updg:sync>
</ROOT>
Oracle support for XML started with Oracle 8i database. XML documents could be included in an Oracle database file system (iFS) and manipulated like a folder based file system. XML documents could be broken down and reassembled from Oracle data based on Oracle's iFS Document Type Definition, which is a proprietary format of the W3C Document Type Definition (DTD). Parsing and reassembly of XML documents was facilitated through Oracle’s own XML document parser, which supported DOM and SAX. The third feature supported in Oracle 8i was XML-based searching in the ConText full-text search engine. Content rating retrieved XML document content and ignores tags, but searches could be tag-based.
Oracle9i has extended these capabilities with more advanced XML database features, such as SQL/XML query support and compatibility with W3C schemas. Also included is a Java application server based on the Apache HTTP Server. Oracle9i standard edition has everything that a developer needs to create XML database solutions. The Enterprise Edition includes more advanced capabilities, such as online analytical processing (OLAP) server support and several features that enable sophisticated data mining, partitioning and clustering. Oracle8i XML features enabled developers to store XML in Oracle databases or parse it into tabular data. Oracle9i extends these capabilities to support full DOM 2 and DOM 3 features such as comments and namespaces. Additional support for W3C schemas helps enforce element and attribute ordering, and other granular XML document structures. Also, performance is enhanced by more advanced support for XML indexes.
Oracle has split XML DB functionality into two groups: Structured XML and unstructured XML. Unstructured data features cater to developers who need to develop XML document repositories for applications that work with unstructured data such as pages on a Web site. Structured data features meet the needs of developer who are working with traditional tabular relational data, but need to manipulate that data as XML. The Oracle9i XML DB contains a set of special SQL functions that allows XML data to be manipulated as relational data. A new data type called XMLType enables storage of XML data as a plain XML document or as a format based on a DOM. XMLType tables and views can be defined using annotated W3C Schemas. The schemas can control how an XML document maps to Oracle data. Windows Explorer can be used with the XML DB Repositories (formerly iFS) to view an XML database as a drive on the file system. A combination of XPath and SQL can also be used to manipulate XML documents. You can also retrieve regular relational data in XML formats and perform an XSLT transformation of the data to text, HTML or custom formats of XML.
Oracle XML DB is a grouping of XML and XPath functions combined with SQL extensions. XML DB features facilitate the manipulation of Oracle data as XML. They also enable XML documents to be stored and queried as Oracle data using the XMLTYPE data type. The Oracle XDK is not required for XML DB functions, if you just want to query XML documents from the Enterprise Manager Console or a third-party query tool. Using XML DB functionality in your applications requires the XDK.
The latest version of the Oracle XDK is available at http://otn.oracle.com/tech/xml/xdk. The XDK is available in Java, C, C++, and PL/SQL versions. At time of this writing, I’m working with the beta release of the version 10 XDK. The Oracle XDK is a set of XML APIS that can be used by developers to incorporate Oracle data into their applications. I’ll cover the XDK in more detail later in this presentation paper.
Oracle developers who are new to XML development are usually under the impression that the Oracle XDK is the only way to access and manipulate XML in Oracle. In fact, there are several options for retrieving XML from Oracle tables, and for storing and retrieving XML documents in Oracle as XML. In this section I’ll review the XML DB functions that read and write Oracle data as XML. I’ll also show you how to manipulate and store XML documents using the XMLType data type. Oracle9i supports several core SQL functions, as well as core SQL/XML functions, several SQL/XML extensions, and a PL/SQL package called DBMS_XMLGEN. Core Oracle XML functions are unique to Oracle and are accessible via SQL*Plus queries. SQL/XML functions are based on the SQL/XML standard, which is a combination of XML and SQL functionality. The SQL/XML standard is maintained by the International Committee for Information Technology Standards (INCITS). INCITS maintains a grab-bag of international hardware, media and other standards, including the original SQL standard in the USA. Specific information on SQL/XML can be found at http://sqlx.org. More information about INCITS can be found at http://www.ncits.org.
For this section, I’ll extract data from an example Amazon book listing table using XML. The easiest way to start is with the XMLFOREST() function. In this query, I retrieve all of the columns in the first row of the AMAZONLISTINGS table:
SELECT XMLFOREST( PRODUCTID, RANKING, TITLE, ASIN, AUTHORID, IMAGE, SMALL_IMAGE, LIST_PRICE, RELEASE_DATE, BINDING, AVAILABILITY, TAGGED_URL) as "RESULT" FROM AmazonListings WHERE rownum = 1;
The following XML document fragment is returned by the XMLFOREST() function. All of the elements are nested at the same level. An XML document needs a single, unique root element to be well-formed XML. Therefore, XMLFOREST() is an easy way to return a fragment that will become part of an aggregated or concatenated XML document, but is not useful for creating XML documents all by itself.
<PRODUCTID>1001</PRODUCTID> <RANKING>1</RANKING> <TITLE>Hamlet/MacBeth</TITLE> <ASIN>8432040231</ASIN> <AUTHORID>1001</AUTHORID> <IMAGE>http://images.amazon.com/images/P/8432040231.01.MZZZZZZZ.jpg</IMAGE> <SMALL_IMAGE>http://images.amazon.com/images/P/8432040231.01.TZZZZZZZ.jpg </SMALL_IMAGE> <LIST_PRICE>7.95</LIST_PRICE> <RELEASE_DATE>01-JUN-91</RELEASE_DATE> <BINDING>Paperback</BINDING> <TAGGED_URL>http://www.amazon.com:80/exec/obidos/redirect? tag=associateid&benztechnonogies=9441&camp=1793 &link_code=xml&path=ASIN/8432040231 </TAGGED_URL>
The easiest way to return a completely well-formed XML document is to nest the XMLFOREST() function inside of the SYS_XMLAGG() Function like this:
SELECT SYS_XMLAGG(XMLFOREST( PRODUCTID, RANKING, TITLE, ASIN, AUTHORID, IMAGE, SMALL_IMAGE, LIST_PRICE, RELEASE_DATE, BINDING, AVAILABILITY, TAGGED_URL)) as "RESULT" FROM AmazonListings WHERE rownum = 1;
This query returns the following XML Document. The addition of the SYS_XMLAGG() function creates an element called ROWSET. The nesting in the SQL query nests the XML document fragment returned by XMLFOREST() under the ROWSET() element.
<ROWSET> <PRODUCTID>1001</PRODUCTID> <RANKING>1</RANKING> <TITLE>Hamlet/MacBeth</TITLE> <ASIN>8432040231</ASIN> <AUTHORID>1001</AUTHORID> <IMAGE>http://images.amazon.com/images/P/8432040231.01.MZZZZZZZ.jpg </IMAGE> <SMALL_IMAGE>http://images.amazon.com/images/P/8432040231.01.TZZZZZZZ.jpg </SMALL_IMAGE> <LIST_PRICE>7.95</LIST_PRICE> <RELEASE_DATE>01-JUN-91</RELEASE_DATE> <BINDING>Paperback</BINDING> <TAGGED_URL>http://www.amazon.com:80/exec/obidos/redirect? tag=associateid&benztechnonogies=9441&camp=1793 &link_code=xml&path=ASIN/8432040231 </TAGGED_URL> </ROWSET>
If you want to give the root element a specific name, instead of the default ROWSET element name, use the XMLELEMENT() function and provide a hard-coded name:
SELECT XMLELEMENT("RootElement", XMLFOREST(
PRODUCTID,
RANKING,
TITLE,
ASIN,
AUTHORID,
IMAGE,
SMALL_IMAGE,
LIST_PRICE,
RELEASE_DATE,
BINDING,
AVAILABILITY,
TAGGED_URL)) as "RESULT"
FROM AmazonListings
WHERE rownum = 1;
This returns the same set of results, but with the customized root element name of RootElement:
<RootElement> <PRODUCTID>1001</PRODUCTID> <RANKING>1</RANKING> <TITLE>Hamlet/MacBeth</TITLE> <ASIN>8432040231</ASIN> <AUTHORID>1001</AUTHORID> <IMAGE>http://images.amazon.com/images/P/8432040231.01.MZZZZZZZ.jpg </IMAGE> <SMALL_IMAGE>http://images.amazon.com/images/P/8432040231.01.TZZZZZZZ.jpg </SMALL_IMAGE> <LIST_PRICE>7.95</LIST_PRICE> <RELEASE_DATE>01-JUN-91</RELEASE_DATE> <BINDING>Paperback</BINDING> <TAGGED_URL>http://www.amazon.com:80/exec/obidos/redirect? tag=associateid&benztechnonogies=9441&camp=1793& link_code=xml&path=ASIN/8432040231</TAGGED_URL> </RootElement>
One of the additional advantages of using a nested XMLELEMENT() function to create a root element is that the XMLATTRIBUTES() function can be used. XMLATTRIBUTES() creates a set of attributes for an element specified as apparent via the XMLELEMENT() function. This time I reuse the query from the last example, but just replace the XMLFOREST() function with an XMLATTRIBUTES() function:
SELECT XMLELEMENT("RootElement", XMLATTRIBUTES(
PRODUCTID,
RANKING,
TITLE,
ASIN,
AUTHORID,
IMAGE,
SMALL_IMAGE,
LIST_PRICE,
RELEASE_DATE,
BINDING,
AVAILABILITY,
TAGGED_URL)) as "RESULT"
FROM AmazonListings
WHERE rownum = 1;The result is a single XML element with an attribute for each column in the first row of the AMAZONLISTINGS table:
<RootElement PRODUCTID="1001" RANKING="1" TITLE="Hamlet/MacBeth" ASIN="8432040231" AUTHORID="1001" IMAGE="http://images.amazon.com/images/P/8432040231.01.MZZZZZZZ.jpg" SMALL_IMAGE="http://images.amazon.com/images/P/8432040231.01.TZZZZZZZ.jpg" LIST_PRICE="7.95" RELEASE_DATE="01-JUN-91" BINDING="Paperback" TAGGED_URL="http://www.amazon.com:80/exec/obidos/redirect? tag=associateid&benztechnonogies=9441&camp=1793 &link_code=xml&path=ASIN/8432040231"/>
The XMLCOLLATVAL() function also produces attributes for column data, but with an important difference. XMLCOLATTVAL() produces an element named column for each column value and an element named name for each column name. The value of the column is the text value for the element. In this example I reuse the query from the last example, but just replace the XMLATTRIBUTES() function with an XMLCOLATTVAL() function:
SELECT XMLELEMENT("RootElement", XMLCOLATTVAL(
PRODUCTID,
RANKING,
TITLE,
ASIN,
AUTHORID,
IMAGE,
SMALL_IMAGE,
LIST_PRICE,
RELEASE_DATE,
BINDING,
AVAILABILITY,
TAGGED_URL)) as "RESULT"
FROM AmazonListings
WHERE rownum = 1;
The resulting combination of elements, attributes and text data could be created with nested XMLELEMENT() and XMLATTRIBUTES() functions, but using XMLCOLLATVAL() is much easier to code:
<RootElement>
<column name="PRODUCTID">1001</column>
<column name="RANKING">1</column>
<column name="TITLE">Hamlet/MacBeth</column>
<column name="ASIN">8432040231</column>
<column name="AUTHORID">1001</column>
<column name="IMAGE">
http://images.amazon.com/images/P/8432040231.01.
MZZZZZZZ.jpg</column>
<column name="SMALL_IMAGE">
http://images.amazon.com/images/P/8432040231.01.
TZZZZZZZ.jpg</column>
<column name="LIST_PRICE">7.95</column>
<column name="RELEASE_DATE">01-JUN-91</column>
<column name="BINDING">Paperback</column>
<column name="AVAILABILITY"/>
<column name="TAGGED_URL">
http://www.amazon.com:80/exec/obidos/redirect?tag=associateid
&benztechnonogies=9441&camp=1793&link_code=xml
&path=ASIN/8432040231</column>
</RootElement>
So far I’ve shown you how to use the XMLFOREST(), SYS_XMLAGG(), XMLELEMENT(), and XMLATTRIBUTES() functions to work with single rows of data. Multiple row result sets are a more of a challenge, because each row should be defined in the XML document. Also, in most cases the XML document has to be well-formed, while containing the multiple row definitions. However, if more than one row is contained in the previous SYS_XMLAGG() example, all elements are contained at the same nesting level. The result is a jumble of row data as elements, with no clear definition of the start and end of a row of results in the XML document. The previous XMLELEMENT() and XMLATTRIBUTES() examples return an XML document fragment with a RootElement for each row, but no XML document root element.
The XMLAGG() function aggregates multiple rows of data into a single XML document. For example, the query below uses nested XMLELEMENT() and XMLFOREST() functions to create a well-formed XML document fragments for each row of data:
SELECT XMLELEMENT("RowElement", XMLFOREST(
PRODUCTID,
RANKING,
TITLE)) as "RESULT" FROM AmazonListingsThe XML result is not a well-formed XML document, but a set of document fragments for each row of data:
<RowElement> <PRODUCTID>1001</PRODUCTID> <RANKING>1</RANKING> <TITLE>Hamlet/MacBeth</TITLE> </RowElement> <RowElement> <PRODUCTID>1002</PRODUCTID> <RANKING>2</RANKING> <TITLE>MacBeth</TITLE> </RowElement> <RowElement> <PRODUCTID>1003</PRODUCTID> <RANKING>3</RANKING> <TITLE>William Shakespeare: MacBeth</TITLE> </RowElement>
The XMLAGG() function aggregates multiple rows of results into a single XML document when you use a query like this:
SELECT XMLELEMENT("RootElement", XMLAGG(XMLELEMENT("RowElement",
XMLFOREST(PRODUCTID,
RANKING,
TITLE)))) as "RESULT" FROM AmazonListingsThe result is a well-formed XML document with a root element and a definition of each row of data. Data rows are children of the RowElement element:
<RootElement>
<RowElement>
<PRODUCTID>1001</PRODUCTID>
<RANKING>1</RANKING>
<TITLE>Hamlet/MacBeth</TITLE>
</RowElement>
<RowElement>
<PRODUCTID>1002</PRODUCTID>
<RANKING>2</RANKING>
<TITLE>MacBeth</TITLE>
</RowElement>
<RowElement>
<PRODUCTID>1003</PRODUCTID>
<RANKING>3</RANKING>
<TITLE>William Shakespeare: MacBeth</TITLE>
</RowElement>
</RootElement>A combination of XMLAGG(), XMLELEMENT(), XMLATTRIBUTES(), XMLFOREST(), and XMLCOMMATTVAL() are the most common ways to represent relational Oracle data as XML.
As of Oracle9i the XMLType data type can be used to store and XML data in Oracle databases. Before Oracle9i, LOBS and text were used to store XML documents as text. At the base level, CLOB and XMLType data is not very different. However, there are several useful methods in the XMLType API that can be used to manipulate XML documents stored as XMLType data types. When relational data tables are mapped to XML document data via W3C schemas, data in the tables becomes available as an XMLType data type. Columns in a regular table can also be XMLType data types, and XMLType views can be used to mask relational data as XMLType data. For more information about XMLType, please refer to the Oracle9i XML Database Developer's Guide - Oracle XML DB chapter 4 (Using XMLType).
XMLType columns can be added to any Oracle table. In this example, I create a table called XMLONLY, which consists of one column, called XMLDOC. I assign the data type for XMLDOC as XMLTYPE:
CREATE TABLE XMLONLY (XMLDOC SYS.XMLTYPE);
With the new XMLONLY table, I can insert XMLType data from any source. In the example below, I select the first three columns of the AMAZONLISTINGS table using nested XMLELEMENT() and XMLFOREST() functions. The query selection is inserted into a variable called XMLTypeVal, which is an in-memory XMLType object. Next, I insert the XMLType object into the XMLDOC column of the XMLONLY table.
DECLARE
XMLTYpeVal SYS.XMLTYPE;
BEGIN
SELECT XMLELEMENT("RootElement", XMLFOREST(
PRODUCTID,
RANKING,
TITLE)) as "result"
INTO XMLTYpeVal
FROM AmazonListings
WHERE rownum = 1;
INSERT INTO XMLONLY (XMLDOC) VALUES (XMLTYpeVal);
COMMIT;
END;
Now that I have some XMLType data in XMLDOC column of the XMLONLY table, I can extract the data with regular SQL functions. For example, this simple select will return the XML document stored in XMLDOC:
SELECT XMLDOC from XMLONLY
Here are the results of the query:
<RootElement>
<PRODUCTID>1001</PRODUCTID>
<RANKING>1</RANKING>
<TITLE>Hamlet/MacBeth</TITLE>
</RootElement>SYS_XMLGEN() is an Oracle SQL function that returns s a single column of a table as an XML element. The Element name is based on the column name. This can be used in limited situations on regular data, but is really useful for XMLType columns. Here’s an example of SYS_XMLGEN() returning an XML element from regular table data:
SELECT SYS_XMLGEN(PRODUCTID) from AMAZONLISTINGS where rownum=1
And here’s the result of the query:
<PRODUCTID>1001</PRODUCTID>
The same query is very useful when applied against XMLType columns. In this example, I query the XMLDOC column of the XMLONLY table and return a single XML document:
SELECT SYS_XMLGEN(XMLDOC) from XMLONLY where rownum=1
SYS_XMLGEN() wraps the column name around the results as an element name:
<XMLDOC>
<RootElement>
<PRODUCTID>1001</PRODUCTID>
<RANKING>1</RANKING>
<TITLE>Hamlet/MacBeth</TITLE>
</RootElement>
</XMLDOC>You may have noticed that there is no XML document declaration at the top of the XML document. This is on purpose. XML declarations are optional; only a root element is required to create a well-formed XML document structure. It’s much better to store the XML document without the optional declaration and create a declaration as part of a query. XML documents without the declaration can be added together to make a larger XML document. If you store the XML declaration as part of the data, it takes up more space, and makes it harder to aggregate XML results into a larger document. Here’s an example of a query that prepends an XML document declaration to the SYS_XMLGEN() results from the previous example:
SELECT '<?xml version="1.0"?>', SYS_XMLGEN(XMLDOC) from XMLONLY
And here are the results, with the XML document declaration added:
<?xml version="1.0"?>
<XMLDOC>
<RootElement>
<PRODUCTID>1001</PRODUCTID>
<RANKING>1</RANKING>
<TITLE>Hamlet/MacBeth</TITLE>
</RootElement>
</XMLDOC>
The Oracle Enterprise manager has facilities for registering generated schemas and mapping XML document data to relational data. XMLType Views can also be used to map XML document data to relational data, and act as an XML interface for legacy data structures without having to alter the tables themselves. In this section I’ll cover both methods. I’ll also cover some of the SQL commands that facilitate data updates.
Oracle can generate W3C Schemas from relational data using XDK functions, the PL/SQL DBMS_XMLSCHEMA.generateSchema() package, or a third party tool such as XMLSpy (a free evaluation is available at http://www.xmlspy.com). For the example below, I used XMLSpy, which can connect to Oracle data via an ADO string using an Oracle OLE DB provider. XMLSpy automatically creates a W3C schema for all tables in a database without needing an object type to be created or referenced. The generated schema can be chopped up and reused. Below is the generated W3C schema for the AMAZONLISTINGS table. The table is represented by an element containing a W3C schema complex data type. Nested inside the complex data type are attributes, some of which contain a W3C schema simple data type. XMLSPY automatically reproduces field constraints and data types based on Oracle constraints and data types in schemas that are generated from Oracle databases. For example, the Title column is a W3C schema string data type, and has a maximum length of 200.
<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSPY v5 rel. 4 U (http://www.xmlspy.com) by
Brian Benz (Wiley) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="AMAZONLISTINGS">
<xs:complexType>
<xs:sequence>
<xs:element name="PRODUCTID" type="xs:decimal"/>
<xs:element name="RANKING" type="xs:decimal"/>
<xs:element name="TITLE">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="200"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="ASIN">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="10"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="AUTHORID" type="xs:decimal"/>
<xs:element name="IMAGE">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="100"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="SMALL_IMAGE">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="100"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="LIST_PRICE" type="xs:decimal"/>
<xs:element name="RELEASE_DATE" type="xs:dateTime"/>
<xs:element name="BINDING">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="50"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="AVAILABILITY">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="10"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="TAGGED_URL">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="200"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Now that the schema is registered in the database and a table is created based on the schema, I can test the new table by sending it a sample XML document. XMLSpy has a menu option for generating a sample document, which I use as a guide to build a real test document. Next, I wrap an SQL INSERT command around the XML document like this:
INSERT INTO 'AMAZONLISTINGSSCHEMATABLE VALUES(sys.XMLType.createXML(
'<?xml version="1.0" encoding="UTF-8"?>
<AMAZONLISTINGS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="'AmazonListingsSchemaTable.xsd">
<PRODUCTID>1004</PRODUCTID>
<RANKING>4</RANKING>
<TITLE>Ulysses</TITLE>
<ASIN>0679722769</ASIN>
<AUTHORID>1002</AUTHORID>
<IMAGE>http://images.amazon.com/images/P/0679722769.01._PE30_PIdp-
schmoo2,TopRight,7,-26_TCMZZZZZZZ_.jpg</IMAGE>
<SMALL_IMAGE>http://images.amazon.com/images/P/0679722769.01._PE30_PId
p-schmoo2,TopRight,7,-26_SCMZZZZZZZ_.jpg</SMALL_IMAGE>
<LIST_PRICE>11.90</LIST_PRICE>
<RELEASE_DATE/>
<BINDING>Paperback</BINDING>
<AVAILABILITY/>
<TAGGED_URL>http://www.amazon.com/exec/obidos/ASIN/0679722769/qid=1050
987099/sr=2-1/ref=sr_2_1/104-2508041-6900765</TAGGED_URL>
</AMAZONLISTINGS>'));The XML document is accepted into the database and stored as relational data. At the same time, the XML schema checks to see if the document is valid, and issues an error if it is not. For example, if the TITLE exceeds 200 characters, the INSERT command returns the ORA-22814: attribute or element value is larger than specified in type error message. You can extract XML using a regular XML expression, provided you return all of the columns in a row. For example, this query returns the first row of data as an XML document:
select * from AmazonListingsSchemaTable WHERE rownum = 1;
However, because the database is based on a schema and not a regular table structure, a regular query that returns less than a full XML document is not permitted. For example, a query to return the PRODUCTID column:
select PRODUCTID from AmazonListingsSchemaTable WHERE rownum = 1;
Returns an error message that says ORA-00904: "PRODUCTID": invalid identifier. Because the XML data is based on a schema, the entire table looks like an XMLType object to queries. XMLType objects can be queried using XPath expressions, which are part of the EXTRACT(), EXISTSNODE(), and UPDATEXML() functions.
Because the AmazonListingsSchemaTable table is an XMLType object, it can be updated using XPath expressions and the UPDATEXML() command. The following example updates the Availability column in the table by updating the Availability element in the XMLType representation of the column:
UPDATE AmazonListingsSchemaTable x SET value(x) = UPDATEXML(value(x), '/AMAZONLISTINGS/AVAILABILITY', 'Out of Stock') WHERE existsNode(value(x),' /AMAZONLISTINGS[PRODUCTID="1004"]') = 1;
XMLType views are similar to XMLType tables, but do not require the tables represented by the view to be changed to accommodate XMLType queries. You can use a table for regular data processing, and have the option of viewing the same data as XML via the XMLType view. XMLType views can be created using most of the functions that I have covered so far in this presentation paper, such as XMLELEMENT() and XMLFOREST(). To create an XMLType view without a schema, use the CREATE OR REPLACE VIEW <view name> OF XMLTYPE SQL command. You can also map an XMLType view to a W3C schema. Schemas can validate XML data going in and format XML data coming out of the view.
Another interesting feature of the Oracle XML DB is the ability to present XML Type data objects as folders and files in a hierarchy over standard protocols such as HTTP, WebDAV, and FTP. XMLType objects can be queried and updated via XMLType-compatible SQL commands. For more information on setting up the server, client and databases for theis feature, please refer to the Oracle9i XML Database Developer's Guide - Oracle XML DB chapter 13 (Oracle XML DB Foldering).
The SYS_XMLAGG() and SYS_XMLGEN() functions accommodate XML document format definitions using the XMLFormat object type. If you are using SYS_SMLAGG and have a W3C schema created for your XML data, you can create an XMLFormat object using the createFormat() function. Here’s an example of a query that formats output from the AMAZONLISTINGS table based on a W3C Schema named AmazonListingsSchemaTable.xsd:
SELECT SYS_XMLAGG(XMLFOREST(
PRODUCTID,
RANKING,
TITLE)) as "RESULT"
FROM AmazonListings
WHERE rownum = 1
XMLFORMAT.CREATEFORMAT('AZLIST',
'http://schemas.benztech.com/'AmazonListingsSchemaTable.xsd'));You can also use the createFormat() function to specify the root tag of a SYS_XMLGEN result. In this example, the XMLONLY table contains one XMLType column called XMLDOC. The results of this query returns the AZLIST parameter as the root element for the SYS_XMLGEN() XML document output.
SELECT SYS_XMLGEN(XMLDOC, XMLFORMAT.CREATEFORMAT('AZLIST'))
FROM XMLONLY WHERE rownum = 1;You can also create a hard-coded XMLFormat object using createFormat() and several related attributes. Hard-coding the XMLFormat object type is useful for generating an ObjectType that references a schema. The only exception to this would be to create XML document output from an XMLType data type that includes a processing instruction. In this case, the instruction can be added to the XML output using the processingIns attribute of the XMLFormat object.
Oracle9i maintains a very complete set of XML DB functions for PL/SQL, Java and C++ in the XDK. However, PL/SQL developers who don’t want to use the XDK are definitely not neglected. Many XDK XML document functions have comparable PL/SQL functionality. For example, the XMLType API includes the createXML() function for generating XML from a string and 20 other functions for manipulating XMLType data types.
There is also a PL/SQL DOM Parser (DBMS_XMLDOM), a fast validating parser for XMLType and CLOB data types (DBMS_XMLPARSER), and an XSLT Processor (DBMS_XSLPROCESSOR). Oracle also includes packages for registering W3C schemas (DBMS_XMLSCHEMA) and several views for reviewing schemas and their user assignments (Oracle XML DB XML Schema Catalog Views). Administrators and developers alike will appreciate the Resource API for PL/SQL (DBMS_XDB) for managing XML DB security, and the DBMS_XDB_VERSION API for managing version control. The RESOURCE_VIEW and PATH_VIEW views accommodate access to Oracle data via third party tools using JNDI, FTP, or WebDAV. The DBMS_XDBT API facilitates maintenance of ConText indexes for an XML DB instance.
All of these APIS and associated functions are well documented in appendix F (Oracle XML DB XMLType API, PL/SQL and Resource PL/SQL APIs: Quick Reference) of the Oracle9i XML Database Developer's Guide - Oracle XML DB. For this reason, I won’t drill down any deeper into these APIS. Instead I’ll show you several important tips and tricks for writing XML documents to the file system using another important PL/SQL package: DBMS_XMLGEN().
The Oracle XDK includes DOM and SAX Parsers with support for W3C Schemas, and a customized high-performance XSLT Processor. Java and c++ developers can use the XML Class Generator to generate classes from DTDs and Schemas. The generated classes can be used to send XML documents to Oracle databases. XML Java Beans provide a visual tool for exploring and transforming XML documents. Java developers can use the XML SQL Utility to create XML documents, from SQL queries. You can also create DTDs and Schemas for XML result sets. The XSQL Servlet can be used with the Oracle Java VM or another application server, including the Oracle AS (Application Server) to manipulate XML using SQL and XSLT. The XML Pipeline processor enables the combination of queries and other Java processes, and the TransX Utility facilitates XML document to Oracle data loading.
Although the XDK is available in Java, C, C++, and PL/SQL versions, the most complete feature set is in the Java XDK. The Oracle JDBC API is the standard way to access Oracle data from J2EE applications. JDBC supports reading and Writing of data from Java to external data sources. JDBC is based on the X/Open SQL call level interface (CLI) specification. More information on the X/Open SQL CLI can be found at http://www.opengroup.org.
Oracle9i also supports SQLJ. SQLJ supports embedded SQL queries in Java code, based on SQLJ syntax. A key component of SQLJ is a code generator that converts SQLJ statements in Java source code with calls to the Oracle JDBC driver. The generated Java code can call Oracle database objects and return results via JDBC. This saves some time in coding, but requires a SQLJ runtime engine to run on the Oracle server. More information about SQLJ can be found at http://www.sqlj.org.
The XSQL servlet processes SQL queries that are formatted in XML documents and returns results as XML. XSQL functionality can be combined with the Oracle XML Parser for Java, the XML- SQL Utility (XSU), and the Oracle XSL Transformation (XSLT) Engine to produce complex XML and HTML pages. This combination of tools is known as the XSQL Pages Publishing Framework. XSQL is compatible with most J2EE application servers. JSP pages can also include calls to the XSQL servlet via <jsp:forward> and <jsp:include> tags. For more details on XSQL including installation, setup and configuration instructions, please refer to the Oracle9i XML Developer's Kits Guide - XDK chapter 9 (XSQL Pages Publishing Framework). XSQL queries are stored in XSQL page template files, which are defined by the .xsql file extension. Here’s an example of a basic XSQL page template, called GetProduct.xsql:
<?xml version="1.0"?> <xsql:query connection="xsqlconnect" bind-params="PRODUCTID" xmlns:xsql="urn:oracle-xsql"> SELECT PRODUCTID, RANKING, TITLE FROM AmazonListings WHERE PRODUCTID = ? </xsql:query>
Once the GetProduct.xsql file is created and stored under your web server's virtual directory hierarchy, you can access the template via URL:
http://<J2EE server URL>/GetProduct.xsql?PRODUCTID=1001
The servlet registers the .xsql file and creates a servlet parameter from the PRODUCTID URL parameter. The data is automatically formatted as XML. Here’s an example of default XML output from the XSQL servlet:
<?xml version="1.0"?>
<ROWSET>
<ROW num=”1”>
<PRODUCTID>1001</PRODUCTID>
<RANKING>1</RANKING>
<TITLE>Hamlet/MacBeth</TITLE>
</ROW>
</ROWSET>Changing the default SQLX-generated element names to something other than ROWSET and ROW, or changing the XML output to another document format or HTML requires XSL transformation. To automatically transform XSQL servlet output using a stylesheet, add the following like to the page template file, just above the xsql:query element:
<?xml-stylesheet type="text/xsl"
href="<URI reference to the stylesheet>"?>The XSQL servlet uses the XML SQL Utility (XSU) to generate XML document output. XSU functionality can also be added to any other servlet on a J2EE application server. The XSU is an API that can run on a J2EE server. Other interfaces to XSU run on a command line and through PL/SQL. The XSU can generate XML output from SQL queries, generate DTDs and W3C schemas, and perform XSL transformations on XML document output. XSU query output can be returned as an XML document, a DOM node tree representation, or a series of SAX events. The code listing below shows a sample Java class that uses the XSU API OracleXMLQuery class to create a simple XML document from the AMAZONLISTINGS table:
import java.sql.*;
import oracle.jdbc.driver.*;
import oracle.xml.sql.query.*;
class GetAmazonListings {
public static void main(String[] argv)
{
try{
DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
Connection conn = new
oracle.jdbc.driver.OracleDriver().defaultConnection ();
OracleXMLQuery qry = new OracleXMLQuery(conn, "SELECT PRODUCTID,
RANKING, TITLE FROM AMAZONLISTINGS WHERE ROWNUM = 1");
String str = qry.getXMLString();
System.out.println(str);
qry.close();
} catch(SQLException e){
System.out.println(e.toString());
}
}
}The code starts by importing the standard Java SQL package, then the Oracle JDBC driver classes, and the XSU classes, including the OracleXMLQuery class:
import java.sql.*;
import oracle.jdbc.driver.*;
import oracle.xml.sql.query.*;
class GetAmazonListings {
public static void main(String[] argv)
{This JDBC connection illustrates the connection string required for code that will run in the Java server on the Oracle server. The JDBC driver on the Oracle server runs on a default session, so no name and password are required as part of the connection string. The defaultConnection() method of the oracle.jdbc.driver.OracleDriver class retrieves the default session information. The XSU OracleXMLQuery class uses JDBC to make the connection to the Oracle server instance, and then returns the data as XML when it receives the result set from the JDBC driver.
try{
DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
Connection conn = new
oracle.jdbc.driver.OracleDriver().defaultConnection ();
OracleXMLQuery qry = new OracleXMLQuery(conn, "SELECT PRODUCTID,
RANKING, TITLE FROM AMAZONLISTINGS WHERE ROWNUM = 1");Once the result set is retrieved and converted to XML by the OracleXMLQuery class, the resulting XML document can be retrieved via the getXMLString() method. Next, the OracleXMLQuery is closed to complete the class.
String str = qry.getXMLString();
System.out.println(str);
qry.close();