Abstract
XML processing issues have often been cited as a major reason for limited usage of XML in high-performance systems. This paper presents a case study of a large, performance-driven, XML-based web application, drawn from a production deployment at a financial services firm. It introduces the concept of a 4-tier architecture in which the fourth tier is referred to as the XML tier. In addition to the web servers, application servers and database servers, the XML tier, powered by hardware XML appliances, decouples XML processing from the application server and database infrastructure.
This use case demonstrates that using XML in web applications can actually result in a scalable and cost-efficient architecture that enabled the financial institution to meet its performance and business requirements as the application and the number of dependent users rapidly increased.
These advantages are made possible by decoupling XML processing through integrated hardware XML appliances designed to process XML efficiently. In this particular case, such deployment has resulted in a tenfold throughput increase with 90% cost savings over scaling the general-purpose server infrastructure. This resulted in over $5 million saved in less than six months.
XML-related standards have moved rapidly from standards bodies into live enterprise environments. Enterprise applications use XML to integrate applications, to present data in browsers and to exchange data with business partners using XML Web Services standards. Despite this wide industry acceptance, XML processing issues have often been cited as a major reason for limited usage of XML in high-performance systems. General-purpose servers do not perform these functions with acceptable performance levels at affordable costs.
The typical enterprise web application is made up of three layers with independent functions and relatively loose coupling. Most web applications need to access enterprise data stored in database systems; this is referred to as the data layer. The second layer, aptly called the middleware or business logic layer, retrieves the relevant data from the data layer and applies the relevant business logic to serve the purpose of the web or web service application. Several middleware solutions provide the ability to model the data in XML and provide a DOM (Document Object Model) interface to interact with the content. The middleware then uses one of several XML-based standards, such as XSLT, XML Schema, XML Digital Signatures and XML Encryption, based on specific business requirements for data presentation, validation, and security. The third layer is the protocol handling layer. In most cases, it consists of web servers that terminate the Hypertext Transfer Protocol (HTTP) and Secure Sockets Layer (SSL) protocols and store and serve static content. This paper introduces a 4th tier, called the XML tier, to this 3-tier architecture. It describes the advantages and challenges of adding this XML tier to the enterprise infrastructure through a specific use case in which the XML tier has been deployed successfully.
The next section defines the different XML performance issues and provides examples to show the limitations on applications due to extensive XML processing. Section 3 introduces a few use cases where the XML processing impact is severe. Section 4 introduces the concept of a 4-tier architecture, with the introduction of the XML tier to the classic 3-tier web architecture. Section 5 describes, through a use case, the implementation of this new approach using hardware XML appliances to accelerate XML processing, security, and routing at a fraction of the cost of a comparable server infrastructure. Section 6 describes the performance results in the particular use case of introducing the 4th tier. Section 7 details the lessons learnt in this deployment, the unexpected other benefits and limitations. Finally, section 8 lays out the conclusion.
To understand the scaling problems with such XML-based applications, we consider the processing overheads for three XML processing examples: XSLT transformation, XML Schema validation, and XML digital signature verification. The Saxon XSLT engine is used for XML transformation, Sun’s MSV validator for XML Schema validation, and the AddInt web service for XML signature verification.
The results here are typical of XML/SOAP-based enterprise applications. The calculation is based on a 1.8GHz Intel machine and makes some simplifying assumptions, such as a dedicated single processor, a 100Mbps link with 80% link utilization, and no additional XML processing requirements. The maximum network capacity is calculated by taking the available bandwidth in Mbits/sec (Mbps) at 80% link utilization and dividing it by the average of the input and output sizes (where relevant) in Mbits. The processor capacity is derived from performance tests run so as to exclude the impact of any startup or caching costs.
1. www.docbook.org
2. MISMO request version 2.1 from www.mismo.org
3. AddInt web service with signature verification using server toolkit
4. Transactions/second
5. Calculated as (100 * 0.8) / (((input size + output size) / 2) * 0.008), with sizes in KB
The unutilized transactions per second (TPS) figure is calculated by dividing the maximum processor capacity by the maximum network capacity. It is a measure of how inefficiently the architecture processes transactions relative to what the network can deliver. As a result, scaling such an application requires additional server hardware and software, along with the associated overhead costs of management and maintenance.
With more optimized implementations of transformation, validation and signature processing, higher clock rate processors, and the use of multiple processors, these numbers should look a lot better. However, along with increased computing power, enterprises are also adopting Gigabit network interfaces on their servers, making the comparison look poorer still. Further, the server infrastructure is typically saddled with significant business logic and data access processing, which leaves fewer computing cycles for XML processing requirements.
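As a rough illustration of the calculation described above, the following Python sketch computes the maximum network capacity from footnote 5 and the processor-to-network ratio. The function names are hypothetical, and the message sizes and processor figure used in the usage note are placeholders, not measurements from this study.

```python
def max_network_tps(input_kb: float, output_kb: float,
                    link_mbps: float = 100.0,
                    utilization: float = 0.8) -> float:
    """Transactions/second the link can carry at the given utilization."""
    # 1 KB = 8 Kbits = 0.008 Mbits
    avg_message_mbits = ((input_kb + output_kb) / 2) * 0.008
    return (link_mbps * utilization) / avg_message_mbits


def processor_to_network_ratio(processor_tps: float, network_tps: float) -> float:
    """Processor capacity divided by network capacity, as described in the text."""
    return processor_tps / network_tps
```

For a hypothetical transaction with 100 KB of input and 100 KB of output, `max_network_tps(100, 100)` comes to about 100 TPS. If the processor sustained only 5 TPS for that workload (again a placeholder value), the ratio would be about 0.05, meaning the server could fill only a small fraction of the link.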
Not all applications using XML or Web Services have a performance issue related to XML processing. However, in large Enterprises, there are several use cases where XML processing is a pressing problem, either due to the volume or the latency driven by a service level agreement. The following is a sampling of such use cases:
A large wireless service provider uses web services to integrate with a service provider that resells the wireless provider’s service, and takes in subscriber information to generate a service order. The document-style web service requests need to be validated, routed based on subscriber information in the SOAP Body, and responded to within the agreed-upon service level agreement.
A large financial institution with worldwide distributed sales offices takes in web requests from sales offices with 128 Kb/sec connectivity, generates complex 100KB+ XML in the application server environment, and applies complex XSLT transformations that produce HTML to be sent to the sales offices. The predominant current bottlenecks are the infrastructure required for XSLT processing as well as latency.
A large financial institution, servicing millions of users, takes in detailed, sensitive financial information from its web site and then digitally signs the responses using XML digital signatures before sending them back over secure connections. The memory and processing requirements for large volumes of such transactions, each of which requires creating a DOM tree and then executing an XML digital signature, are critical bottlenecks.
As is evident from the examples, the performance bottlenecks typically show up in high-volume or latency-sensitive applications in large enterprises. It is quite likely that many XML processing workloads do not have a significant enough impact on overall system performance to require a hardware tier in the network.
The typical 3-tier architecture of web applications consists of web servers, application servers and a database layer that is typically a relational database system. The architecture allows for decoupling of unassociated functionality. The database layer is typically I/O bound, and better performance is derived through query result caching, memory management and disk I/O optimization. The application server layer is sensitive to the complexity of the application logic, the number of concurrent sessions it has to manage, and availability concerns. When saddled with XML processing tasks such as data presentation using XSLT, the application server is also sensitive to CPU and memory availability as well as CPU caches. The web server layer is typically sensitive to the number of concurrent connections it has to support. This architecture allows for separation of concerns, with the exception of XML processing.
In such architectures, XML appliances are placed as the 4th tier, with the classic Layer 4-7 switches routing XML traffic to the XML appliances. Application servers, which are at the center of 3-tier architectures, are the main producers and consumers of XML data. They benefit the most from the acceleration XML appliances provide. This is a result of the application server CPU cycles and memory saved, as well as the hardware and software acceleration provided by the XML tier devices.
From a functionality perspective, in the context of 4-tier architectures, XML traffic can be looked upon as inbound or outbound. Inbound XML acceleration requires that the data be decrypted, signatures verified, validated, transformed, routed for quality of service and parsed into serialized objects. Outbound XML acceleration requires that the data be encrypted, signed, transformed and compressed.
The following use case describes the data flow.
This case study describes an environment where XML processing was the key bottleneck in the classic 3-tier web application architecture. This large financial institution had adopted an XML/XSLT-based presentation tier in its web applications, as opposed to a tightly coupled approach such as JSP (Java Server Pages), in order to have a complete separation of presentation and application logic. Such separation provided the ability to have multiple views that relied upon the same underlying application logic and data layer. In addition, as the application logic and data layers are self-contained, making major changes to the underlying application server results in limited changes to the presentation tier.
In early benchmark testing, analysis of the results pointed to significantly high CPU utilization (80% after all other optimizations) and heavy memory requirements on the application server tier due to XSL transformations. The data generated from the database tier ranged from 20KB to 300KB in size, and the actual XSL transforms ranged from 100KB to 200KB in size with varying complexity. The XSL files were a component of the application server environments. This added further production complexity, in that simple user interface changes required application server modifications.
The XML appliance from Sarvega was evaluated as the 4th tier in the environment. Specifically, the impact on application server CPU with increasing load, once XML appliances were introduced, was evaluated. The data flow is shown below:
The main processing steps in the data flow are:
L4-7 Load balancer routes relevant requests to the XML tier.
The XML appliance terminates HTTP 1.1 GET and POST requests from the client web browser.
The XML appliance classifies the input request to a specific internal XML processing pipeline based on the protocol headers.
Retrieve and pre-process the XSL template and any other XSLs referred to within the primary XSL file based on the XSL template name in the HTTP headers.
Route the result to the back-end web server or application server. The routing mechanisms include routing to an active web server farm and a standby web server farm, through a Virtual IP Address for each pool. In this situation, such appliances can execute an ICMP ping, a TCP/IP ping or an application ping to confirm whether the application server environment is available for service.
Modify the application server to return XML rather than HTML (most application servers refer to this as the client-rendering mode). After the application server responds, the XML appliance executes the XML transformation.
When an internal stylesheet error has been detected within the appliance while transforming the XML content delivered in the HTTP response from the web server, an error stylesheet could be invoked to formulate an appropriate response to the client web browser.
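The processing steps above can be sketched in Python as follows. This is a minimal illustration, not the appliance's implementation: the class name, the `X-Stylesheet` header, and the use of plain callables in place of compiled XSL templates and a real back-end connection are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Response:
    status: int
    body: str


class XmlTierSketch:
    """Hypothetical stand-in for the appliance's request pipeline."""

    def __init__(self,
                 stylesheets: Dict[str, Callable[[str], str]],
                 backend: Callable[[str], Response],
                 error_stylesheet: Callable[[str], str]):
        self.stylesheets = stylesheets            # pre-processed templates, by name
        self.backend = backend                    # application server returning XML
        self.error_stylesheet = error_stylesheet  # used when a transform fails

    def handle(self, path: str, headers: Dict[str, str]) -> Response:
        # Classify the request: pick the template named in the headers
        # (the header name here is an assumption).
        name = headers.get("X-Stylesheet", "")
        transform = self.stylesheets.get(name)
        # Route to the back end, which returns XML rather than HTML.
        upstream = self.backend(path)
        try:
            if transform is None:
                raise KeyError(name)
            # Transform the XML response before replying to the browser.
            return Response(upstream.status, transform(upstream.body))
        except Exception as exc:
            # Stylesheet error: formulate a response with the error stylesheet.
            return Response(500, self.error_stylesheet(str(exc)))
```

A caller would construct the sketch with a dictionary of transforms and a back-end function, then pass each request's path and headers to `handle`. Health checks and active/standby farm selection are left out to keep the example short.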
The benchmark results show that with the XML tier powered by Sarvega's hardware-accelerated XPE appliance, the CPU utilization of a 24-CPU server farm is brought down from 37% to 4% using a single hardware appliance. An important point to note for this case study is that a 24-CPU application server farm was providing only 5% of the required transactions per second. A single hardware appliance, with additional CPU cycles to spare, can now accomplish the same.
Figures 4 and 5 compare the response time experienced by the client with gradually increasing transactions per second. The results shown come from several benchmarking runs, performed in order to detect any variations in performance due to unknown problems in the respective environments. Figure 4 is based on results using an XML tier, while Figure 5 shows the behavior observed with an application server executing XSLT transformations. Because the system resource impacts of transformations and business logic have been separated, optimized and scaled independently, the XML tier architecture shows more predictable performance with a clear path to linear scalability.
From an operational perspective, this makes predicting the scalability costs of applications far simpler. Additionally, user interface modifications can now be made in the production deployment without bringing down application servers.
The XSLT implementations that most products use today have varying degrees of support for the XSLT 1.0 standard. For example, James Clark’s xt engine accepts a malformed attribute value template (AVT). When the open brace is doubled to indicate a literal brace, the trailing close brace should also be doubled. However, xt does not enforce this requirement.
Such issues are not specific to xt; standards adoption and interpretation differ ever so slightly across implementations. In most cases, while people recognize the problem, "becoming standards compliant" is typically not the primary concern of users in the real world. As an XML tier, the devices have to understand, interpret and process content with knowledge of these variations in the implementations, or else they will not be useful to real-world applications.
In mission-critical applications, the architects who develop these applications isolate the development, testing and production environments. As a result, the hardware and software requirements multiply. XML tier solutions should be able to support multiple environments through a single device in order to keep the cost impact to a minimum.
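To make the brace rule concrete, here is a small Python sketch of a strict checker for attribute value templates. It only checks brace pairing as XSLT 1.0 describes it (a doubled brace is a literal, and a lone close brace outside an expression is an error); it does not parse the XPath expressions themselves, and the function name is hypothetical.

```python
def avt_is_well_formed(value: str) -> bool:
    """Check brace pairing in an XSLT 1.0 attribute value template."""
    i, n = 0, len(value)
    while i < n:
        ch = value[i]
        if ch == "{":
            if i + 1 < n and value[i + 1] == "{":
                i += 2              # '{{' is a literal '{'
                continue
            # Start of an expression: scan to its closing '}',
            # ignoring braces inside quoted string literals.
            i += 1
            quote = None
            while i < n:
                c = value[i]
                if quote:
                    if c == quote:
                        quote = None
                elif c in "'\"":
                    quote = c
                elif c == "}":
                    break
                i += 1
            if i >= n:
                return False        # expression never closed
            i += 1                  # consume the closing '}'
        elif ch == "}":
            if i + 1 < n and value[i + 1] == "}":
                i += 2              # '}}' is a literal '}'
                continue
            return False            # lone '}' outside an expression
        else:
            i += 1
    return True
```

Under this check, `"{{literal}}"` is accepted while `"{{literal}"` is rejected, which is the kind of input a lenient engine might let through. A device that must interoperate with such engines would need a configurable lenient mode in addition to a strict check like this one.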
As in most technology implementations, technologists always find a way to do tasks more efficiently. Beyond accelerating XSLT and isolating the presentation tier from the application servers, in this use case XML appliances also provide the ability to compress the HTML content (typically in the 100KB range) before responding to the client. As the content is primarily text, the compression ratio is high. For narrowband clients sharing a 128 Kb/sec link, this provides a significant latency reduction.
Another key operational benefit in this use case came from using the XML appliance to retry the request to the web server in case of failure. For reasons beyond the scope of this paper, the application servers were occasionally not ready to take in new requests and returned a failure. On detecting the failed HTTP response, the XML appliance would retry the request, and as a result the number of errors returned to the end client was reduced.
An additional benefit came from the fact that the XML appliance acted as a proxy for all requests and inspected all the content, including protocol headers and the content body. This resulted in a wealth of log information becoming available at the XML appliance, providing per-request profiling of response times, something that generic performance profiling systems were not able to deliver.
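The latency effect of compression can be estimated with a short Python sketch. The paper does not name the compression scheme used by the appliance, so gzip is an assumption here, and the helper names are hypothetical. The transfer-time estimate ignores protocol overhead and the fact that the link is shared.

```python
import gzip


def transfer_seconds(size_bytes: int, link_kbps: float = 128.0) -> float:
    """Time to send size_bytes over a link of link_kbps kilobits/second."""
    return (size_bytes * 8) / (link_kbps * 1000)


def compressed_size(html: str) -> int:
    """Size in bytes of the gzip-compressed UTF-8 encoding of html."""
    return len(gzip.compress(html.encode("utf-8")))
```

An uncompressed 100,000-byte page takes `transfer_seconds(100_000)`, or 6.25 seconds, on a dedicated 128 Kb/sec link. Passing the value of `compressed_size(page)` to `transfer_seconds` gives the corresponding estimate for the compressed page, which for repetitive HTML markup is typically much smaller.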
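The retry behavior can be sketched in a few lines of Python. This is an illustration under assumptions: the paper does not state which failures triggered a retry or how many attempts were made, so treating any 5xx status as retryable and capping attempts at three are choices made for the example.

```python
from typing import Callable, Tuple


def forward_with_retry(send: Callable[[], Tuple[int, str]],
                       max_attempts: int = 3) -> Tuple[int, str]:
    """Call send() until it returns a non-5xx status or attempts run out."""
    status, body = send()
    attempts = 1
    while status >= 500 and attempts < max_attempts:
        status, body = send()   # resend the same request to the back end
        attempts += 1
    return status, body
```

Here `send` is a hypothetical callable that forwards the buffered request and returns a status code and body. Retrying in this way is only safe when repeating the request has no side effects on the application server, which is usually the case for page-rendering GET requests.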
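Per-request profiling at a proxy can be illustrated with the following Python sketch. It is not the appliance's logging mechanism; the wrapper name and the log record fields are assumptions chosen for the example.

```python
import time
from typing import Callable, Dict, List


def timed_proxy(handler: Callable[[str], str],
                log: List[Dict[str, object]]) -> Callable[[str], str]:
    """Wrap a request handler so every call appends a timing record to log."""
    def wrapper(path: str) -> str:
        start = time.perf_counter()
        try:
            return handler(path)
        finally:
            # Record the elapsed time even if the handler raised.
            log.append({"path": path,
                        "seconds": time.perf_counter() - start})
    return wrapper
```

Because every request passes through the wrapper, the log ends up with one record per request, which is what makes response-time profiling per request possible at a single point in the network.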
Now that XML appliances are deployed in this large institution, as they move to web services deployments, they can leverage the inbound XML processing capabilities of these devices. Document/Literal style web services can be validated for schema conformance, Web Services Security credentials can be checked, signature verification performed and content decrypted before application servers process the request. The XML appliances now can be used as Web Services gateways.
While there are several architectural and performance benefits of the XML tier in high-performance applications, introducing a new network device brings challenges of manageability and reliability. Given the risk of another potential point of failure in the network, only highly reliable and manageable solutions that integrate with existing enterprise infrastructure management mechanisms may be acceptable in enterprise applications. Additionally, XML tier devices introduce some additional latency for each request. For performance gains to show through, this latency overhead needs to be minimal compared to the overall request latency.
XML is critical to web applications, which have become an integral part of the corporate IT infrastructure. The flexibility of XML, however, comes at a high cost. There is a significant overhead associated with processing XML (in some instances, up to 80% of general server infrastructure use is taken up by XML), translating into very high IT investments in software, server hardware, and operations. There are architectural and operational advantages in separating XML processing onto a 4th processing tier, the XML tier. These advantages are made possible by decoupling XML processing through the use of integrated hardware XML appliances designed to process XML efficiently.
A special thanks for John's reviews and inputs to make this paper possible.
Sunil Gaitonde: A special thanks for Sunil's reviews on this paper.
Sarvega Team: Most of the information in this paper comes from various benchmarks and tests done within the Sarvega Engineering team. I am thankful for their support.