Abstract
The concept of atomic transactions has played a cornerstone role in creating today's enterprise application environments by providing guaranteed consistent outcomes in complex multi-party business operations. While numerous multi-party business applications involve various patterns based on atomic transactions, it was not until recently that the term "business transactions" acquired any concrete meaning. Rapid developments in Internet infrastructure and protocols have yielded a new type of application interoperation that turns concepts which could previously be considered only in the abstract into an implementation reality. The effects of such changes have been felt most strongly in business environments, fueling the mindset for a transition from traditional atomic transactions to extended transaction models better suited for Internet interoperation. The first real attempt at this was the OASIS Business Transaction Protocol in 2001; this was then followed by the Web Services Transactions specification from IBM, Microsoft and BEA in August 2002, and more recently by the Web Services Transaction Management specification from Sun, Oracle and others in August 2003. Although these specifications have some obvious commonality in their approaches, there are significant differences. Unfortunately for developers, the release of three competing specifications has effectively frozen the market. Why are these different specifications required? Have application requirements changed so much over the past two years? Are more specifications going to arise? In this paper we'll try to address some of these questions and give an indication of where we think things will go in the future.
The concept of atomic transactions has played a cornerstone role in creating today’s enterprise application environments by providing guaranteed consistent outcomes in complex multiparty business operations and a useful separation of concerns in applications. While numerous multiparty business applications involve various patterns based on atomic transactions in order to solve non-trivial business problems, it was not until recently that the term “business transactions” acquired any concrete meaning. Rapid developments in Internet infrastructure and protocols have yielded a new type of application interoperation that turns concepts which could previously be considered only in the abstract into an implementation reality. The effects of such changes have been felt most strongly in business environments, fueling the mindset for a transition from traditional atomic transactions to extended transaction models better suited for Internet interoperation.
Most business-to-business applications require transactional support in order to guarantee consistent outcomes and correct execution. These applications often involve long-running computations, loosely coupled systems, and components that do not share data, location, or administration, so it is difficult to incorporate traditional Atomic, Consistent, Isolated, Durable (ACID) transactions within such architectures. For example, an airline reservation system may reserve a seat on a flight for an individual for a specific period of time, but if the individual does not confirm the seat within that period it will be unreserved.
In 2001, a consortium of companies including Hewlett-Packard, Oracle and BEA began work on the OASIS Business Transaction Protocol (BTP), which was aimed at business-to-business transactions in loosely coupled domains such as Web Services [BTP]. The specification developed two new models for “transactions”, requiring business-level decisions to be incorporated within the transaction infrastructure. By April 2002 it had reached the point of a committee specification. Hewlett-Packard quickly followed this with the first commercial product based on this specification.
However, notable by their absence from BTP were Web Services heavyweights IBM and Microsoft, who in August 2002 released their own specifications: Web Services Coordination (WS-C) [WS-C] and Web Services Transactions (WS-T) [WS-T]. Although the WS-T specification also defines two “transaction” models, they are both very much grounded in existing transactional infrastructures. In particular, business applications would use WS-T in the same way applications use traditional transaction systems; there is no attempt to force this business logic into the transactional infrastructure.
Then in July 2003, Arjuna Technologies, Fujitsu, IONA Technologies, Oracle and Sun Microsystems released the Web Services Composite Application Framework (WS-CAF)[WS-CAF], which includes a separate specification, Web Services Transaction Management (WS-TXM)[WS-TXM]. WS-TXM defines three "transaction" models, and is explicitly aimed at leveraging existing transaction infrastructures and investments and providing interoperability between them. As with WS-T, there is no attempt to force business logic into the transactional infrastructure.
Unfortunately for developers who require transaction support in their Web Services, the release of three apparently competing specifications has effectively frozen the market. Is this a case of the “not invented here syndrome” (sometimes known as “my protocol is better than yours”) or is there some fundamental reason why three different protocols have arisen? Have requirements really changed so much in the last two years?
Atomic transactions are a well-known technique for guaranteeing consistency in the presence of failures [OTS]. The ACID properties of atomic transactions (Atomicity, Consistency, Isolation, Durability) ensure that even in complex business applications consistency of state is preserved, despite concurrent accesses and failures. This is an extremely useful fault-tolerance technique, especially when multiple, possibly remote, resources are involved.
The structuring mechanisms available within traditional atomic transaction systems are sequential and concurrent composition of transactions. These mechanisms are sufficient if an application function can be represented as a single atomic transaction. Transactions are most suitably viewed as “short-lived” entities executing in a closely-coupled environment, performing stable state changes to the system; they are less well suited for structuring “long-lived” application functions (e.g., running for hours, days, …). Long-lived atomic transactions (as typically occur in business-to-business interactions) may reduce the concurrency in the system to an unacceptable level by holding on to resources (e.g., locks) for a long time; further, if such an atomic transaction rolls back, much valuable work already performed could be undone.
It has long been realized that ACID transactions by themselves are not adequate for structuring long-lived applications. To ensure ACID-ity between multiple participants, a multi-phase (typically two) consensus mechanism is required, as illustrated in Figure 1: during the first (preparation) phase, an individual participant must make durable any state changes that occurred during the scope of the atomic transaction, such that these changes can either be rolled back (undone) or committed later once consensus to the transaction outcome has been determined amongst all participants, i.e., any original state must not be lost at this point as the atomic transaction could still roll back. Assuming no failures occurred during the first phase (in which case all participants will be forced to undo their changes), in the second (commitment) phase participants may “overwrite” the original state with the state made durable during the first phase.
In order to guarantee consensus, two-phase commit is necessarily a blocking protocol: after returning the phase 1 response, each participant which returned a commit response must remain blocked until it has received the coordinator’s phase 2 message telling it what to do. Until they receive this message, any resources used by the participant are unavailable for use by other atomic transactions, since to do so may result in non-ACID behavior. If the coordinator fails before delivery of the second phase message these resources remain blocked until it recovers. In addition, if a participant fails after phase 1, but before the coordinator can deliver its final commit decision, the atomic transaction cannot be completed until the participant recovers: all participants must see both phases of the commit protocol in order to guarantee ACID semantics. There is no implied time limit between a coordinator sending the first phase message of the commit protocol and it sending the second, commit phase message; there could be seconds or hours between them.
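The two phases described above can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the implementation of any product or specification: the Participant and Coordinator classes and all of their method names are hypothetical, and a real system would add durable logging, failure detection and recovery.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"


class Participant:
    """Hypothetical in-memory participant; a real one would log durably."""

    def __init__(self, name, state, can_commit=True):
        self.name = name
        self.state = state            # the original (committed) state
        self.pending = None           # change made within the transaction
        self.prepared = False         # True while blocked awaiting phase 2
        self.can_commit = can_commit  # simulates a phase 1 failure when False

    def do_work(self, new_state):
        self.pending = new_state

    def prepare(self):
        # Phase 1: keep both the original and the new state, because the
        # outcome of the transaction is not yet known.
        if not self.can_commit:
            return Vote.ROLLBACK
        self.prepared = True
        return Vote.COMMIT

    def commit(self):
        # Phase 2: overwrite the original state with the prepared state.
        if self.pending is not None:
            self.state = self.pending
        self.pending, self.prepared = None, False

    def rollback(self):
        # Discard the change and keep the original state.
        self.pending, self.prepared = None, False


class Coordinator:
    """Hypothetical coordinator that drives both phases to consensus."""

    def __init__(self):
        self.participants = []

    def enlist(self, participant):
        self.participants.append(participant)

    def complete(self):
        votes = [p.prepare() for p in self.participants]
        if all(v is Vote.COMMIT for v in votes):
            for p in self.participants:
                p.commit()
            return "committed"
        for p in self.participants:
            p.rollback()
        return "rolled back"
```

Between prepare() and the phase 2 call, a participant's prepared flag stays set; that interval is the blocking window discussed above, and nothing in the protocol bounds its length.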
Obviously having a blocking protocol can be a problem in general and not just in Web services. Imagine being unable to access your bank-account details because a failure of a transaction has resulted in participants being blocked. That's the reason that very early on in the development of transaction systems heuristic outcomes were introduced: participants that have been prepared are allowed to make autonomous decisions about whether to commit or roll back. For example, a participant that hasn't received the coordinator's final decision after a few hours may decide to roll back. This obviously breaks the blocking nature of two-phase commit, but introduces other problems. What if the participant's decision goes against the coordinator's? Ultimately heuristics can lead to non-atomic behavior that typically has to be resolved off-line by a system administrator. As a result, they should be used with extreme care.
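As an illustration of how a heuristic decision can end up contradicting the coordinator, consider the following sketch. The HeuristicParticipant class, its timeout policy and the "heuristic conflict" label are all assumptions made for this example; they are not taken from any of the specifications discussed here. Time is passed in explicitly so that the behavior is easy to follow.

```python
class HeuristicParticipant:
    """Hypothetical participant that may decide on its own after a timeout."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.prepared_at = None
        self.outcome = None  # "committed", "rolled back", or None (blocked)

    def prepare(self, now):
        # After phase 1 the participant is blocked until it learns the outcome.
        self.prepared_at = now

    def tick(self, now):
        # Autonomous (heuristic) decision: stop blocking once the timeout has
        # passed without a decision from the coordinator.
        waiting = self.outcome is None and self.prepared_at is not None
        if waiting and now - self.prepared_at >= self.timeout_seconds:
            self.outcome = "rolled back"

    def deliver_decision(self, decision):
        """Apply the coordinator's decision, or report a conflict."""
        if self.outcome is None:
            self.outcome = decision
            return "ok"
        if self.outcome == decision:
            return "ok"
        # The participant already decided differently: atomicity is broken
        # and somebody has to resolve the inconsistency.
        return "heuristic conflict"
```

A participant prepared at time 0 with a one-hour timeout that is ticked at time 7200 will have rolled back on its own; a later "committed" decision from the coordinator then yields "heuristic conflict", which is the non-atomic situation that an administrator typically has to resolve.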
Therefore, structuring certain activities from long-running atomic transactions can reduce the amount of concurrency within an application or (in the event of failures) require work to be performed again. For example, there are certain classes of application where it is known that resources acquired within an atomic transaction can be released “early”, rather than having to wait until the atomic transaction terminates; in the event of the atomic transaction rolling back, however, certain compensation activities may be necessary to restore the system to a consistent state. Such compensation activities (which may perform forward or backward recovery) will typically be application specific, may not be necessary at all, or may be more efficiently dealt with by the application.
So, if current transaction systems are not sufficient for the loosely-coupled Web services world, what systems (or specifications) are? The answer back in 2001 was that there was nothing really suitable. Because transactions are fundamental to any enterprise application, the development of a transaction specification for Web services was integral to the evolution of Web services.
In order to answer this question it is first necessary to give some (brief) background on what each specification provides and what has driven its development.
To ensure atomicity between multiple participants, BTP uses a two-phase completion protocol: during the first phase (prepare), an individual participant must make durable any state changes that occurred within the scope of the transaction, such that those changes can either be undone (cancelled) or made durable (confirmed) later once consensus has been achieved. Although BTP uses a two-phase protocol, it does not imply ACID semantics. How implementations of the prepare, confirm and cancel phases are provided is a back-end implementation decision. Issues to do with consistency and isolation of data are also back-end choices and not imposed or assumed by BTP.
Because the traditional two-phase algorithm does not impose any restrictions on the time between executing the first and second phases, BTP took the approach of using this to allow business-logic decisions to be inserted between the phases. What this means is that users have to drive the two phases explicitly in what BTP terms an open-top completion protocol. The application has complete control over when transactions prepare, and can use whatever business logic is required to determine later which transactions to confirm or cancel. Prepare becomes part of the service business logic, for example.
BTP introduced two types of extended transactions, both using the open-top completion protocol:
Atom: The outcome of an atom is guaranteed to be atomic;
Cohesion: this type of transaction was introduced in order to relax atomicity. The two-phase protocol for a cohesion is parameterized to allow a user to specify precisely which participants to prepare, cancel or confirm.
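The open-top style and the difference between the two transaction types can be sketched as follows. This is only an illustration of the idea: the BtpParticipant and Cohesion classes are hypothetical and do not reproduce the BTP message set. In this sketch an atom corresponds to the special case where every enrolled participant is in the confirm set.

```python
class BtpParticipant:
    """Hypothetical participant offering prepare, confirm and cancel."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.status = "enrolled"

    def prepare(self):
        self.status = "prepared" if self.can_prepare else "cancelled"
        return self.can_prepare

    def confirm(self):
        self.status = "confirmed"

    def cancel(self):
        self.status = "cancelled"


class Cohesion:
    """Hypothetical open-top cohesion: the application drives both phases."""

    def __init__(self):
        self.participants = {}

    def enrol(self, participant):
        self.participants[participant.name] = participant

    def prepare(self, names):
        # Phase 1 is invoked explicitly by the application, per participant.
        return {n: self.participants[n].prepare() for n in names}

    def confirm(self, confirm_set):
        # Business logic has chosen the confirm set; everything else cancels.
        confirm_set = set(confirm_set)
        for name, p in self.participants.items():
            if name not in confirm_set:
                p.cancel()
        chosen = [self.participants[n] for n in confirm_set]
        # The chosen participants are confirmed together or not at all.
        if all(p.status == "prepared" or p.prepare() for p in chosen):
            for p in chosen:
                p.confirm()
            return True
        for p in chosen:
            p.cancel()
        return False
```

For example, an application might prepare two candidate flights and a hotel, compare the quotes returned by its own business logic, and then call confirm({"flight_a", "hotel"}), which cancels the other flight. Note that the application has to name individual participants to do this.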
The fundamental idea underpinning WS-C is that there is a generic need for a coordination infrastructure in a Web services environment. The WS-C specification defines a framework that allows different coordination protocols to be plugged-in to coordinate work between clients, services and participants.
Whatever coordination protocol is used, and in whatever domain it is deployed, the same generic requirements are present:
Instantiation (or activation) of a new coordinator for the specific coordination protocol, for a particular application instance;
Registration of participants with the coordinator;
Propagation of context;
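These three requirements can be illustrated with a small sketch. The class names, field names and the example address below are assumptions chosen for illustration; they are not the message formats or endpoints defined by WS-C.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CoordinationContext:
    """Hypothetical context that travels with application messages."""
    activity_id: str
    protocol: str
    registration_address: str


@dataclass
class Coordinator:
    context: CoordinationContext
    participants: list = field(default_factory=list)


class CoordinationService:
    """Hypothetical service offering activation and registration."""

    def __init__(self, base_address):
        self.base_address = base_address
        self.coordinators = {}

    def activate(self, protocol):
        # Activation: create a coordinator for one application instance and
        # one specific coordination protocol.
        activity_id = uuid.uuid4().hex
        context = CoordinationContext(
            activity_id, protocol, f"{self.base_address}/{activity_id}")
        self.coordinators[activity_id] = Coordinator(context)
        return context

    def register(self, context, participant):
        # Registration: a participant joins the activity named by the context.
        self.coordinators[context.activity_id].participants.append(participant)


def call_service(service_name, payload, context, coordination):
    # Propagation: the context accompanies the application message, and the
    # receiving service uses it to find the coordinator and register.
    message = {"header": {"context": context}, "body": payload}
    coordination.register(message["header"]["context"], service_name)
    return message
```

The coordinator itself knows nothing about transactions here; the protocol field is just a label, which is what allows different coordination protocols to be plugged in.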
The WS-T specification plugs into WS-C and proposes two distinct models, where each supports the semantics of a particular kind of B2B interaction:
Atomic Transaction: is similar to traditional ACID transactions and intended to support short-duration interactions where ACID semantics are appropriate.
Business Activity: is designed specifically for long-duration interactions, where exclusively locking resources is impossible or impractical. In this model services are requested to do work and may provide compensators to be executed if the business-to-business interaction aborts. How services do their work and provide compensation mechanisms is not the domain of the WS-T specification, but an implementation decision for the service provider.
WS-CAF has more in common with WS-C/T than BTP and comprises three specifications:
Web Service Context (WS-Context), which models a Web services context data structure as a Web resource, accessible via standard URLs. WS-Context is responsible for context management and defines a way for arbitrary services to augment the context.
Web Services Coordination Framework (WS-CF) which defines a software agent called a coordinator that takes responsibility for augmenting the basic context and disseminating context information. Web services in a composite application register with coordinators to ensure message results are communicated between the coordinator and services in a reliable manner. Although the concepts defined within WS-CF overlap with those in WS-C, WS-CF goes beyond WS-C and defines an optional interaction pattern for developers to use when building coordination engines.
WS-TXM, which defines three distinct transaction protocols that can be plugged into the coordination framework for interoperability across existing transaction managers, long running compensations, and business process automation (aimed at allowing islands of different transaction domains to participate in the same global distributed transaction). This is a “live” document, in that the authors intend for other transaction models to be added to it when the need arises.
The WS-TXM specification defines the following models:
ACID transaction: a traditional ACID transaction (AT) designed for interoperability across existing transaction infrastructures.
Long running action: an activity, or group of activities, which does not necessarily possess the guaranteed ACID properties. A long running action (LRA) still has the “all or nothing” atomic effect, i.e., failure should not result in partial work. Participants within an LRA may use forward (compensation) or backward error recovery to ensure atomicity. Isolation is also considered a back-end implementation responsibility.
Business process transaction: an activity, or group of activities, that is responsible for performing some application specific work. A business process (BP) may be structured as a collection of atomic transactions or long running actions depending upon the application requirements.
As you can see, WS-CAF has some similarities with WS-C and WS-T. This shouldn't come as a surprise if you realize that the authors of both sets of specifications have collaborated on extended transaction protocols before in the Object Management Group (OMG)[XOTS]. Although there are technical differences between the specifications, some of the reasons for the appearance of WS-C/T and WS-CAF are without a doubt political. Obviously this doesn't help end-users, but it does give some insight into the current "standardization process" for Web services!
Although at first glance it may seem like there is commonality between the specifications (all of them support a two-phase completion protocol, for example), there are significant differences from both a protocol and implementation perspective.
As you might expect from a specification that took over a year to develop, on the plus side the BTP specification is well formed and complete. Unfortunately, although the protocol is not complex to understand, the specification is nearly 200 pages! It is thus not an easy sell for customers or analysts (and sometimes implementers).
What does it mean to be a user of a Web services transaction? Initially it may seem like a good idea to let business logic directly affect the flow of a transaction from within the “commit” protocol, but in practice it doesn’t really work: it blurs the distinction between what you would expect from a transaction protocol (guarantees of consistency, isolation, etc.), which are essentially non-functional aspects of a business “transaction”, and the functional aspects (reserve my flight, book me a taxi, etc.). In BTP, because business logic is encoded within the transaction protocol, a user has to be closely tied to (or perhaps even be) a coordinator! Business information, such as the ability of a participant to remain “prepared” (e.g., hold onto a hotel room) for a specific period of time, is propagated from the participant to the coordinator, but there is nothing within the protocol to allow this information to filter up to the application/client where it really belongs!
In fact, in order to use cohesions it is also necessary for Web services to expose back end implementation choices about participants: in order to parameterize the two-phase completion protocol, the terminator of the cohesion obviously needs to be able to say “prepare A and B and cancel C and D”, where A, B, C and D are participants that have been enrolled in the cohesion by services (such as a flight reservation system). In a traditional transaction system users don’t see the participants (imagine if you had to explicitly tell all of your database resource managers to prepare and commit?) Naturally this is something that programmers don’t feel comfortable with and it goes against the Web services orthodoxy. Because BTP requires transaction control to use the “open top” approach, it is difficult to leverage existing enterprise transaction implementations.
Furthermore, the BTP specification expends great effort to ensure that two-phase completion does not imply ACID semantics. This is good in so far as traditional ACID transactions are not suitable for all types of Web services interactions. However, everything is left up to back-end implementation choices and there is nothing in the protocol (implicit or explicit) to allow a user to determine what choices have been made. Therefore, it is impossible to reason about the ultimate correctness of a distributed application. For example, if you wanted to use BTP for ACID transactions, then of course services could use traditional database resource managers (for example) wrapped by BTP participants. Unfortunately, there is no way within BTP for those services to inform external users that this is what they have done so that they can safely be used within the scope of a BTP “ACID” transaction.
Both the WS-C and WS-T specifications are smaller than BTP, at about 45 pages in total. It is apparent from the specifications that simplicity and interoperability with existing transaction infrastructures played a key role in their development. Unfortunately they are also incomplete and have several protocol errors. For example, although heuristic outcomes are inevitable in distributed transactions, no support is provided in WS-T. Likewise, distributed recovery is paid very little attention. However, these are all issues that subsequent revisions can obviously address.
On the plus side, the separation of coordination from transactions is good: coordination is a more fundamental requirement and a separate framework offers the chance for a cleaner separation of concerns [HP submission to BTP]. Because WS-C does not imply transactionality or a specific protocol implementation, it can be used in more places than other protocols that make use of coordination but are tied to transactions (such as BTP).
The fact that WS-T Atomic Transactions are meant specifically for closely-coupled interactions with ACID semantics makes integration with back-end infrastructures easier. Web Services are for interoperability as much as for the Internet. As such, interoperability of existing transaction processing systems will be an important part of Web Services transactions: such systems already form the backbone of enterprise level applications and will continue to do so for the Web services equivalent. Business-to-business activities will involve back-end transaction processing systems either directly or indirectly and being able to tie together these environments will be the key to the successful take-up of Web Services transactions. It also takes away any ambiguity from users and services: they know a priori what semantics to expect.
In the realm of “extended transactions”, the WS-T Business Activity also plays very well. It gives service developers complete freedom to define compensation mechanisms that best suit their services (for example, using Atomic Transactions where necessary), whilst at the same time providing a simple model for the users of these services. In addition, it ties in well with Web services choreography techniques.
At over 100 pages, WS-TXM is considerably longer than WS-T. It is obvious, however, that more time and attention has been paid by the authors to implementation and interoperability details. For example, heuristics are supported in WS-TXM, as is distributed recovery.
Ignoring the WS-Context and WS-CF specifications, the transaction protocols defined by WS-TXM sit strongly with existing infrastructure investments. Atomic Transaction can be used as a bridge between proprietary transaction service implementations such as MSDTC [Microsoft DTC] and implementations based on standards such as [OTS], something which until recently has been extremely difficult to accomplish. As with WS-T, this model sits clearly in a single domain, that of ACID transactions - it is unsuitable for long duration interactions because the two-phase protocol is blocking.
The two extended transaction models provided by WS-TXM cover different aspects of business-to-business interactions. The LRA model allows the creation of (nested) scopes, where each scope performs work in whatever manner is appropriate to the business logic. Compensators for that work are registered with the scope so that if it were to cancel, the work can be undone. There are some similarities with the Business Activity model in WS-T; however, the LRA model allows arbitrary nesting of scopes and hence compensations.
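Nested scopes with registered compensators can be sketched as below. The Scope class is hypothetical, and one behavior in particular is an assumption of this sketch rather than something stated above: when a nested scope closes successfully, its compensators are handed to the enclosing scope so that the enclosing scope can still undo that work if it later cancels.

```python
class Scope:
    """Hypothetical LRA-style scope holding compensators for its work."""

    def __init__(self, parent=None):
        self.parent = parent
        self.compensators = []

    def nest(self):
        # Scopes may be nested to arbitrary depth.
        return Scope(parent=self)

    def register(self, compensator):
        # A compensator is any callable that undoes a completed piece of work.
        self.compensators.append(compensator)

    def close(self):
        # Successful completion: the work stands. Assumption: a nested scope
        # passes its compensators up so the parent can still undo the work.
        if self.parent is not None:
            self.parent.compensators.extend(self.compensators)
        self.compensators = []

    def cancel(self):
        # Failure: run the compensators, most recently registered first.
        while self.compensators:
            self.compensators.pop()()
```

With this structure, cancelling a nested scope undoes only the work done inside it, while cancelling an outer scope also undoes the work of nested scopes that had already closed.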
In the BP model all parties involved in a business process reside within business domains, which may themselves use business processes to perform work. Business process transactions are responsible for managing interactions between these domains. A business process (business-to-business interaction) is split into business tasks and each task executes within a specific business domain. A business domain may itself be subdivided into other business domains (business processes) in a recursive manner.
Each domain may represent a different transaction model if such a federation of models is more appropriate to the activity. Each business task (which may be modeled as a scope) may provide implementation specific counter-effects in the event that the enclosing scope must cancel. In addition, periodically the controlling application may request that all of the business domains checkpoint their state such that they can either be consistently rolled back to that checkpoint by the application, or restarted from the checkpoint in the event of a failure.
An individual task may require multiple services to work. Each task is assumed to be a compensatable unit of work. However, as with the LRA model described earlier, how compensation is provided is an implementation choice for the task.
For example, consider the purchase of a home entertainment system, shown in Figure 2. The on-line shop interacts with its specific suppliers, each of which resides in its own business domain. The work necessary to obtain each component is modeled as a separate task, or Web service. In this example, the HiFi task is actually composed of two sub-tasks.
In this example, the user may interact synchronously with the shop to build up the entertainment system. Alternatively, the user may submit an order (possibly with a list of alternate requirements) to the shop which will eventually call back when it has been filled; likewise, the shop then submits orders to each supplier, requiring them to call back when each component is available (or is known to be unavailable).
Two years ago the world of Web services and transactions looked like a new frontier, requiring new techniques to address the problems that it presented. BTP was seen as the solution to those problems. Unfortunately, with the benefit of hindsight it did not address what users really want: the ability to use existing enterprise infrastructures and applications and for “Web services transactions” to operate as the glue between different corporate domains. And it had better be simple to use and understand as well!
The BTP model is similar to WS-T/WS-TXM in several respects, but crucially it does not address the issues of transaction interoperability: most enterprise transaction systems do not expose their coordinators through the two-phase protocol. In addition, BTP has many subtle (and some not-so-subtle) impacts on implementations, both at the transaction level but more importantly at the user/service level.
Without exception, all of the major Web services players who originally participated in the OASIS BTP effort have moved on to either WS-T or WS-CAF. So, although at the time of writing BTP is the only specification at a recognized standards body (though not an adopted specification), it is unlikely to play a major role in the future.
The real issue is between WS-T and WS-TXM. There are technical differences between the various models that both specifications define, but there is nothing that would prevent them from being merged into a single specification. Interestingly the WS-CAF specifications have been designed to allow WS-C or WS-T to plug into them and replace/augment the WS-CF or WS-TXM specifications. As we indicated earlier the main reason for two different specifications appears to be political. However, there is another important difference: the WS-C/T specifications are not available royalty free or in a standards process, whereas the WS-CAF specifications are royalty free and currently progressing through a standards body.
While much of this may be hidden from end users (except for possible added cost), there is a large community of businesses and users that actually use these specifications and standards directly to build their products and services. Particularly with Web services, which are envisioned as enabling a large and rapidly growing marketplace/ecosystem of services, it is critical to prevent the foundational standards from being threatened by undisclosed IP.
Even end-user customers and businesses who don’t build the services themselves are concerned with this when they consider the long-term impact of having a key foundation for their business controlled by a couple of big players. Open and royalty-free standards provide predictability, interoperability and reduced costs. How many of today’s successful e-businesses would be content to discover that a company who once had engineers contribute, for example, to the TCP/IP or HTTP specification efforts, now claims intellectual property rights over their contributions, and demands royalties for use of those standards?
Hopefully the authors of these two specifications will be able to co-operate to consolidate them into a single specification and ultimately standard. However, there are strong polarization forces at work here:
Microsoft is not known for actively participating in standards bodies. Their normal modus operandi is to create a product, factor out of that product a "specification" (possibly working with others in a closed forum) and then expect a standards body to rubber-stamp that specification into a standard.
Sun Microsystems has been pushing open-standards efforts more and more over the last few years. It often seems that the fact that something is in a standards process overrides the technology.
So where does this leave the humble end-user? At the moment there are really only two transaction specifications to watch out for: WS-T and WS-TXM. As we've seen, OASIS BTP was a valiant effort, but suffered from naivety and lack of direction. Whether WS-T or WS-TXM (via WS-CAF) becomes the predominant standard is unknown at this time. Unless Microsoft and IBM are willing to work through an open-standards process and compete in the marketplace on implementation rather than specification, it is unlikely that there will ever be a single specification for Web services transactions.
Looking back over the brief history of Web services transactions, does this mean that the answer to Web services transactions is what we have had for the past 20+ years but using XML and SOAP? Yes! In the real world, it is unreasonable (and naive) to assume that people can (or will want to) throw away their corporate investments in infrastructure, training etc. The investment in transaction processing systems over the past few decades has cost billions of dollars, and any scheme to leverage that investment rather than replace it is the way forward.
Much has been made of the fact that ACID transactions aren’t suitable for loosely coupled environments like the Web. However, very little attention has been paid to the fact that these loosely coupled environments tend to have large strongly coupled corporate infrastructures behind them! When BTP started, the question should not have been “what can replace ACID transactions?”, but rather “how can we leverage what already exists?”
[BTP] BTP Committee specification, April 2002 http://www.oasis-open.org/committees/business-transactions
[WS-C] Web Services Coordination specification, August 2002 http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnglobspec/html/ws-coordination.asp
[WS-T] Web Services Transactions specification, August 2002 http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnglobspec/html/ws-transaction.asp
[WS-CAF] Web Services Composite Application Framework announcement, August 2003 http://www.arjuna.com/standards/ws-caf/index.html
[WS-TXM] Web Services Transaction Management specification, August 2003 http://www.arjuna.com/library/specs/ws_caf_1-0/WS-TXM.pdf
[OTS] The Object Transaction Service specification, 2001 from the OMG http://www.omg.org/cgi-bin/apps/do_doc?formal/02-08-07.pdf
[XOTS] OMG Additional Structuring Mechanisms for the OTS specification http://www.omg.org/cgi-bin/apps/do_doc?formal/02-09-03.pdf
[HP submission to BTP] A Framework for Implementing Business Transactions on the Web, Hewlett-Packard initial submission to BTP, March 2001 http://www.oasis-open.org/committees/business-transactions/
[Microsoft DTC] Microsoft Distributed Transaction Coordinator http://msdn.microsoft.com/library/default.asp?url=/library/en-us/mts/transactions_74kj.asp
ACID: Atomic, Consistent, Isolated, Durable
BP: business process
BTP: Business Transaction Protocol
LRA: long running action
OMG: Object Management Group
WS-C: Web Services Coordination
WS-CAF: Web Services Composite Application Framework
WS-CF: Web Services Coordination Framework
WS-Context: Web Service Context
WS-T: Web Services Transactions
WS-TXM: Web Services Transaction Management