XML Europe 2002

Ruple: an XML Space Implementation


Table of Contents

1. Beyond Web Services
2. The Ruple Forum
3. Foundations of the Forum
3.1. Internet Protocols
3.2. XML
3.3. Tuple Spaces
3.4. XML Tuple Spaces
3.5. Leases
3.6. Document Targeting
3.7. MIME Attachments
4. Design Principles
5. The Forum Application Interface
5.1. Write Example
5.2. Read Example
6. Forum Clients
7. Ruple Applications
7.1. B2B Applications
7.2. Mobile Applications
7.3. Remote Telemetry
7.4. XML Document Routing
7.5. Secure Document Exchange
7.6. Intra/Inter Workflow and Business Process Automation
7.7. Transformation Buffer
7.8. Web Services Intermediary
8. Conclusion
Biography

1. Beyond Web Services

Rogue Wave Software recently introduced a new communication infrastructure that builds upon Web Services to provide for secure, loosely coupled, many-to-many, and document-centric communications over the Internet. It is called Ruple.

Ruple is an Internet shared memory space. It is based on concepts originally presented in Linda, a tuple space language developed at Yale University. Other implementations of tuple spaces include Sun's JavaSpaces and IBM's TSpaces. Rogue Wave's Ruple is unique in that it stores XML documents rather than Java objects and is accessible over the Internet using standard Internet protocols such as HTTP and SOAP. It also offers a security model based on X.509 digital certificates. Applications can place documents in a Ruple Space and then retrieve them using an XML query expression.

A space can be located anywhere on the Internet, for example at "space.roguewave.com" or "spacely.sprockets.com", so spaces are easy to find using standard Internet protocols such as DNS. A space is also accessible as a Web Service.

Spaces offer a flexible alternative for applications to communicate with each other, in contrast to the present connection-oriented technologies like CORBA or DCOM. These RPC-based approaches require that both applications be online and connected to be used; such a connection is analogous to a telephone call. Spaces do not have this constraint and are more analogous to an e-mail. You can send a message to one or more recipients asynchronously without them actually being online. An application might put a document in a space to be retrieved at a later time by one or more requesting applications, including perhaps an occasionally connected device, such as a PDA or cell phone.

A key feature of Ruple is a very simple programming model of only four methods and an equally simple state model: all entries are self-contained. Together these two properties make it simple to replicate a Ruple Space, which makes it straightforward to offer redundancy and high availability. While the Internet is a remarkably robust infrastructure, it makes very few guarantees about security and delivery. Ruple answers the question of how to build flexible, reliable systems on an infrastructure that makes so few promises.

Because it supports asynchronous, loosely coupled, multiway applications, Ruple revitalizes distributed computing in several ways: by supporting intermittently connected devices, by making it possible to implement complex collaborative applications in a simple manner, and by providing extreme loose coupling. The many possible uses for Ruple include workflow applications, simplified firewalls to support SOAP, transformation engines, portals, B2B applications, document routing, and many others.

2. The Ruple Forum

The basis of Ruple is an XML Tuple Space implementation called the Forum. The Forum provides a foundation for applications that collaborate transparently and independently by selectively and securely exchanging a set of business documents. Using the Forum an application may operate on existing documents or wait for documents that meet its requirements to appear in the Forum. The Forum presents a simple abstraction and a set of guarantees to each application that uses it. The resulting applications can be de-coupled in several dimensions: whom they collaborate with, where their collaborators are, and when the collaboration takes place.

For the Forum to support this loose coupling of applications, the documents that applications operate on have to be self-describing, subject to associative query, and viewable in different environments. Documents encoded in the Extensible Markup Language (XML) meet these requirements. At the same time, the protocol used internally to communicate with the Forum must be loosely coupled and compatible with existing Internet protocols and services. Ruple uses the Simple Object Access Protocol (SOAP) to meet these requirements. SOAP requests and responses are encoded as XML and delivered over HTTP.

Applications read and write XML documents to the Forum through language APIs, browser scripting APIs, or directly via SOAP. Applications communicate directly with the Forum and indirectly, through the medium of the Forum, with other participants.

Documents written to the Forum may be targeted towards one or many recipients. The recipient of a document does not have to be connected to the Forum when the document is written, because the Forum provides a short-term persistent message store.

All access to the Forum is via HTTP. A security manager strictly controls which users can access the Forum and which documents each user can search through and retrieve. A user who presents a trusted X.509 digital certificate when accessing the Forum over a secure HTTP connection is able to access any documents that they have been authorized to see. Additionally, authorized users can add new documents of any type to the Forum at any time. Once in the Forum, documents can be located and retrieved by authorized users with an associative query. A document resides in the Forum until (a) it is removed by an authorized user, or (b) a negotiated expiration date is reached, at which time the Forum discards it and it is no longer accessible.

3. Foundations of the Forum

The Forum combines existing and widely understood technologies and infrastructure in a unique manner, creating a powerful yet elegant solution to the problem of open application communication on the globally deployed Internet.

3.1. Internet Protocols

The Forum uses existing Internet technology and infrastructure to create a globally accessible, highly secure, robust, scalable and persistent shared memory containing XML documents.

Forums are found using DNS. This provides a level of indirection that insulates the caller from the exact location of the space. A DNS name could be something like "exchange.spacelysprockets.com" or "mobilespace.myorganization.com".

The interface to the Forum is SOAP over HTTP. These widely understood and deployed protocols permit support from a great variety of programming languages and platforms.

The Forum could be deployed on the Internet, an extranet, or an intranet. A natural place to deploy a Forum is in the "demilitarized zone" (DMZ) of an organization, between two firewalls. In this configuration the Forum provides a document exchange gateway between the organization and the outside world.

3.2. XML

XML (eXtensible Markup Language) is the ideal encoding mechanism for sharing information and functionality between disparate systems. The syntax can be easily read, understood and shared by developers, and a growing number of editing and translation tools make XML easy to use. Because XML is essentially text, many existing software technologies handle it with ease. Businesses can exchange XML-encoded data internally and externally over the Internet to remote enterprise systems and even to wireless devices, as long as they've established compatible XML vocabularies.

3.3. Tuple Spaces

Tuple spaces were created at Yale University in the mid-1980s as part of the Linda coordination language, where they were used to coordinate interactions within parallel applications. More recent implementations of tuple spaces include Sun's JavaSpaces and IBM's TSpaces, both Java-centric tuple space implementations.

A tuple space is a shared dataspace through which independently active processes communicate. This space can be thought of as a shared message store between cooperating processes. Messages (tuples) are written to the space by the sender and discovered by receivers using an associative lookup.

Tuple spaces provide for loosely coupled communication, decoupling senders and receivers in space and time. They are decoupled in space, since senders and receivers can exist anywhere in the distributed system and do not have to know each other's location. They are decoupled in time, since a receiver does not have to exist when the message is sent; messages are stored until the recipients are ready to receive them.

In contrast, tightly coupled communications rely on a rendezvous between a sender and a receiver. The sender must know the receiver's address and both the sender and the receiver must be available and ready to communicate simultaneously. Thus, where traditional systems require synchronous point-to-point links, tuple spaces offer asynchronous anonymity.

Tuple spaces provide a sort of multicast medium through which multiple receivers can read a tuple written by a single sender. Point-to-point communications, on the other hand, require the sender to initiate a connection with each recipient. This multicast-like quality of tuple spaces allows for ad hoc and scalable collaboration where new participants can join a collaboration without the sender having to "discover" or connect to the participant.

The interface to tuple spaces is very simple. It requires only a handful of operations to write, read, and take (read and remove) tuples. Applications communicate with each other through the space, so senders and receivers don't need to know each other's protocols or interfaces, only those of the space.

Some tuple space implementations, including Ruple, provide a way for tuples to expire, that is, to be deleted once their designated lifetime has elapsed. These tuples have a "shelf life", determined by a lease associated with the space entry.
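The write, read, and take operations described above can be illustrated with a minimal in-memory sketch. It is not the Ruple implementation: the class name MiniSpace is hypothetical, tuples are plain Java values, and a Predicate stands in for the associative lookup.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical in-memory tuple space, for illustration only. */
final class MiniSpace<T> {
    private final List<T> tuples = new ArrayList<>();

    /** Stores a tuple; the writer names no receiver. */
    synchronized void write(T tuple) {
        tuples.add(tuple);
    }

    /** Returns the first matching tuple and leaves it in the space. */
    synchronized Optional<T> read(Predicate<T> template) {
        return tuples.stream().filter(template).findFirst();
    }

    /** Returns the first matching tuple and removes it from the space. */
    synchronized Optional<T> take(Predicate<T> template) {
        Iterator<T> it = tuples.iterator();
        while (it.hasNext()) {
            T candidate = it.next();
            if (template.test(candidate)) {
                it.remove();
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        MiniSpace<String> space = new MiniSpace<>();
        space.write("color=purple"); // the sender is done before any reader asks

        Predicate<String> anyColor = t -> t.startsWith("color=");
        System.out.println(space.read(anyColor).isPresent()); // true: read leaves the tuple
        System.out.println(space.take(anyColor).isPresent()); // true: take removes it
        System.out.println(space.read(anyColor).isPresent()); // false: nothing left
    }
}
```

The sender and the receivers share only the space and the shape of the data, which is the decoupling in space and time that the section describes.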

3.4. XML Tuple Spaces

The Forum is an XML Tuple Space implementation. The space stores XML documents. XML documents are written to the space and found by readers using an XML query syntax (XQL). The space supports any well-formed XML document or fragment.

The Forum is a document-centric tuple space implementation. This is an important distinction.

Ruple uses a document-centered approach to tuple spaces. Traditional tuple space implementations have been closely tied to a programming language or a language-specific object model. The object-centric approach has limited utility on the Internet and in heterogeneous environments, as it requires an object model and, for all practical purposes, ties you to a particular programming language and binary protocol. In contrast, Ruple's document-centric implementation of tuple spaces is independent of any particular programming language. It gives you the choice of processing the document (or not) with the tools at your disposal, depending on your language, environment, or preference.

The Forum supports any well-formed XML document or document fragment. Well-formed XML must be syntactically correct according to the XML 1.0 specification. The Forum does not require that documents be valid per any particular XML schema. This allows for the exchange of arbitrary documents through the space without having to configure the space for a new document type, allowing for ad hoc and flexible interactions. If document validity is important to the participants in a transaction, the collaborators may validate the document when they receive it.
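The distinction between well-formed and valid can be sketched with the JDK's standard XML parser. This is only an illustration of a well-formedness check on a complete document; the class and method names are hypothetical, and nothing here is taken from the Forum's own code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper: checks well-formedness only, never validity. */
final class WellFormed {
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);    // no DTD validation
            factory.setNamespaceAware(true);
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;                    // the parser rejected the syntax
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<color>purple</color>"));  // true
        System.out.println(isWellFormed("<color>purple</colour>")); // false: mismatched end tag
    }
}
```

A receiver that cares about validity would run its own schema or DTD validation after retrieving the document, as the paragraph above suggests.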

Documents are discovered in the XML tuple space using an XML query syntax. The query is specified on a read or take operation. The Forum supports a subset of the XQL query language for XML documents. XQL provides a notation for addressing and filtering the elements and text of XML documents. As used with the Forum, a query specifies criteria for selecting an XML document as the target of a read(), readMultiple() or take() operation. In other words, if the elements in a document satisfy all the criteria in a query, we consider that the document satisfies the query. The document or documents (in the case of readMultiple) satisfying the query will be returned to the caller.

Sample queries:

/bookstore/book/title$eq$"Seven Years in Trenton"

//author/first-name$eq$"Toni"

//excerpt/p[. contains "dark" $and$ . contains " stormy"]
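To make the matching rule concrete, the sketch below evaluates rough XPath 1.0 counterparts of the sample queries with the JDK's javax.xml.xpath API. XPath is used here only as a stand-in for XQL; the translations are one reading of the samples rather than the Forum's query engine, and the QueryMatch class is hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical matcher: a document satisfies a query if the query selects something in it. */
final class QueryMatch {
    static boolean matches(String xml, String xpath) {
        try {
            // With a BOOLEAN result type, a non-empty node set evaluates to true.
            return (Boolean) XPathFactory.newInstance().newXPath().evaluate(
                    xpath, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            return false; // unparsable document or query: treat as no match
        }
    }

    public static void main(String[] args) {
        String doc = "<bookstore><book>"
                + "<title>Seven Years in Trenton</title>"
                + "<author><first-name>Toni</first-name></author>"
                + "<excerpt><p>It was a dark and stormy night.</p></excerpt>"
                + "</book></bookstore>";

        // Assumed XPath readings of the three XQL samples above.
        System.out.println(matches(doc, "/bookstore/book/title[. = 'Seven Years in Trenton']"));
        System.out.println(matches(doc, "//author/first-name[. = 'Toni']"));
        System.out.println(matches(doc, "//excerpt/p[contains(., 'dark') and contains(., 'stormy')]"));
    }
}
```

All three queries select at least one node in the sample document, so each call reports a match; a document lacking the queried elements would not be returned by a read or take.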

3.5. Leases

Documents written to the Forum have a "shelf life". A lease specified when the document is written determines how long the document will be available in the space. If the document is still in the space when the lease expires, it is removed by the Forum Reaper. By storing documents temporarily, the space allows for asynchronous and one-to-many interactions.
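The lease mechanism can be sketched as follows. This is a minimal illustration under stated assumptions: the class LeasedStore, the cap on the granted lease, and the injectable clock are all hypothetical, and the real Forum Reaper may work quite differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical store whose entries disappear when their lease runs out. */
final class LeasedStore {
    private static final class Entry {
        final String document;
        final long expiresAtMillis;

        Entry(String document, long expiresAtMillis) {
            this.document = document;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final LongSupplier clock;   // current time in milliseconds
    private final long maxLeaseSeconds; // assumed server-side limit

    LeasedStore(LongSupplier clock, long maxLeaseSeconds) {
        this.clock = clock;
        this.maxLeaseSeconds = maxLeaseSeconds;
    }

    /** Stores the document and returns the lease actually granted, in seconds. */
    synchronized long write(String document, long requestedLeaseSeconds) {
        long granted = Math.min(requestedLeaseSeconds, maxLeaseSeconds);
        entries.add(new Entry(document, clock.getAsLong() + granted * 1000L));
        return granted;
    }

    /** The reaper pass: drops expired entries and returns how many were dropped. */
    synchronized int reap() {
        long now = clock.getAsLong();
        int before = entries.size();
        entries.removeIf(e -> e.expiresAtMillis <= now);
        return before - entries.size();
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        LeasedStore store = new LeasedStore(System::currentTimeMillis, 86_400);
        long granted = store.write("<color>purple</color>", 604_800); // ask for one week
        System.out.println(granted);      // 86400: capped at the assumed one-day limit
        System.out.println(store.reap()); // 0: the lease has not expired yet
    }
}
```

Returning the granted lease from write mirrors the idea of a negotiated expiration: the writer proposes a lifetime and learns what the space actually agreed to.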

3.6. Document Targeting

Ruple uses X.509 digital certificates to identify the recipient of a document. When a document is written into the space, the writer may allow all readers or may prescribe a limited set of users based on their digital certificate identity. A user who does not have permission to read a document never sees it; it is as if the document did not exist.

In addition, the digital certificate serves as a mechanism for targeting the document towards the recipient based on their identity rather than a network address, further decoupling sender and receiver.
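Identity-based targeting can be sketched as a per-document authorization list keyed by certificate subject. The class below is hypothetical and is not the Forum's Authorizations interface; it assumes the caller's identity has already been established from a trusted X.509 certificate (for example, the subject distinguished name), and it assumes that a take right implies a read right.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical per-document authorization list keyed by certificate subject name. */
final class DocAuthorizations {
    private boolean allReaders;
    private final Set<String> readers = new HashSet<>();
    private final Set<String> takers = new HashSet<>();

    DocAuthorizations allowAllReaders(boolean allow) {
        this.allReaders = allow;
        return this;
    }

    DocAuthorizations allowReader(String subject) {
        readers.add(subject);
        return this;
    }

    DocAuthorizations allowTaker(String subject) {
        takers.add(subject);
        return this;
    }

    /** Assumption: anyone allowed to take a document may also read it. */
    boolean canRead(String subject) {
        return allReaders || readers.contains(subject) || takers.contains(subject);
    }

    boolean canTake(String subject) {
        return takers.contains(subject);
    }

    public static void main(String[] args) {
        // Subject names are made-up examples of X.509 distinguished names.
        DocAuthorizations auths = new DocAuthorizations()
                .allowReader("CN=Auditor,O=Spacely Sprockets")
                .allowTaker("CN=Buyer,O=Spacely Sprockets");

        System.out.println(auths.canRead("CN=Auditor,O=Spacely Sprockets")); // true
        System.out.println(auths.canTake("CN=Auditor,O=Spacely Sprockets")); // false
        System.out.println(auths.canRead("CN=Stranger,O=Elsewhere"));        // false
    }
}
```

A space applying such a list would skip any document for which canRead is false before running the query, so an unauthorized caller cannot distinguish a hidden document from a missing one.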

3.7. MIME Attachments

XML documents are great for describing textual information; however, complications arise when content needs to be attached, such as a picture or an Excel spreadsheet. Ruple uses MIME attachments to accomplish such tasks. An attachment is content that can be associated and stored with a Forum entry or document. The content might consist of a file, a persisted object, another XML document or perhaps an image. This content is represented by an attachment, which can be retrieved, along with the XML document, from the Forum by a read, readMultiple, or take operation. When the host XML document is taken or deleted, all of its attachments also are taken or deleted.

4. Design Principles

Ruple offers a very simple programming model (only four basic methods) and a very simple state model (all entries to the Forum are self-contained). Ruple adheres to the following properties:

  1. Simplicity - Simplicity in design and implementation yields robust and maintainable code. Complexity breeds the opposite: brittle and hard-to-maintain systems. Complexity in infrastructure is especially insidious, as it corrupts all that builds upon it. Simplicity is paramount in the Ruple architecture and usage model.

  2. Decoupled communications - Time and space decoupling is essential on the unreliable Internet and for intermittently connected devices. Connections and devices come and go. Not presuming reliable communications and stable endpoints allows us to create more robust and flexible distributed applications. By eliminating the requirement to name a recipient by address, we enable applications that span multiple devices and dynamically addressed devices; the data is targeted towards an individual or organization, not a device address.

  3. Multi-party Communications - Collaborative applications involving intra- and inter-organizational constituents require many-to-many, and in some cases ad-hoc patterns of interaction. Participants may need to communicate to a broad spectrum of recipients, sometimes without knowing exactly who those recipients are. At other times communications must be very securely targeted towards a particular party.

  4. Document Centric - Ruple follows the document exchange pattern of the Internet by enabling tuple space exchanges of XML documents. This is a natural mode of interaction for Internet applications. It is also a natural way for humans and organizations to interact, allowing us to model our applications closely on the problem domain.

  5. Internet Native - Ruple is built on common Internet infrastructure. It does not require special protocols, communication infrastructure, or document formats.

  6. Simple Robust Security - Applications that collaborate over the Internet must pay special attention to security. Ruple provides per-document authorizations. Rights to read and rights to take can be granted separately.

  7. Scalability - The Ruple Forum is a stateless server where documents exchanged through the space are independent of each other. This allows the implementation of the Ruple Forum to be highly scalable and failure resistant.

5. The Forum Application Interface

Applications use the Forum through a programming interface, and may be implemented within a browser, a servlet, or an application. Documents are accessed and modified using a simple set of atomic verbs. Verbs that access existing documents in the Forum require a query to be specified using a subset of XQL.

The four basic verbs that operate on the Forum are:

  • write - write a document specifying an authorizations list and a proposed expiration date

  • read - read a document that matches an XQL query

  • readMultiple - read all documents that match an XQL query up to a specified maximum

  • take - read and remove a document that matches an XQL query. Take is similar to read except that only one application can successfully take an entry, whereas multiple applications may successfully read an entry.

The programming interface includes a means to establish access to the Forum in addition to operations to manipulate authorization lists.
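The difference between read and take matters most when several applications compete for the same document. The sketch below races several threads for one entry, using a ConcurrentLinkedQueue whose atomic poll() stands in for take. It is an illustration of the semantics only; the TakeRace class is hypothetical and the Forum itself is not involved.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demonstration that only one competing taker can win an entry. */
final class TakeRace {
    /** Starts the given number of takers against a single document; returns how many succeeded. */
    static int race(int takers) {
        Queue<String> space = new ConcurrentLinkedQueue<>();
        space.add("<order id=\"1\"/>");

        AtomicInteger winners = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(takers);
        for (int i = 0; i < takers; i++) {
            pool.execute(() -> {
                if (space.poll() != null) { // poll() removes the head atomically
                    winners.incrementAndGet();
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("takers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println(race(8)); // 1: exactly one taker gets the document
    }
}
```

Had the threads used a non-removing operation such as peek(), all of them would have seen the document, which corresponds to multiple applications successfully reading the same entry.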

5.1. Write Example

import com.roguewave.forum.*;

/**
 * Write an xml document to the forum
 */
public class W
{
    public static void main(String[] args) throws Exception {

        // Connect to the forum:
        String m_url = "http://forum.roguewave.com/forum/servlet/ForumServlet";
        Forum f = ForumFactory.createForum(m_url);

        // Set security policy:
        Authorizations auths = new AuthorizationsImpl();
        auths.allowAllReaders(true);

        // Write a document to the forum:
        String xmlDoc = "<color>purple</color>";
        int myLease = 604800; // 1 week
        int lease = f.write(xmlDoc, myLease, auths);
    }
}

5.2. Read Example

import com.roguewave.forum.*;

/**
 * Read an xml document from the forum
 */
public class R
{
    public static void main(String[] args) throws Exception {

        // Connect to the forum:
        String m_url = "http://forum.roguewave.com/forum/servlet/ForumServlet";
        Forum f = ForumFactory.createForum(m_url);

        // Query for doc. containing "purple" as value of <color> tag:
        String query = "/color$eq$\"purple\"";  
        Entry entry = f.read(query, 0);

        // Print document, if successfully retrieved:
        if (entry != null) {
            String doc = entry.getDocument();
            System.out.println(doc);
        }
        else {
            System.out.println("Document not found");
        }
    }
}

6. Forum Clients

The Forum's programming interface may be accessed through SOAP, a Java API, or a Java applet. The SOAP interface is a collection of request and response messages delivered to the Forum over HTTP. Any application capable of connecting to the Forum over HTTP and generating SOAP requests can use this interface. Proxies for other languages and environments, such as C#, C++, Python, and ASP.NET, could easily be provided.

7. Ruple Applications

Ruple is a general XML document exchange technology. It can be used to create a wide variety of collaborative applications across the Internet or within an organization. Possibilities include B2B applications, secure document exchange, workflow (internal/external), business process automation, multi-way collaborative applications, mobile workforce applications, XML document routing, portals, intermittently connected applications, military, multicast, e-procurement, bid/ask marketplaces, EDI, etc.

7.1. B2B Applications

A reverse auction is a typical B2B example that benefits from a many-to-many, loosely coupled communication paradigm. A business looking for a product or service writes an RFQ (Request for Quotes) into the space. The RFQ is targeted toward all takers, or perhaps to a constrained list of vendors. Vendors meanwhile wait for RFQs to come into the space and, when a suitable RFQ arrives, they write a Quote into the space, targeting the buyer. The buyer takes the quotes from the space and then responds to a particular vendor with a purchase order.

7.2. Mobile Applications

Document-centric tuple spaces, with their ability to temporally decouple a sender from a receiver, are ideal for implementing applications that involve intermittently connected devices such as PDAs, cell phones, and Communicators. In this example, we see a PDA, located perhaps in a warehouse, writing inventory data to a space from which it can be retrieved by a back office inventory application.

7.3. Remote Telemetry

Remote telemetry applications built on the document space model allow a control application to "broadcast" calibration information to a class of devices in the field and to reap instrument readings from those devices as they are periodically uploaded from the field stations.

7.4. XML Document Routing

The Forum can be used within an organization as a routing mechanism for XML documents coming into the organization, as a document way station for portal applications, or as an internal routing mechanism.

7.5. Secure Document Exchange

The secure aspects of the space can be used to securely distribute sensitive information. Ruple uses X.509 digital security certificates and HTTPS to ensure secure access and authentication.

7.6. Intra/Inter Workflow and Business Process Automation

Ruple can be used as the basis for business process automation and workflow applications that coordinate the creation and exchange of documents between cooperating parties. Collaborative activities may be confined to a single organization, or they may span divisions and separate companies.

7.7. Transformation Buffer

The Ruple Space can be used as a transformation buffer. Incoming documents are identified and transformed by a transformer, which writes the transformed document back into the space, or into another space, to be found by the intended application.
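A transformer of this kind might look like the following sketch, which uses the JDK's javax.xml.transform API to apply an XSLT stylesheet. The two Deque objects stand in for an incoming and an outgoing space, and the stylesheet, class name, and element names are all made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical transformer that moves documents from one buffer to another. */
final class TransformBuffer {
    // Made-up stylesheet: rewrites <color>X</color> as <colour>X</colour>.
    static final String XSLT =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"/color\"><colour><xsl:value-of select=\".\"/></colour></xsl:template>"
          + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
    }

    /** Takes every document from the incoming buffer and writes its transformed form to the outgoing one. */
    static void drain(Deque<String> incoming, Deque<String> outgoing) {
        String document;
        while ((document = incoming.poll()) != null) {
            outgoing.add(transform(document));
        }
    }

    public static void main(String[] args) {
        Deque<String> incoming = new ArrayDeque<>();
        Deque<String> outgoing = new ArrayDeque<>();
        incoming.add("<color>purple</color>");
        drain(incoming, outgoing);
        System.out.println(outgoing.peek()); // the document rewritten with a <colour> element
    }
}
```

Because the transformer only takes from one buffer and writes to another, the producing and consuming applications never need to agree on a vocabulary directly.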

7.8. Web Services Intermediary

Ruple may be used as the basis of a web services intermediary to provide security and ease of deployment of web service applications.

8. Conclusion

Until now we haven't had the tools to write robust, loosely coupled, collaborative multiway applications over the Internet. The tools at our disposal are artifacts from the days when we could depend on reliable network infrastructure, devices that didn't move, and closed networks. While attempts have been made to adapt LAN technologies, such as RPCs, to the Internet, these technologies are ill-suited to the task. We need new tools that are tailored to the specific needs of the Internet. A model like the Ruple Forum has a chance of becoming the pervasive infrastructure for distributed applications between organizations on the Internet.

© 2002 Copyright Rogue Wave Software, Inc. All Rights Reserved. Rogue Wave is a registered trademark of Rogue Wave Software, Inc. All other trademarks are the property of their respective owners.

Biography

Patrick is Chief Architect and Director of Architecture and Research at Rogue Wave Software.