Keywords: Authoring, Content Management, Custom Publishing, Electronic Publishing, HTTP binding, Internet, XML, Publishing, Web Services, SOAP, Knowledge Management
Biography
Author of the RESTLog API and editor of the Atom Publishing Protocol.
The AtomAPI is an emerging interface for editing content. The interface is RESTful and uses XML and HTTP to define an editing scheme that's easy to implement and extend. History, basic operation, and applications to areas outside weblogs will be covered.
1. Preface
2. What's the point
2.1 “Just” use WebDAV
3. History
3.1 LiveJournal API
3.2 XML-RPC
3.2.1 Examples
3.2.2 Example
3.2.3 Limitations
4. Basic Operation
4.1 Concepts
4.2 Overview
4.3 Example
4.3.1 Create an entry
4.3.2 Update an entry
4.3.2.1 GET
5. Optimizations
5.1 Concurrent Updates
6. Applications Outside Weblogs
Bibliography
At the time I submitted to present this material at XML-2004 the name used was the AtomAPI. Since then much progress has been made in the Atom Working Group and several Internet-Drafts have been published. One of the biggest changes has been that the AtomAPI is now referred to as the Atom Publishing Protocol. I will make every effort to stick to the new name throughout this paper, but please keep in mind that AtomAPI was in use for over a year and both names appear in the online resources for Atom.
What are we trying to accomplish? Publishing content to the web, primarily weblogs. That is, the creation of chronologically ordered web content, usually authored by one person.
What are we not trying to accomplish? We are not trying to re-invent WebDAV. WebDAV (Web Distributed Authoring and Versioning) is an HTTP-based protocol that uses XML and is used for editing all kinds of web content. Painting with a broad brush, the goals of WebDAV and the Atom Publishing Protocol are the same. This is from webdav.org:
The stated goal of the WebDAV working group is (from the charter) to "define the HTTP extensions necessary to enable distributed web authoring tools to be broadly interoperable, while supporting user needs", and in this respect DAV is completing the original vision of the Web as a writeable, collaborative medium.
The differences become greater when you look at the major features of WebDAV, again quoting from webdav.org[WebDavFaq].
The Atom Publishing Protocol [AtomPubProt] provides no locking, no namespace manipulation, and while it does support resource metadata, it does not provide searches based on that metadata. Note that in this context the WebDAV documentation's use of the term namespaces does not mean XML namespaces but instead the server's URI namespace.
Almost all of the weblog publishing protocols preceding the Atom Publishing Protocol are RPC (Remote Procedure Call) interfaces. The idea is similar to CORBA or DCOM in that you are performing a function call remotely. The difference is that the protocols in this case aren't binary but text.
One of the first weblog publishing protocols, the LiveJournal protocol, formatted all the requests as 'application/x-www-form-urlencoded' and did not use XML. Key aspects:
As the name suggests this is XML pushed over HTTP to implement an RPC interface.
There are quite a few XML-RPC based protocols for editing web content. For the most part they do not interoperate with each other. The one exception is the MetaWeblog API, which does not natively support a call to delete a weblog entry but instead imports a single function from the Blogger API for that purpose.
Here is an example of a call to create a new entry using the Blogger API:
POST /api/RPC2 HTTP/1.0
User-Agent: Java.Net Wa-Wa 2.0
Host: plant.blogger.com
Content-Type: text/xml
Content-length: 515

<?xml version="1.0"?>
<methodCall>
<methodName>blogger.newPost</methodName>
<params>
<param><value><string>C6C.....19BF4E294</string></value></param>
<param><value><string>744145</string></value></param>
<param><value><string>ewilliams</string></value></param>
<param><value><string>secret</string></value></param>
<param><value><string>Today I had a peanut butter and pickle
sandwich for lunch. Do you like peanut-butter
and pickle sandwiches? I do. They're yummy. Please
comment!</string></value></param>
<param><value><boolean>false</boolean></value></param>
</params>
</methodCall>
And the response if the request succeeded:
HTTP/1.1 200 OK
Connection: close
Content-Length: 125
Content-Type: text/xml
Date: Mon, 6 Aug 2001 19:55:08 GMT
Server: Java.Net Wa-Wa/Linux

<?xml version="1.0"?>
<methodResponse>
<params>
<param>
<value><string>4515151</string></value>
</param>
</params>
</methodResponse>
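For comparison, the request above is the kind of message a generic XML-RPC library produces from an ordinary function call. The following is a minimal sketch using Python's standard `xmlrpc.client` module. It only serializes and re-parses the call and sends nothing over the network. The parameter values are the placeholders from the example, and the comments naming each parameter follow the Blogger API argument order rather than anything stated in the listing itself.

```python
import xmlrpc.client

# Positional parameters, in the order they appear in the request above.
params = (
    "C6C.....19BF4E294",  # application key (elided in the example, kept as-is)
    "744145",             # weblog id
    "ewilliams",          # user name
    "secret",             # password, carried in the message body
    "Today I had a peanut butter and pickle sandwich for lunch.",
    False,                # publish flag
)

# dumps() serializes the call into an XML-RPC <methodCall> document.
request_body = xmlrpc.client.dumps(params, methodname="blogger.newPost")

# loads() reverses the process and returns (parameters, method name).
decoded_params, method_name = xmlrpc.client.loads(request_body)
```

Note that the credentials travel as positional parameters inside the XML body, so an intermediary looking only at the HTTP headers learns nothing about which operation is requested or who is requesting it.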
The interfaces above all suffer from one or more of the following limitations:
So these protocols take XML and try to use it in a protocol, and in doing so remove two of the most important qualities that XML brings to the table: I18N and extensibility. In addition, they use HTTP but ignore its built-in authentication mechanism and the ability of servers to quickly dispatch requests based on the headers of a message and not just its contents.
The Atom Publishing Protocol seeks to overcome the limitations of XML-RPC based protocols and to better leverage XML and HTTP.
What are the basic operations when editing a weblog? Creating an 'entry' is one operation. After an 'entry' has been created, the user may wish to edit the entry or possibly delete it.
There are three types of URIs used in the Atom Publishing Protocol. Note that this is part of fully using HTTP: each resource we are interacting with gets its own URI. Each of these types of URIs has defined behaviours for specified HTTP methods. Again this is part of fully using HTTP, not only having different URIs for different resources but also using different HTTP methods for the different kinds of actions we wish to perform on a resource. For this paper we'll restrict our attention to just two of those URIs, PostURI and EditURI. These are the main URI types used in editing an entry. There is a single PostURI for a site that is used to create new entries. For each entry there is a distinct EditURI that is used to edit that particular entry.
URI | Method | Description
PostURI | POST | The body of the POST is an Atom entry that is used to create a new entry.
EditURI | GET | Returns an Atom representation of the entry.
EditURI | PUT | Updates the entry using the updated Atom entry in the body of the PUT.
EditURI | DELETE | Removes the entry.
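One way to picture the table is as a dispatch map keyed on the URI type and the HTTP method. The sketch below is purely illustrative: the action names are hypothetical and are not defined by the protocol, and a real server would map concrete URIs to these types first.

```python
# Hypothetical action names; the keys mirror the URI/method pairs in the table.
DISPATCH = {
    ("PostURI", "POST"): "create_entry",
    ("EditURI", "GET"): "retrieve_entry",
    ("EditURI", "PUT"): "update_entry",
    ("EditURI", "DELETE"): "delete_entry",
}


def route(uri_type, method):
    """Return the action for a request, or None if the method is not defined
    for that type of URI."""
    return DISPATCH.get((uri_type, method.upper()))
```

Because the method is part of the key, a server can choose the action from the request line alone, without parsing the message body.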
To create an entry for a site you must POST an Atom entry to the PostURI. For example, if the PostURI were http://example.org/reilly then the following would be sent to port 80 of the server example.org:
POST /reilly HTTP/1.1
Host: example.org
Content-Type: application/atom+xml

<?xml version="1.0" encoding='utf-8'?>
<entry
version="draft-ietf-atompub-format-01: do not deploy"
xmlns="http://purl.org/atom/ns#draft-ietf-atompub-format-01" >
<title>My First Entry</title>
<summary>A very boring entry...</summary>
<author>
<name>Bob B. Bobbington</name>
</author>
<created>2003-02-05T12:29:29</created>
<content type="application/xhtml+xml" xml:lang="en-us">
<p xmlns="...">Hello, <em>weblog</em> world!</p>
</content>
</entry>
and a successful POST might receive the following response:
HTTP/1.1 201 Created
Location: http://example.org/reilly/1

<?xml version="1.0" encoding='utf-8'?>
<entry
version="draft-ietf-atompub-format-01: do not deploy"
xmlns="http://purl.org/atom/ns#draft-ietf-atompub-format-01" >
<title>My First Entry</title>
<summary>A very boring entry...</summary>
<author>
<name>Bob B. Bobbington</name>
</author>
<issued>2003-02-05T12:29:29</issued>
<created>2003-02-05T14:10:58Z</created>
<modified>2003-02-05T14:10:58Z</modified>
<link rel="alternate" type="text/html">
http://example.org/reilly/2003/02/05#My_First_Entry
</link>
<id>http://example.org/reilly/1</id>
<content type="application/xhtml+xml" xml:lang="en-us">
<p xmlns="...">Hello, <em>weblog</em> world!</p>
</content>
</entry>
There are several important aspects to note about this transaction. First, we are just POSTing an Atom entry. This is important since we are re-using the Atom Syndication Format and not creating a different XML format for the Protocol versus the Syndication Format[AtomPubFormat]. This also gets us away from the silly ASCII-only restrictions of XML-RPC. Second, the response includes a Location: header which specifies the URI at which the new entry was created. In this case the URI is that of an Atom entry and not the URI of the HTML resource created. Also, the filled-in Atom entry was returned in the response. Additional things to notice that come from the use of the Atom Syndication Format are the use of xml:base and xml:lang.
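From a client's point of view, creating an entry is an ordinary HTTP POST. The following is a minimal sketch using Python's standard `urllib.request`. It builds the request without sending it, the entry is shortened for illustration, and the helper name `build_create_request` is made up for this example.

```python
import urllib.request

POST_URI = "http://example.org/reilly"  # the PostURI from the example above

# A shortened version of the entry shown above.
entry_xml = """<?xml version="1.0" encoding="utf-8"?>
<entry version="draft-ietf-atompub-format-01: do not deploy"
       xmlns="http://purl.org/atom/ns#draft-ietf-atompub-format-01">
  <title>My First Entry</title>
  <summary>A very boring entry...</summary>
</entry>"""


def build_create_request(post_uri, entry):
    """Build (but do not send) the POST that creates a new entry."""
    return urllib.request.Request(
        post_uri,
        data=entry.encode("utf-8"),
        headers={"Content-Type": "application/atom+xml"},
        method="POST",
    )


request = build_create_request(POST_URI, entry_xml)
# Sending would be: response = urllib.request.urlopen(request)
# On success, the new entry's URI would be read from response.headers["Location"].
```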
Once an entry is created we may want to edit it. The first step is to retrieve the latest version of the entry. You may notice that the updated Atom entry was returned in the response from the POST that created the entry. We'll ignore that for now to introduce another part of the protocol. In the headers of the returned response was a Location: header which contains the EditURI for the newly created entry. If we wish to update the entry just created then we will need to do a GET on the URI returned above.
The URI returned above for our first entry was http://example.org/reilly/1 . We send the following request to port 80 of the server at example.org:
GET /reilly/1 HTTP/1.1
Host: example.org
Content-Type: application/atom+xml
The response is again an Atom entry, the same one we POSTed but with some more information filled in by the server:
HTTP/1.1 200 OK
Content-Type: application/atom+xml

<?xml version="1.0" encoding='utf-8'?>
<entry
version="draft-ietf-atompub-format-01: do not deploy"
xmlns="http://purl.org/atom/ns#draft-ietf-atompub-format-01" >
<title>My First Entry</title>
<summary>A very boring entry...</summary>
<author>
<name>Bob B. Bobbington</name>
</author>
<issued>2003-02-05T12:29:29</issued>
<created>2003-02-05T14:10:58Z</created>
<modified>2003-02-05T14:10:58Z</modified>
<link rel="alternate" type="text/html">
http://example.org/reilly/2003/02/05#My_First_Entry
</link>
<id>http://example.org/reilly/1</id>
<content type="application/xhtml+xml" xml:lang="en-us">
<p xmlns="...">Hello, <em>weblog</em> world!</p>
</content>
</entry>
Now that we have the entry we can modify it then PUT it back to the same URI to update the entry and the corresponding HTML resource.
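The modify-then-PUT step can be sketched as follows, again in Python and again without touching the network. The retrieved entry is a shortened stand-in for the response above, and the helper names `retitle` and `build_update_request` are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#draft-ietf-atompub-format-01"
EDIT_URI = "http://example.org/reilly/1"  # the EditURI from the Location: header

# Shortened stand-in for the entry returned by the GET.
retrieved = (
    f'<entry xmlns="{ATOM_NS}">'
    "<title>My First Entry</title>"
    "<summary>A very boring entry...</summary>"
    "</entry>"
)


def retitle(entry_xml, new_title):
    """Parse the entry, replace its title, and return the serialized bytes."""
    ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace
    root = ET.fromstring(entry_xml)
    root.find(f"{{{ATOM_NS}}}title").text = new_title
    return ET.tostring(root, encoding="utf-8")


def build_update_request(edit_uri, body):
    """Build (but do not send) the PUT that replaces the entry."""
    return urllib.request.Request(
        edit_uri,
        data=body,
        headers={"Content-Type": "application/atom+xml"},
        method="PUT",
    )


updated = retitle(retrieved, "My Revised Entry")
put_request = build_update_request(EDIT_URI, updated)
```

The same URI serves both the GET and the PUT. Only the method changes.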
Because we are using HTTP and in particular GET there are optimizations that we can apply to increase the performance of the protocol. Note that these optimizations are not available to protocols that try to push everything through POST. First, we can speed up the GET by using compression. We must tell the server that we will accept compressed documents when we send the GET request:
GET /reilly/1 HTTP/1.1
Host: example.org
Accept-Encoding: compress, gzip
Content-Type: application/atom+xml
The response indicates what if any compression was used:
HTTP/1.1 200 OK
Content-Type: application/atom+xml
Content-Encoding: gzip

...gzipped stuff goes here...
XML is very amenable to compression and will gzip down to 1/2 to 1/3 of its original size.
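The effect is easy to measure. The sketch below gzips the example entry with Python's standard `gzip` module and computes the compressed-to-original size ratio. It makes no claim about the exact figure: on an entry this short the fixed gzip overhead is likely to make the ratio less favourable than for larger documents.

```python
import gzip

# The entry from the GET response above, as bytes.
entry = b"""<?xml version="1.0" encoding='utf-8'?>
<entry
 version="draft-ietf-atompub-format-01: do not deploy"
 xmlns="http://purl.org/atom/ns#draft-ietf-atompub-format-01" >
 <title>My First Entry</title>
 <summary>A very boring entry...</summary>
 <author>
  <name>Bob B. Bobbington</name>
 </author>
 <issued>2003-02-05T12:29:29</issued>
 <created>2003-02-05T14:10:58Z</created>
 <modified>2003-02-05T14:10:58Z</modified>
 <link rel="alternate" type="text/html">
  http://example.org/reilly/2003/02/05#My_First_Entry
 </link>
 <id>http://example.org/reilly/1</id>
 <content type="application/xhtml+xml" xml:lang="en-us">
  <p xmlns="...">Hello, <em>weblog</em> world!</p>
 </content>
</entry>"""

compressed = gzip.compress(entry)

# Fraction of the original size that remains after compression.
ratio = len(compressed) / len(entry)
print(f"{len(entry)} bytes -> {len(compressed)} bytes (ratio {ratio:.2f})")
```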
But wait, there's more optimizing we can do. The server could return an ETag: header in the response:
HTTP/1.1 200 OK
Content-Type: application/atom+xml
ETag: "3948018403940943"

...gzipped stuff goes here...
The ETag value is a key that we can use the next time we make a request to that URI. When making a request we can include an If-None-Match header in the request:
GET /reilly/1 HTTP/1.1
Host: example.org
If-None-Match: "3948018403940943"
And the response if the entry is unchanged since the last time we requested it is a status code of 304 with no message-body. Now that's an optimization!
HTTP/1.1 304 Not Modified
Content-Type: application/atom+xml
Note that we can mix these two techniques:
GET /reilly/1 HTTP/1.1
Host: example.org
Accept-Encoding: compress, gzip
If-None-Match: "3948018403940943"
Now if the entry is unchanged we will get a response with no message-body, and if it is changed the message-body may still be compressed.
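On the server side, the decision between a full response and a 304 comes down to comparing the entity tag the client sends with the current one. In HTTP/1.1 the request header that yields 304 Not Modified on a GET is If-None-Match. The sketch below is a deliberately simplified illustration: the function name is hypothetical, and it ignores comma-separated tag lists, the `*` form, and weak validators.

```python
def conditional_get(current_etag, body, request_headers):
    """Return (status, headers, body) for a GET, honouring If-None-Match.

    Simplified: only a single, exactly matching entity tag is recognised.
    """
    headers = {"ETag": current_etag, "Content-Type": "application/atom+xml"}
    if request_headers.get("If-None-Match") == current_etag:
        # The client's copy is still current, so no message-body is sent.
        return 304, headers, b""
    return 200, headers, body


etag = '"3948018403940943"'
entry = b"<entry>...</entry>"  # stand-in for the Atom entry

first = conditional_get(etag, entry, {})                       # no cached copy
repeat = conditional_get(etag, entry, {"If-None-Match": etag})  # cached, unchanged
stale = conditional_get('"new-tag"', entry, {"If-None-Match": etag})  # entry changed
```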
These same optimizations can also be applied to the initial POST used to create an entry. We could specify Accept-Encoding: on the POST. In addition the response, if it includes the full Atom entry, can also contain an ETag: header, thus speeding up subsequent GETs.
The ETag: header also provides protection against concurrent updates. When we wanted to speed up GETs we sent the ETag value back to the server in a conditional request header. The same value can be sent in an If-Match: header on a PUT request, and the request will only succeed if the entry is unchanged from when the ETag was generated.
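The following sketch shows how this protects against a lost update when two clients edit the same entry. It is an in-memory illustration only: the `EntryStore` class is hypothetical, and deriving the entity tag from a hash of the body is an assumption of this example, since servers are free to generate ETags however they like. A failed precondition is reported with HTTP's 412 Precondition Failed status.

```python
import hashlib


class EntryStore:
    """Hypothetical in-memory store that enforces If-Match on PUT."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def etag_for(body):
        # Assumption for this sketch: the tag is a quoted hash of the body.
        return '"' + hashlib.sha1(body).hexdigest() + '"'

    def create(self, uri, body):
        self._entries[uri] = body
        return self.etag_for(body)

    def put(self, uri, body, if_match):
        """Return (status, current ETag); update only if if_match is current."""
        current_tag = self.etag_for(self._entries[uri])
        if if_match != current_tag:
            return 412, current_tag  # Precondition Failed, entry left untouched
        self._entries[uri] = body
        return 200, self.etag_for(body)


store = EntryStore()
uri = "http://example.org/reilly/1"
tag = store.create(uri, b"<entry>original</entry>")

# Two clients fetched the entry and both hold the same ETag.
first_put = store.put(uri, b"<entry>edit by client A</entry>", if_match=tag)
second_put = store.put(uri, b"<entry>edit by client B</entry>", if_match=tag)
```

Client B's PUT is rejected because the entry changed after B retrieved it. B would need to GET the entry again, reapply its changes, and retry with the new ETag.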
The Atom Publishing Protocol has applications outside the domain of weblogs. It has, for example, been applied to Wikis. InterWiki supports an Atom Gateway[AtomGateway] which allows users to read and modify a wiki via the Atom Publishing Protocol. In addition PikiPiki has been extended to support the Atom Publishing Protocol[AtomWiki].
Other extensions have been proposed, such as the extensions currently supported on TypePad[TypePad], SixApart's hosted weblog service. There are namespaced elements added for manipulating photo albums, music lists, and book lists. For example, here is a request that adds a song to a song list:
Request:
POST /t/atom/lists/list_id=1 HTTP/1.1
Host: www.typepad.com
X-WSSE: my credentials
Content-Type: application/atom+xml

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="draft-ietf-atompub-format-01: do not deploy"
xmlns:song="http://sixapart.com/atom/song#"
xmlns:rvw="http://purl.org/NET/RVW/0.1/">
<song:title>Lucky Star</song:title>
<song:album>Kish Kash</song:album>
<song:artist>Basement Jaxx</song:artist>
<content>Good song.</content>
<rvw:value>4</rvw:value>
</entry>
Response:
HTTP/1.1 201 Created
Content-Type: application/atom+xml

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="draft-ietf-atompub-format-01: do not deploy"
xmlns:song="http://sixapart.com/atom/song#"
xmlns:rvw="http://purl.org/NET/RVW/0.1/">
<title>Basement Jaxx - Lucky Star</title>
<issued>2003-12-05T10:23:38Z</issued>
<id>tag:typepad.com,2003:listitem-153281</id>
<song:title>Lucky Star</song:title>
<song:album>Kish Kash</song:album>
<song:artist>Basement Jaxx</song:artist>
<content>Good song.</content>
<rvw:value>4</rvw:value>
<song:thumbnail>http://images.amazon.com/images/P/B0000DD56E.01.THUMBZZZ.jpg
</song:thumbnail>
</entry>
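Because the extension elements live in their own namespaces, a client can pick them out by namespace and ignore whatever it does not understand. The sketch below does this for the request body above using Python's standard `xml.etree.ElementTree`. The XML declaration is omitted for brevity, and the helper name `song_fields` is hypothetical.

```python
import xml.etree.ElementTree as ET

SONG_NS = "http://sixapart.com/atom/song#"
RVW_NS = "http://purl.org/NET/RVW/0.1/"

# The request body from the example above, without the XML declaration.
request_body = """<entry xmlns="draft-ietf-atompub-format-01: do not deploy"
  xmlns:song="http://sixapart.com/atom/song#"
  xmlns:rvw="http://purl.org/NET/RVW/0.1/">
  <song:title>Lucky Star</song:title>
  <song:album>Kish Kash</song:album>
  <song:artist>Basement Jaxx</song:artist>
  <content>Good song.</content>
  <rvw:value>4</rvw:value>
</entry>"""


def song_fields(entry_xml):
    """Collect the child elements in the song namespace as {local name: text}."""
    prefix = "{" + SONG_NS + "}"
    root = ET.fromstring(entry_xml)
    return {
        child.tag[len(prefix):]: child.text
        for child in root
        if child.tag.startswith(prefix)
    }


song = song_fields(request_body)
rating = ET.fromstring(request_body).findtext("{" + RVW_NS + "}value")
```

A processor that knows nothing about the song namespace would still see an ordinary entry with a content element, which is the kind of extensibility the RPC-based protocols lack.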