XML 2002

CBML: Comic Book Markup Language

Abstract

This paper and presentation discuss the development of Comic Book Markup Language (CBML), an XML vocabulary (with Document Type Definition (DTD) and schema representations) designed for encoding comic books and graphic novels in XML.

With the emergence of scholarly disciplines such as cultural studies, and of new areas of interest in traditional fields such as English and American studies, comic books and graphic novels have recently become the subject of serious critical attention. Comic books and the mythologies they have spawned also remain a vital part of our popular culture and national consciousness. Witness the surprising and almost unprecedented popularity of the recent [ Spider-Man ] feature film. Before that, the [ X-Men ] film was also enormously successful. In the 1970s, 1980s, and 1990s, the [ Superman ] and [ Batman ] film franchises produced regular blockbusters. In 1992, Art Spiegelman won a Pulitzer Prize for [ Maus ], a comic book narrative of Holocaust survival. More recently, in 2001, Michael Chabon won the Pulitzer Prize for his novel [ The Amazing Adventures of Kavalier & Clay ], which relates the experience of two Jewish cousins working in the nascent comic book industry at the beginning of WWII. These few examples, which are frequently discussed in school and university classrooms, demonstrate the continuing and perhaps increasing importance of comic books as an art form and cultural touchstone.

Unfortunately, many of the comic books that might be subjects of scholarly and critical attention are not widely available to researchers. The reprints currently available are generally expensive, woefully incomplete, or both. For instance, the Marvel Comics "Essential" series provides affordable reprints of many of Marvel's classic comics, but these editions are printed in black and white on cheap newsprint and lack the bold colors of the originals. In addition, the vast majority of comic book reprints omit the advertisements that are an integral part of these publications, especially when scholars consider them as cultural artifacts. For instance, when studying [ The Fantastic Four ] comic books from the 1960s, it is enlightening to juxtapose the sometimes stereotypical female roles assigned to Sue Storm, the Invisible Girl and the lone woman on the superhero team, with the advertisement recruiting young boys to sell the [ Grit ] newspaper. The ad contains a mail-in form that requires the sender to answer the question, "Are you a boy?"

To facilitate the preservation, study, and analysis of these important cultural artifacts, the original comic books need to be digitized, and an XML vocabulary suitable for capturing the varied and complex data and metadata of the comic book needs to be developed. My presentation and paper will discuss some of the challenges encountered in developing the CBML vocabulary. In keeping with the conference theme of technological integration and interoperability, I will also discuss the integration of XML and XML query technologies with imaging technologies, with the goal of an interface that combines XML-encoded textual data and metadata with various representations of the digitized page image.

Comic books and graphic novels present a unique combination of text and graphics. The text, from the familiar speech and thought balloons to the graphically rendered POW! SMASH! BANG! sound effects, is inextricably bound with the image. The digitized comic book, no matter how meticulously encoded, cannot be sufficiently represented in XML alone; the page image is also required. An interface that integrates the comic book page images with XML-encoded textual data and metadata provides an extremely powerful tool for researchers, scholars, and students interested in comic books as art form and cultural touchstone.
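To make the pairing of encoded text and page image more concrete, the following is a minimal sketch in Python. It is not the CBML vocabulary itself: the element and attribute names (`comic`, `page`, `panel`, `balloon`, `sound`, `advertisement`, `image`, `type`, `who`) and the sample content are hypothetical, chosen only to illustrate how speech balloons, sound effects, and advertisements might each be tied back to the image of the page on which they appear.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element and attribute names are illustrative
# assumptions, not the actual CBML vocabulary.
SAMPLE = """\
<comic title="Example Comic" issue="1">
  <page n="1" image="page-001.jpg">
    <panel n="1">
      <balloon type="speech" who="Hero">We have to stop them!</balloon>
      <sound>POW!</sound>
    </panel>
    <panel n="2">
      <balloon type="thought" who="Hero">Too late...</balloon>
    </panel>
    <advertisement product="Example Newspaper">Sell papers and earn prizes!</advertisement>
  </page>
</comic>
"""

# Tags treated as text-bearing in this sketch (again, an assumption).
TEXT_TAGS = {"balloon", "sound", "advertisement"}


def index_text(xml_source):
    """Return one record per text-bearing element, paired with its page image."""
    root = ET.fromstring(xml_source)
    records = []
    for page in root.iter("page"):
        image = page.get("image")
        for elem in page.iter():
            if elem.tag in TEXT_TAGS:
                records.append({
                    "page": page.get("n"),
                    "image": image,
                    # Balloons carry a type (speech, thought); other
                    # elements fall back to their tag name.
                    "kind": elem.get("type", elem.tag),
                    "text": (elem.text or "").strip(),
                })
    return records


def find_by_kind(records, kind):
    """Filter the index, e.g. to list every sound effect or advertisement."""
    return [r for r in records if r["kind"] == kind]


if __name__ == "__main__":
    for rec in index_text(SAMPLE):
        print(rec["image"], rec["kind"], rec["text"])
```

In a sketch like this, a query for all advertisements or all thought balloons returns both the encoded text and the file name of the page image, which is the kind of link an integrated interface would need in order to display the image alongside the search result.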

Keywords