Nakhimovsky, Alexander, Associate Professor, Computer Science Dept., Colgate University, Hamilton, New York, U.S.A.
Alexander Nakhimovsky received an MA in mathematics from Leningrad University in 1972 and a PhD in Linguistics from Cornell in 1979, with a graduate minor in Computer Science. He has been teaching computer science at Colgate University since 1985. He is the author, jointly with Tom Myers, of two Wrox titles: JavaScript Objects (1998) and Professional Java XML Programming (1999). Nakhimovsky and Myers also contributed XML chapters to Professional JSP Programming, Professional WAP Programming, and Professional Java Server Programming, J2EE Edition, all published in 2000. Nakhimovsky frequently speaks at professional conferences, most recently at the Wrox Wireless Conference (Amsterdam, July 2000), the Wireless DevCon conference (San Jose, CA, December 2000), where Myers and Nakhimovsky were Technical Program Directors, the XML DevCon conference (New York, April 2001), and XML Europe (Berlin, May 2001).
We seek to combine the XML technologies of modular linked documents with the organizational technologies of open-source software to bring down the costs and improve the quality of online educational materials. Currently, the costs of such materials are high (both in terms of money and in terms of faculty time) and the results (mostly email archives, web pages and MS Word documents) are difficult to reuse. We are developing an infrastructure that combines XML technologies (XSLT, XLink, XPath queries) with the organizational tools of Open Source: an open-source licence and copyright, a four-way distinction between users, developers, committers and management, a mailing list and an FAQ for each group, the machinery of promotion, and an emphasis on volunteer contribution supplemented by a small infusion of money for a very small paid staff.
A project currently under development at Colgate University is building such an infrastructure for an interdisciplinary core course taught by many faculty from different departments. A follow-up project scheduled for the spring of 2002 will develop a body of materials for an introductory social science course that is taught at colleges and universities around the world. Both projects are done in cooperation with colleagues in the social sciences. In our talk, we will present the latest version of the system.
Scenario 1. A young assistant professor is charged with the task of developing a new introductory biology course that puts more emphasis than the existing course on simulated experiments: for each traditional lab, there is a pre-lab in which the same experimental data is fed into a 3-d BioLab software package that allows the user to change some parameters while controlling others, and observe the resulting behavior of the system. The course will also utilize the wireless network recently installed in the classroom so the instructor can run simulated experiments during the lecture. The professor goes to the appropriate Open-Source repository and fills out a form:
The query returns four results, one from England, one from Scotland, one from the Technical University of Beijing (in Chinese but with lab summaries in English), and one from a large Midwestern university. The Scottish team is the initiator and official maintainer of the course, the English adopted it with minor modifications, while the Chinese and the Midwestern teams introduced major innovations in the middle part of the course. Our professor adopts the English version of the syllabus, takes most of her lab ideas from the Chinese but rewrites the assignments (she does not know Chinese), writes totally new labs and a new section of the textbook for the molecular biology part of the course, but otherwise uses the Midwestern version of the accompanying text. She uploads her course to the repository, and her English-language versions of the Beijing labs become an instant hit with the British. She is registered as a participant of the project, with her contributions clearly identified in the main log file. As she tests her new labs, she discovers that her BioLab software (she uses the latest version that her colleagues overseas cannot yet afford) does not work smoothly with the wireless network. She posts the problem to the course site, and within a week the Midwestern technical support posts a fix.
Scenario 2. An established professor close to retirement goes through his archive and finds an outline for an experimental course that he never taught because of other commitments and lack of support from the department. The professor goes to the repository and downloads the tools for converting legacy documents to standard XML formats. Using the tools, he brings the course up to date and registers it with the repository. Within a week, two young enthusiastic Australians write to him about some of their ideas that fit perfectly into his proposed framework. They take it upon themselves to develop additional assignments and write up explanations for them. They teach the course the following fall; the originator of the course teaches it in the spring semester after that, his last before retirement. The course proves very successful; the professor, now retired, is in active correspondence with his young Australian colleagues, and they are collaborating on a joint workbook to accompany the course. The workbook will be commercially published and distributed.
The main point of these examples is that in courseware, just as in software, there are clear benefits to cooperation: development work gets spread around, unavoidable "bugs" are discovered and fixed quickly, and new ideas find a ready framework for adoption and testing. A common complaint among academics is that teaching, especially new course development, takes time away from research. In a culture of open course development and maintenance this problem could be greatly alleviated. And we have not even mentioned the students who will get better and cheaper (free) teaching materials.
Another point about these examples is that, in many disciplines, courses consist of components, such as lecture plans, assignments, problem sets and labs. Humanities professors are, perhaps, more likely to view their courses as organic indivisible wholes, but in natural sciences, mathematics, and foreign language teaching, a course is usually a structured composition of discrete units, each with a clear set of pre- and post-conditions and a well-defined relationship to the rest of the course. (A similar difference can be observed at the level of entire programs.) A repository of reusable components would make it much easier to construct and modify a course.
Technologically, all the examples can be implemented today: some supporting software needs to be written but no breakthroughs are required. What is required is a leap of faith over the unavoidable suspicion that this whole scheme is contrary to human nature, or at least to the human nature of American academics of the early 21st century. This is where the experience of open-source software can serve both as a proof of concept and as an example to emulate.
As in open-source software projects (see http://xml.apache.org/roles.html), we assume four categories of participants: users, developers, committers and management.
Each category of participants has an FAQ and a mailing list. Developers and committers reply to questions on the users' list.
In discussing college courses, we must, as is often the case, distinguish types from instances. A course instance has such details as the time and place of classes, the instructor's identity, and grading policies. A course type has a number of topics that together form a coherent body of material; it has a bibliography of standard literature, and the supporting machinery of lecture plans, assignments and tests. These generic materials form the base from which the course-outline part of the syllabus is produced. Our application currently ignores instance-specific data and provides no tools for generating course outlines from course material repositories. However, manually constructed outlines will form part of the repository.
Here is a small sample of a course outline:
Mon., Sept. 11 Hingley, Ronald. Russian Writers and Society, 1825-1904, "Part Two: The Empire," 41-107. London: World University Library, 1967.
Smith, R. E. F. and Christian, David. Bread and Salt: A Social and Economic History of Food and Drink in Russia, 5-26. Cambridge: Cambridge UP, 1984.
Wed., Sept. 13 Bassin, Mark. "National Identity and World Mission." In Imperial Visions: Nationalist Imagination and Geographical Expansion in the Russian Far East, 1840-1865, pp. 37-68. Cambridge: Cambridge UP, 1999.
The sample shows several common elements of such outlines: a topic within a discipline (Nineteenth-Century Russia) and daily assignments that consist of bibliographic citations, each modified by a "range specification" (pages, chapters). Not included in the course outline for students are additional annotations on assignments, such as lecture plans, writing assignments and exam questions.
Topics and bibliographic citations constitute constant, unchangeable data. Everything else, including range specifications, can be treated as different kinds of annotations, either on bibliographic references or on other annotations. For instance, a writing assignment based on a specific reading is an annotation of a specific type on a combination of a bibliographic reference and a range specification. In our application, bibliographic references are implemented in XML as prescribed in the Dublin Core specification, while annotations are implemented as extended XLink structures. Here is an example of a bibliographic record, which is an rdf:Description element, within a citation:
<cite>
<ident>C429</ident>
<rdf:Description>
<dc:title>Faces of the Caribbean</dc:title>
<dc:creator>John Gilmore</dc:creator>
<dc:format>Book</dc:format>
<dc:identifier>ISBN 1583670289</dc:identifier>
<dc:publisher>London: Latin America Bureau</dc:publisher>
<dc:date>2000</dc:date>
</rdf:Description>
<descr>This is <em>arbitrary XHTML</em> for display or extraction by XSLT.</descr>
</cite>
As you can see, a citation contains an rdf:Description element with properties as specified in Dublin Core. The following namespace declarations are assumed:
<citeDB xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xlink="http://www.w3.org/1999/xlink">
<!-- conforms to the Dublin Core Proposed Recommendation,
http://dublincore.org/documents/2001/09/20/dcmes-xml/
-->
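As an illustration of how such a record might be read programmatically, here is a minimal sketch that parses a citation with a namespace-aware DOM parser and extracts Dublin Core fields by namespace URI rather than by prefix. The class name CitationReader and its helper methods are hypothetical and are not part of the application; only the element names and namespace URIs come from the listings above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the application described in the paper.
class CitationReader {
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // A shortened copy of the citation shown in the text.
    static final String SAMPLE =
        "<citeDB xmlns:rdf='" + RDF_NS + "' xmlns:dc='" + DC_NS + "'>"
      + "<cite><ident>C429</ident><rdf:Description>"
      + "<dc:title>Faces of the Caribbean</dc:title>"
      + "<dc:creator>John Gilmore</dc:creator>"
      + "<dc:date>2000</dc:date>"
      + "</rdf:Description></cite></citeDB>";

    /** Parses XML text into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // required for lookups by namespace URI
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse citation XML", e);
        }
    }

    /** Returns the text of the first Dublin Core element with this local name, or null. */
    static String dcField(Element cite, String localName) {
        NodeList hits = cite.getElementsByTagNameNS(DC_NS, localName);
        return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
    }

    public static void main(String[] args) {
        Element cite = (Element) parse(SAMPLE).getElementsByTagName("cite").item(0);
        System.out.println(dcField(cite, "title") + " / " + dcField(cite, "creator"));
    }
}
```

Looking elements up by namespace URI means the code keeps working if a repository file binds Dublin Core to a prefix other than dc.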
For both citations and annotations, we also record submission data: who submitted the items and when. Altogether, we have four kinds of items: topics, citations, annotations and submissions. Topics are simply strings that serve as search categories; the rest are Node objects.
A major feature of the design is that we arrange for each Node object to have a unique ID consisting of a letter (C, A or S) followed by an integer. This ID (not to be confused with a DTD ID attribute) is stored as an <ident> element, the first child of every citation, annotation or submission Node in our repository. The integer that follows the S in a submission node ID is the same as the integer in the ID of the corresponding citation or annotation node. The XLink structure of a node looks like this:
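The ID convention just described can be sketched as a small value class. This is only an illustration under the stated rules (one of the letters C, A or S followed by an integer, and a submission ID that reuses the integer of the node it describes); the class NodeId and its methods are hypothetical names, not code from the application.

```java
// Hypothetical value class for the node IDs described in the text.
class NodeId {
    final char kind;   // 'C' = citation, 'A' = annotation, 'S' = submission
    final int number;

    NodeId(char kind, int number) {
        this.kind = kind;
        this.number = number;
    }

    /** Parses strings such as "A37"; rejects anything that is not C, A or S plus digits. */
    static NodeId parse(String id) {
        if (id == null || id.length() < 2
                || "CAS".indexOf(id.charAt(0)) < 0
                || !id.substring(1).matches("\\d+")) {
            throw new IllegalArgumentException("bad node ID: " + id);
        }
        // Integer.parseInt throws NumberFormatException (an IllegalArgumentException)
        // if the digits do not fit in an int.
        return new NodeId(id.charAt(0), Integer.parseInt(id.substring(1)));
    }

    /** ID of the submission record that shares this node's integer. */
    NodeId submission() {
        return new NodeId('S', number);
    }

    @Override
    public String toString() {
        return String.valueOf(kind) + number;
    }

    public static void main(String[] args) {
        System.out.println(parse("A37").submission());
    }
}
```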
<elt xlink:type="extended">
  <ident>A37</ident>
  <elt xlink:label="this" xlink:href="A37" xlink:type="locator"/>
  <elt xlink:label="prevAnnot" xlink:href="A13" xlink:type="locator"/>
  <elt xlink:label="someCite" xlink:href="C532" xlink:type="locator"/>
  <elt xlink:to="prevAnnot" xlink:from="this" xlink:type="arc"/>
  <elt xlink:to="someCite" xlink:from="this" xlink:type="arc"/>
  <elt xlink:to="someCite" xlink:from="prevAnnot" xlink:type="arc"/>
</elt>
The first xlink:label is always "this", and its href is the same as the "ident" field. This greatly facilitates some XPath queries and other XSLT manipulations. The arc elements within a node express connections from "this" node to any other in the same format as third-party links.
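To illustrate the kind of XPath query that the fixed "this" label makes easy, the following sketch collects the href values of all locators that are targets of arcs starting at "this". It is an assumption-laden example: it uses the standard javax.xml.xpath API rather than the Xalan XPathAPI class used in the application, and the class name ArcQuery and the method targetsOfThis are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the application described in the paper.
class ArcQuery {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // The A37 node from the text, with the xlink prefix declared locally.
    static final String SAMPLE =
        "<elt xmlns:xlink='" + XLINK_NS + "' xlink:type='extended'><ident>A37</ident>"
      + "<elt xlink:label='this' xlink:href='A37' xlink:type='locator'/>"
      + "<elt xlink:label='prevAnnot' xlink:href='A13' xlink:type='locator'/>"
      + "<elt xlink:label='someCite' xlink:href='C532' xlink:type='locator'/>"
      + "<elt xlink:to='prevAnnot' xlink:from='this' xlink:type='arc'/>"
      + "<elt xlink:to='someCite' xlink:from='this' xlink:type='arc'/>"
      + "<elt xlink:to='someCite' xlink:from='prevAnnot' xlink:type='arc'/>"
      + "</elt>";

    // Locators whose label equals the xlink:to of some arc that starts at "this".
    static final String OUTGOING =
        "elt[@xlink:type='locator']"
      + "[@xlink:label = ../elt[@xlink:type='arc'][@xlink:from='this']/@xlink:to]"
      + "/@xlink:href";

    /** Returns the node IDs that "this" links to, given the XML of one node. */
    static List<String> targetsOfThis(String nodeXml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Element node = f.newDocumentBuilder()
                .parse(new InputSource(new StringReader(nodeXml))).getDocumentElement();
            XPath xp = XPathFactory.newInstance().newXPath();
            xp.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "xlink".equals(prefix) ? XLINK_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return XLINK_NS.equals(uri) ? "xlink" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    String p = getPrefix(uri);
                    return (p == null ? Collections.<String>emptyList()
                                      : Collections.singletonList(p)).iterator();
                }
            });
            NodeList hits = (NodeList) xp.evaluate(OUTGOING, node, XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                ids.add(hits.item(i).getNodeValue());
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("arc query failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(targetsOfThis(SAMPLE));
    }
}
```

For the sample node, the arc from prevAnnot to someCite does not start at "this", so it contributes nothing on its own; C532 is reported only because "this" also links to it directly.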
As shown, the xlink:href attributes within locators are shorthand for a JSP URI with the node ID in the query string:
http://localhost:8080/OpenXml/vdb/getRef.jsp?ref=A13
where getRef.jsp, in its simplest form, retrieves the node by its unique ID and uses an identity transform to serialize it:
<%@ page errorPage="error.jsp"
import="javax.xml.transform.*,
javax.xml.transform.stream.*,
javax.xml.transform.dom.*,
org.w3c.dom.*"
%><jsp:useBean id="citations" class="java.util.Vector" scope="application"
/><jsp:useBean id="annotations" class="java.util.Vector" scope="application"
/><jsp:useBean id="submissions" class="java.util.Vector" scope="application"
/><jsp:useBean id="appCache" class="java.util.Hashtable" scope="application"
/><% // we check log-in elsewhere
String key=request.getParameter("ref");
Object obVal=appCache.get(key); int val=-1;
if(null==obVal || !(obVal instanceof Integer)){
%><noRef><%= key %></noRef><%
} else {
Node node=null;
val=((Integer)obVal).intValue();
try{
if(key.startsWith("S"))
node=(Node)submissions.get(val);
else if(key.startsWith("A"))
node=(Node)annotations.get(val);
else if(key.startsWith("C"))
node=(Node)citations.get(val);
}catch(Exception ex){
out.write(ex.toString()+"an appropriate error message");
}
if(node!=null){
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer trans = tFactory.newTransformer();
// for non-identity transformations, give an XSLT parameter to newTransformer()
trans.setOutputProperty(
javax.xml.transform.OutputKeys.OMIT_XML_DECLARATION,"yes");
trans.setOutputProperty(
javax.xml.transform.OutputKeys.INDENT,"yes");
trans.transform(new DOMSource(node), new StreamResult(out));
}
else { %> <noRef><%= key %></noRef> <% }
}
%>
This JSP can be improved by moving part of its code into a Java class once the code becomes more stable.
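One possible shape for such a class is sketched below. It keeps the same data structures as the JSP (three Vectors of DOM nodes and a Hashtable from node ID to position) and the same identity-transform serialization, but the class name NodeLookup, its methods and the demo() fixture are hypothetical and only indicate how the lookup logic might be separated from the page.

```java
import java.io.StringWriter;
import java.util.Hashtable;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical refactoring of the lookup logic in getRef.jsp.
class NodeLookup {
    private final Vector<Node> citations, annotations, submissions;
    private final Hashtable<String, Integer> appCache; // node ID -> index in its Vector

    NodeLookup(Vector<Node> citations, Vector<Node> annotations,
               Vector<Node> submissions, Hashtable<String, Integer> appCache) {
        this.citations = citations;
        this.annotations = annotations;
        this.submissions = submissions;
        this.appCache = appCache;
    }

    /** Returns the node for an ID such as "C429", or null if it is unknown. */
    Node find(String key) {
        if (key == null || key.isEmpty()) return null;
        Integer pos = appCache.get(key);
        if (pos == null) return null;
        Vector<Node> store;
        switch (key.charAt(0)) {
            case 'S': store = submissions; break;
            case 'A': store = annotations; break;
            case 'C': store = citations; break;
            default: return null;
        }
        return (pos >= 0 && pos < store.size()) ? store.get(pos) : null;
    }

    /** Serializes a node with an identity transform, omitting the XML declaration. */
    static String serialize(Node node) {
        try {
            Transformer trans = TransformerFactory.newInstance().newTransformer();
            trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            trans.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    /** Builds a one-citation store for demonstration purposes only. */
    static NodeLookup demo() {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element cite = d.createElement("cite");
            Element ident = d.createElement("ident");
            ident.setTextContent("C429");
            cite.appendChild(ident);
            d.appendChild(cite);
            Vector<Node> cites = new Vector<>();
            cites.add(cite);
            Hashtable<String, Integer> cache = new Hashtable<>();
            cache.put("C429", 0);
            return new NodeLookup(cites, new Vector<Node>(), new Vector<Node>(), cache);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(demo().find("C429")));
    }
}
```

With the lookup in a class, the JSP would only need to read the ref parameter, call find, and write either the serialized node or a noRef element.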
We want the application to support at least these actions: add, edit, change status and search. We do not anticipate much, if any, deletion from the repository. The add operation inserts either a bibliographic reference or an annotation. The change status operation (available to committers only) switches the status of an item from submitted to committed. The search operation searches the repository using XPath expressions.
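As a rough sketch of what the add operation involves for an in-memory store like the one used by getRef.jsp, the example below appends an item and records its position under a fresh ID. Everything here is an assumption made for illustration: the class CitationAdder, the counter used to mint IDs, and the use of strings in place of DOM nodes are not taken from the application. The point it tries to show is that the two updates belong together, so the method is synchronized as a whole even though Vector and Hashtable are individually thread-safe.

```java
import java.util.Hashtable;
import java.util.Vector;

// Hypothetical sketch of an "add" operation; strings stand in for DOM nodes.
class CitationAdder {
    private final Vector<String> citations = new Vector<>();
    private final Hashtable<String, Integer> appCache = new Hashtable<>();
    private int nextNumber = 1; // assumed ID counter, not from the paper

    /** Stores one citation and returns its newly assigned ID, e.g. "C1". */
    synchronized String add(String citationXml) {
        String id = "C" + nextNumber++;
        citations.add(citationXml);
        // The index must be recorded before another thread can append,
        // which is why the whole method holds the lock.
        appCache.put(id, citations.size() - 1);
        return id;
    }

    /** Returns the stored citation for an ID, or null if the ID is unknown. */
    synchronized String get(String id) {
        Integer pos = appCache.get(id);
        return pos == null ? null : citations.get(pos);
    }

    public static void main(String[] args) {
        CitationAdder repo = new CitationAdder();
        String id = repo.add("<cite>...</cite>");
        System.out.println(id + " -> " + repo.get(id));
    }
}
```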
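The way these operations are implemented depends on the implementation of the repository. We are considering three implementations: a custom data structure built from Java collections, a relational database, and a native XML database.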
We refer to these three versions as vdb, rdb and xdb, respectively. Since this is an academic project, we can afford to pursue all three implementations for a while before deciding which one will be the production version.
A major problem for all three versions is synchronization. In database versions, both relational and native XML, synchronization will be handled by the database. In the custom data structure version, synchronization is handled by the Java synchronized types, Vector and Hashtable. Here is how the search operation is implemented in vdb:
<%@ page errorPage="error.jsp"
import="javax.xml.transform.*,javax.xml.transform.stream.*,
javax.xml.transform.dom.*,org.w3c.dom.*"
%><jsp:useBean id="citations" class="java.util.Vector" scope="application"
/><jsp:useBean id="annotations" class="java.util.Vector" scope="application"
/><jsp:useBean id="submissions" class="java.util.Vector" scope="application"
/><jsp:useBean id="topicNames" class="java.util.Vector" scope="application"
/><jsp:useBean id="appCache" class="java.util.Hashtable" scope="application"
/><jsp:useBean id="sessCache" class="java.util.Hashtable" scope="session"
/><jsp:useBean id="refSet" class="java.util.Hashtable" scope="session"
/><jsp:useBean id="unm" class="java.lang.StringBuffer" scope="session"
/><%
if(unm.length()==0)
{ %><jsp:forward page="index.jsp"/><% } // not yet logged in.
%><%
boolean val=false;
String xPath=request.getParameter("XPath");
String refType=request.getParameter("type");
String action=request.getParameter("action");
if("clear".equals(action)){refSet.clear();}
else if("show".equals(action)){/* do nothing */}
else { // add or del
java.util.Vector xmlVec="annotations".equals(refType)?annotations:citations;
for(int i=0;i<xmlVec.size();i++){
Node srchNode=(Node)xmlVec.get(i);
String nodeVal=srchNode.getFirstChild().getFirstChild().getNodeValue();
if((val =org.apache.xpath.XPathAPI.
eval(srchNode,xPath,
new org.apache.xml.utils.PrefixResolverDefault(srchNode))
.bool()) )
{if("del".equals(action))
refSet.remove(nodeVal);
else refSet.put(nodeVal,new Boolean(val));
}
}
}
%><html><head><title>Search result</title></head>
<body><form><textarea rows="30" cols="60">
<% out.write("<citeDB_refSet>");
Transformer trans = (Transformer)sessCache.get("trans");
for(java.util.Enumeration e=refSet.keys();e.hasMoreElements() ;) {
String key=e.nextElement().toString();
String kval=refSet.get(key).toString();
int loc=( (Integer)appCache.get(key) ).intValue();
Node node=null;
try{
if(key.startsWith("S"))node=(Node)submissions.get(loc);
else if(key.startsWith("A"))node=(Node)annotations.get(loc);
else if(key.startsWith("C"))node=(Node)citations.get(loc);
}catch(Exception ex){
out.write(ex.toString()+"an appropriate error message");
}
if(node!=null){
out.write("<refSet key=\""+key+"\" val=\""+kval+"\">");
trans.transform(new DOMSource(node), new StreamResult(out));
out.write("</refSet>\n");
}
else {
out.write("<refSet key=\""+key+"\">");
out.write(kval);
out.write("</refSet>\n");
}
}
out.write("</citeDB_refSet>");
%></textarea></form>
</body>
</html>
To give an example, this implementation of the search operation will select all books by Gilmore via this XPath query:
contains(rdf:Description/dc:creator,"Gilmore")
The selected records will be added or deleted depending on the value of the action parameter.
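The core of this search, evaluating a boolean XPath expression against each citation node in turn, can also be tried outside a JSP container. The sketch below is an illustration only: it uses the standard javax.xml.xpath API instead of the Xalan XPathAPI class used in the listing, the class name CitationSearch and its methods are hypothetical, and the second sample record is an invented placeholder.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-alone version of the per-node XPath test in the search JSP.
class CitationSearch {
    static final Map<String, String> NS = new HashMap<>();
    static {
        NS.put("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
        NS.put("dc", "http://purl.org/dc/elements/1.1/");
    }

    // Maps the rdf and dc prefixes used in queries to their namespace URIs.
    static final NamespaceContext CTX = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            String uri = NS.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            for (Map.Entry<String, String> e : NS.entrySet())
                if (e.getValue().equals(uri)) return e.getKey();
            return null;
        }
        public Iterator<String> getPrefixes(String uri) {
            String p = getPrefix(uri);
            return (p == null ? Collections.<String>emptyList()
                              : Collections.singletonList(p)).iterator();
        }
    };

    // C429 is from the text; C430 is a made-up placeholder record.
    static final String SAMPLE =
        "<citeDB xmlns:rdf='" + NS.get("rdf") + "' xmlns:dc='" + NS.get("dc") + "'>"
      + "<cite><ident>C429</ident><rdf:Description>"
      + "<dc:creator>John Gilmore</dc:creator></rdf:Description></cite>"
      + "<cite><ident>C430</ident><rdf:Description>"
      + "<dc:creator>Example Author</dc:creator></rdf:Description></cite>"
      + "</citeDB>";

    /** Parses a citeDB document and returns its cite elements. */
    static List<Element> parseCites(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            NodeList nl = f.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getElementsByTagName("cite");
            List<Element> cites = new ArrayList<>();
            for (int i = 0; i < nl.getLength(); i++) cites.add((Element) nl.item(i));
            return cites;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse citeDB XML", e);
        }
    }

    /** Returns the ident of every citation for which the XPath expression is true. */
    static List<String> search(List<Element> cites, String expr) {
        XPath xp = XPathFactory.newInstance().newXPath();
        xp.setNamespaceContext(CTX);
        List<String> ids = new ArrayList<>();
        try {
            for (Element cite : cites) {
                if ((Boolean) xp.evaluate(expr, cite, XPathConstants.BOOLEAN))
                    ids.add(xp.evaluate("string(ident)", cite));
            }
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expr, e);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(search(parseCites(SAMPLE),
            "contains(rdf:Description/dc:creator,\"Gilmore\")"));
    }
}
```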
As of this writing (mid-October, 2001), the custom data structure version is the most advanced. By mid-December, we expect all three versions to support the basic operations, so we can test and compare performance. We hope that dbXML will prove to be the best choice. In mid-January, in collaboration with a domain expert, we will start a collaborative open-source courseware project for an Introduction to Anthropology course.