Abstract
XML editors can be divided into text editors and structure editors. In a structure editor, the user interacts with the document as an abstract tree of elements. In a text editor, the user interacts with a document as a sequence of characters or lines of text.
In a normal text editor, a user is not constrained in how they can modify the content of the document: any text can be inserted at any point and any range of text can be deleted. Preserving this characteristic in an XML editor, while providing useful support for XML editing and acceptable performance, presents some challenges.
A normal XML parser or validator starts at the beginning of the document, and processes the entire document until it reaches the end or possibly until it encounters an error. This kind of implementation is not useful for an XML editor. Completely reprocessing the document on every edit cannot scale to large documents. To solve this problem, XML processing must work incrementally: as the document is processed, additional information is recorded, so that when the document is subsequently modified, the necessary reprocessing is minimized.
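The idea of recording extra information during processing so that an edit triggers only local reprocessing can be illustrated with a minimal sketch. It is not one of the algorithms presented in this session: it assumes a deliberately simplified line-based scanner with only three states, it ignores `>` inside quoted attribute values, and every name in it (`scan_line`, `IncrementalScanner`, `replace_line`) is hypothetical. The scanner caches its state at the end of each line; after an edit it rescans from the edited line and stops as soon as the recomputed state matches the cached one.

```python
# Illustrative sketch only: a toy three-state scanner, not a real XML parser.
TEXT, TAG, COMMENT = "text", "tag", "comment"


def scan_line(state, line):
    """Return the scanner state after consuming one line, starting in `state`."""
    i = 0
    while i < len(line):
        if state == TEXT:
            if line.startswith("<!--", i):
                state, i = COMMENT, i + 4
            elif line[i] == "<":
                state, i = TAG, i + 1
            else:
                i += 1
        elif state == COMMENT:
            if line.startswith("-->", i):
                state, i = TEXT, i + 3
            else:
                i += 1
        else:  # TAG (simplification: '>' inside attribute values is not handled)
            if line[i] == ">":
                state = TEXT
            i += 1
    return state


class IncrementalScanner:
    """Caches the scanner state at the end of every line of the document."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.end_states = []
        state = TEXT
        for line in self.lines:
            state = scan_line(state, line)
            self.end_states.append(state)

    def replace_line(self, index, new_text):
        """Replace one line; return the number of lines that were rescanned."""
        self.lines[index] = new_text
        # Resume from the cached state just before the edited line.
        state = self.end_states[index - 1] if index > 0 else TEXT
        rescanned = 0
        for i in range(index, len(self.lines)):
            state = scan_line(state, self.lines[i])
            rescanned += 1
            if state == self.end_states[i]:
                break  # same state as before the edit: later lines are unaffected
            self.end_states[i] = state
        return rescanned
```

In this sketch, an edit that leaves the state at the end of its line unchanged costs a single line of rescanning, regardless of document length. An edit that opens an unterminated comment changes the state of every following line, so the rescan continues to the end of the document.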
Three kinds of XML processing will be addressed: XML 1.0 parsing, XML Namespaces processing and RELAX NG validation. This session will describe two algorithms that allow all three kinds of processing to be performed incrementally. These algorithms have been implemented for GNU Emacs entirely in Emacs Lisp. This is a particularly challenging environment, since the implementation of Emacs Lisp in GNU Emacs is much slower than typical implementations of the languages in which a text editor would usually be written, such as C++, Java or C#. Moreover, GNU Emacs lacks any support for multithreading.
Note that this work is also relevant to W3C XML Schemas, since, for the purposes of validation, W3C XML Schemas (minus integrity constraints) can be translated into RELAX NG schemas.