XML 2002

HTML Parsing in Java for Accessibility Transformations

Abstract

As Web use grows, many people who previously could not easily go online are now doing so in large numbers. Many of them have minor or major impairments that keep them from fully experiencing the full range of web sites. Disabilities ranging from the normal aging of vision, to moderate vision loss, to motor disabilities commonly limit what users can do on the web. While severe disabilities, especially blindness, have already been addressed, usually with special-purpose software or devices, people with mild to moderate disabilities can feel left out.

The Web Accessibility Initiative (http://www.w3.org/WAI/) and the W3C's Web Content Accessibility Guidelines (http://www.w3.org/TR/WCAG10/) explain to authors how to create accessible web content. However, not all web sites adhere to these guidelines, and those that do not require changes to be accessible. Even when the guidelines are followed, a page can often be made more accessible so that more users can experience its content fully.

As part of the Web Adaptation Technology Project [WA], we have developed browser extensions and server services that allow users to make dynamic changes to web pages based on their personal preferences. This paper focuses on dynamic visual changes to web sites, accomplished mainly via HTML/XML manipulation of the document, and on the problems encountered along the way. One problem is changing the text size to suit the user, which is difficult because web authors specify font sizes in many different ways. Another is color contrast: poor contrast can make web pages difficult to view for some users, and changing to a high-contrast color combination can help. However, some pages use transparent images, which can then become difficult to see, although this can be adjusted. While style sheets allow flexibility in rendering the page, the styles they specify can be challenging to determine while parsing a DOM. Examples of difficult web sites will be shown, different parsing solutions demonstrated, and recommendations for web authors made. Even when web pages do not meet the WAI or other accessibility guidelines, dynamic changes based on a user’s individual preferences can make the pages more accessible.
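To illustrate why font sizes are hard to change uniformly, the following minimal Java sketch converts several common ways of specifying a size (legacy `<font size>` values, CSS keywords, and `px`, `pt`, `em`, and `%` lengths) into pixels and then scales the result. It is not the project's implementation. The class name `FontSizeSketch`, the pixel tables, the base font level of 3, and the 96 dpi conversion are all assumptions chosen for illustration; real browsers may use different values.

```java
import java.util.Locale;

class FontSizeSketch {
    // Assumed pixel sizes for legacy <font size="1"> through <font size="7">.
    private static final double[] FONT_TAG_PX = {10, 13, 16, 18, 24, 32, 48};

    /** Converts one font-size specification to pixels, given the parent's size in pixels. */
    static double toPixels(String spec, double parentPx) {
        String s = spec.trim().toLowerCase(Locale.ROOT);
        switch (s) { // CSS absolute-size keywords; the pixel values are assumptions
            case "xx-small": return 9;
            case "x-small":  return 10;
            case "small":    return 13;
            case "medium":   return 16;
            case "large":    return 18;
            case "x-large":  return 24;
            case "xx-large": return 32;
            default: break;
        }
        if (s.endsWith("px")) return number(s, 2);
        if (s.endsWith("pt")) return number(s, 2) * 96.0 / 72.0; // assumes 96 dpi
        if (s.endsWith("em")) return number(s, 2) * parentPx;
        if (s.endsWith("%"))  return number(s, 1) / 100.0 * parentPx;

        // Otherwise treat the value as a legacy <font size> level: "3", "+1", "-2".
        // Signed values are taken relative to an assumed base level of 3.
        boolean relative = s.startsWith("+") || s.startsWith("-");
        int n = Integer.parseInt(s.startsWith("+") ? s.substring(1) : s);
        int level = relative ? 3 + n : n;
        level = Math.max(1, Math.min(7, level)); // clamp to the valid range 1..7
        return FONT_TAG_PX[level - 1];
    }

    /** Parses the numeric part of a length after dropping a unit of the given length. */
    private static double number(String s, int unitLength) {
        return Double.parseDouble(s.substring(0, s.length() - unitLength).trim());
    }

    /** Returns the specification rewritten as a scaled pixel size, e.g. for a style attribute. */
    static String scaled(String spec, double parentPx, double factor) {
        return Math.round(toPixels(spec, parentPx) * factor) + "px";
    }

    public static void main(String[] args) {
        String[] specs = {"3", "+1", "12pt", "80%", "1.2em", "small"};
        for (String spec : specs) {
            System.out.println(spec + " -> " + scaled(spec, 16.0, 1.5));
        }
    }
}
```

Normalizing every specification to one unit before scaling keeps the enlargement consistent across pages, at the cost of losing the author's relative sizing; a fuller treatment would also need to resolve sizes inherited from style sheets.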

Information about this joint project between IBM and SeniorNet [IBM-SNET] is available at: http://www.ibm.com/ibm/ibmgives/grant/helping/seniornet.shtml

This project is implemented in Internet Explorer using a Browser Helper Object (BHO) [BHO] [BHO2] [Roberts] written in Java, which gives the program access to the document object before it is rendered in the browser. The object model is an extension of the W3C DOM that permits the BHO to modify the document's rendering and to handle user interface events in the document. Our implementation uses Java-COM to call the Microsoft MSHTML [MSHTML] objects that allow querying and manipulating the HTML document. We compile the Java implementation into a DLL that is registered to be called by the browser.
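The kind of DOM walk such an extension performs can be sketched with the standard W3C DOM classes that ship with the JDK. This is only an approximation under stated assumptions: it does not use MSHTML or Java-COM, it requires well-formed XHTML input (a real browser document would not need that), and the class name `HighContrastSketch`, the white-on-black color choice, and the list of presentational attributes are hypothetical. The sketch removes color attributes and appends a high-contrast inline style to the affected elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class HighContrastSketch {
    // Assumed high-contrast preference: white text on a black background.
    static final String HIGH_CONTRAST = "color: #FFFFFF; background-color: #000000";
    // Presentational attributes that set colors directly in the markup.
    private static final String[] COLOR_ATTRIBUTES = {"bgcolor", "text", "color"};

    /** Parses well-formed XHTML into a W3C DOM document (stand-in for the browser's document). */
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    /** Walks the tree depth-first and returns the number of elements restyled. */
    static int applyHighContrast(Node node) {
        int changed = 0;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            // Always restyle <body>, plus any element that already carries inline styles.
            boolean needsOverride = element.hasAttribute("style")
                    || element.getTagName().equalsIgnoreCase("body");
            for (String name : COLOR_ATTRIBUTES) {
                if (element.hasAttribute(name)) {
                    element.removeAttribute(name);
                    needsOverride = true;
                }
            }
            if (needsOverride) {
                // Append rather than replace: later declarations override earlier ones.
                String old = element.getAttribute("style").trim();
                element.setAttribute("style",
                        old.isEmpty() ? HIGH_CONTRAST : old + "; " + HIGH_CONTRAST);
                changed++;
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            changed += applyHighContrast(child);
        }
        return changed;
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body bgcolor=\"#cccccc\" text=\"#999999\">"
                + "<p style=\"color: silver\">Hi <font color=\"gray\">there</font></p>"
                + "</body></html>");
        System.out.println("Elements restyled: " + applyHighContrast(doc));
        Element body = (Element) doc.getElementsByTagName("body").item(0);
        System.out.println("body style: " + body.getAttribute("style"));
    }
}
```

The sketch handles only inline attributes and styles. Colors set through linked or embedded style sheets, and transparent images that rely on the original background, would need separate handling.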

This paper describes some transformations that help persons with disabilities view web pages more easily based on their personal preferences, shows how some simple transformations can be done with existing browser tools, and shows how Web Adaptation Technology can make those and other changes more easily and consistently for the user. The DOM parsing needed for these transformations is shown using the Microsoft APIs associated with BHOs. Suggestions are included to help web authors make pages more accessible to more users, and more modifiable by technologies such as the Web Adaptation Technology.