Abstract
Electrocardiogram (ECG) data have traditionally been generated by multiple software applications on various platforms. Furthermore, local data storage and distribution use different formats and structures. Data modelling and distribution tasks should rely on flexible and inexpensive tools that enhance the pattern recognition and visualisation capabilities of humans and machines. There is an increased need to promote the development of standards in order to support the seamless exchange and migration of ECG data as well as their native integration into Electronic Patient Records (EPR) and medical guidelines. Such models should be platform-independent, flexible and open to the scientific community. In the case of ECG data interpretation, an important prerequisite is a comprehensive data description independent of the number of channels, instrumentation platform and type of experiment. Additionally, an ECG record should include annotations relating to the acquisition protocols, patient information and analysis results.
The Food and Drug Administration (FDA), Center for Drug Evaluation and Research, has proposed recommendations for the exchange of time-series data. The projected standard consists of a hierarchical data structure for the representation of signals, including ECG data, which ideally would be encoded in an XML file. Recent advances include I-Med, which is an XML-based format for clinical data bundled with a domain-independent interface for exchanging several types of medical information. Its major goal is to provide a unique platform for clinical transactions. I-Med messages can include ECG records, which may be described by basic features, such as QRS duration (i.e. the time interval necessary for ventricular depolarization) and text-based interpretations.
This paper discusses the minimum set of information needed for a meaningful representation and storage of electrocardiogram signals. It has been synthesized from existing recommendations and compiled into an XML schema (ecgML). The accompanying application comfortably supports medical tasks such as pattern recognition and identification of relevant wave markers. More recently, eXtensible Stylesheet Language (XSL) transformations have been developed to convert "raw" ecgML files into data mining formats such as MatLab (for further analysis), Scalable Vector Graphics (SVG) (for graphical visualisation) and audio formats. Thus, ecgML is a useful tool to facilitate the representation, exchange and interpretation of ECG information.
Electrocardiography is one of the most important non-invasive diagnostic methods, which can be performed at a low cost and allows the early recognition of coronary heart disease. In today’s distributed healthcare environment, ECG data are commonly acquired, stored and analysed using different formats and software platforms. Various alternatives used for the management and exchange of ECG data still exist. Medical informatics will fully exploit the benefits from its research only when data can be openly shared and interpreted. There is an increasing need to develop cross-platform solutions to support biomedical training, decision-making and telemedicine applications [1].
ECG data have traditionally been recorded using flat file formats, such as those of the Massachusetts Institute of Technology and Beth Israel Hospital (MIT-BIH) file library [2]. This type of data format lacks the information necessary to support meaningful analysis, interoperability and integration of multiple resources. In 1980, a large international project sponsored by the European Commission was launched to develop Common Standards for Quantitative Electrocardiography (CSE). The main findings of the first CSE study include standardization of the computerised definitions of waves and of the references for the beginning and end points of the inter-wave components of the ECG [3]. In 1993, a Comité Européen de Normalisation Technical Committee 251 (CEN/TC251) project team developed the Standard Communications Protocol for Computer-assisted Electrocardiography (SCP-ECG) [4]. The standard is relatively well established for the interchange, encoding and storage of digital ECG data. The data level in the standard includes the ECG signal, patient demographics and ECG administrative data, as well as measurement and interpretation results. Although this standard is supported by many manufacturers of ECG equipment, the utilisation of SCP-ECG has demonstrated some disadvantages:
In 1987, Health Level Seven (HL7) was founded to develop standards for the electronic interchange of clinical, financial and administrative information among independent healthcare-oriented computer systems. This standard currently addresses the interfaces among various systems that send or receive patient admission/registration, queries, orders, results, clinical observations, billing, and master file update information. The HL7 integration approach focuses on synchronizing the databases of multiple application systems. As a de facto standard for the electronic exchange of clinical and administrative data, an HL7 message is able to represent ECG waveforms, measurements, a computer analysis of the waveform, and demographic data. The main problems of HL7 include [5]:
The Digital Imaging and Communications in Medicine (DICOM) standard was originally created as a protocol for image data exchange. In response to demand from the ECG community, and for cases where biosignals are collected in connection with a medical imaging procedure, Supplement 30 on waveforms was developed to integrate waveform storage into DICOM [6]. This includes ECG, electrophysiological and hemodynamic curve data. However, implementing this format requires an understanding of the DICOM philosophy, which cannot be obtained by reading Supplement 30 alone.
Since its adoption as a World Wide Web Consortium (W3C) recommendation in 1998, XML and a number of related W3C recommendations have been shaping the future of the web, providing simple, elegant, and scalable interoperability solutions [7]. Developed as a subset of Standard Generalized Markup Language (SGML) in 1996 to "be straightforwardly usable over the Internet", XML soon became a ubiquitous syntax for data and data exchange over the Internet and presents new opportunities for the representation and exchange of clinical information. As a result, committees within healthcare standardization organizations such as CEN/TC251, HL7 and the American Society for Testing and Materials (ASTM) are currently working on recommendations for the use of XML in healthcare. The use of XML syntax for the exchange of electronic patient records was shown in all its aspects in Synapses [8] and SynEx [9] and their implementations [10] [11]. Synapses concentrated on the specification of a Federated Healthcare Record (FHCR) server (including data model and format definition), which provides integrated access to a record’s distributed components. The SynEx project concerned integrating a number of components to form an information system from which client applications could access a wide range of data in support of the healthcare business [12]. However, the detailed description of specific data models (e.g. for ECG data) was not part of these projects.
The FDA Center for Drug Evaluation and Research has proposed recommendations for the exchange of time-series data. They include a hierarchical structure for the representation of signals, including ECG data, which may be encoded as an XML file. This protocol focuses on the acquisition of multiple records from different subjects within a single file [13]. The most recent document specifying the XML data format for ECG was issued in the middle of April 2002. However, the data model, specified in a Document Type Definition (DTD) [14], includes presentation information, such as the elements MinorTickInterval, MajorTickInterval and LogScale, which we believe should be kept outside the data model in order to follow the principle of separating content and presentation information.
Recent advances include I-Med, which is an XML-based format for clinical data [15]. This project consists of a domain-independent interface for exchanging several types of medical information. Its major goal is to provide a unique platform for clinical transactions. These messages can include ECG records, which may be described by basic features, such as QRS duration and text-based interpretations. One major limitation of this solution is that it only partially addresses important ECG data content definitions.
There is a need to harmonise the representation of digital ECG data originating from the full spectrum of devices along with annotations for events, and to include necessary associated information, such as patient identification, interpretation and other clinical data. The following hierarchical structure is proposed to address such concerns. In this paper, terms printed in bold italic represent either XML element or attribute names. Element names are made of concatenated words with the first letter of each word capitalised ("upperCamelCase"). Attribute names follow the same rule except that the first word starts with a lowercase letter ("lowerCamelCase").
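The naming convention can be checked mechanically. The following Python sketch is illustrative only: the helper names are hypothetical, and it assumes that names consist of ASCII letters and digits, which the paper does not state.

```python
import re

# Assumption: ecgML names contain only ASCII letters and digits.
UPPER_CAMEL = re.compile(r"^[A-Z][A-Za-z0-9]*$")
LOWER_CAMEL = re.compile(r"^[a-z][A-Za-z0-9]*$")


def is_element_name(name):
    """Return True if the name starts with a capital letter (upperCamelCase)."""
    return bool(UPPER_CAMEL.match(name))


def is_attribute_name(name):
    """Return True if the name starts with a lowercase letter (lowerCamelCase)."""
    return bool(LOWER_CAMEL.match(name))


# Names taken from the text: ECGRecord is an element, studyID an attribute.
print(is_element_name("ECGRecord"), is_attribute_name("studyID"))
```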
Each patient record starts with a root element ECGRecord, which is uniquely identified by its attribute studyID. The StudyDate and StudyTime elements represent the date and time of the latest study of the ECG recording. Diagnosis contains a text version of the latest diagnostic interpretation of the ECG, while MedicalHistory is a description of the patient's clinical problems and diagnoses. There are two main components for each record: one PatientDemographics element and one-or-more Record elements. The tree diagram of the ECGRecord element is given in Figure 1.
PatientDemographics contains information of general interest concerning the person from whom the recording is obtained, such as demographic data (e.g. patientID, Name, etc.) and contact information (e.g. Address, etc.). This component is required in each record.
Record, shown in Figure 2, represents the physical storage for the basic content of an ECG recording. The AcquisitionDate and AcquisitionTime elements specify the date and time at which the record was taken. investigatorID and siteID are used to identify the person and institution responsible for the recording. There are three main components: zero-or-one RecordingDevice, zero-or-one ClinicalProtocol, and one-or-more RecordData.
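As an illustration of the top-level structure described above, the following Python sketch builds a skeletal record with the standard library. It is not taken from the ecgML schema itself: the order of child elements, the placement of patientID as an attribute of PatientDemographics (inferred from the naming convention), the date and time formats, and all values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_ecg_record(study_id, patient_id, patient_name):
    """Build a skeletal ECGRecord tree using element names from the text."""
    root = ET.Element("ECGRecord", {"studyID": study_id})
    # Placeholder values; the date and time formats are assumptions.
    ET.SubElement(root, "StudyDate").text = "2002-04-15"
    ET.SubElement(root, "StudyTime").text = "10:30:00"
    ET.SubElement(root, "Diagnosis").text = "placeholder diagnosis text"
    ET.SubElement(root, "MedicalHistory").text = "placeholder history text"

    # One PatientDemographics element is required in each record.
    demographics = ET.SubElement(
        root, "PatientDemographics", {"patientID": patient_id}
    )
    ET.SubElement(demographics, "Name").text = patient_name
    ET.SubElement(demographics, "Address").text = "placeholder address"

    # One or more Record elements follow; a single empty one is shown.
    ET.SubElement(root, "Record")
    return root


record = build_ecg_record("study-001", "patient-001", "Jane Doe")
print(ET.tostring(record, encoding="unicode"))
```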
RecordingDevice describes the device that generated the data, while ClinicalProtocol may include information relating to a patient’s clinical report. RecordData is a key ecgML element. There can be multiple RecordData elements within a file, which are identified by their Channel element names. The DICOM lead labelling format is recommended for this purpose [6]. RecordData includes three main sub-components: Waveforms, Annotations and Measurements. The corresponding tree diagrams are illustrated in Figure 3, Figure 4 and Figure 5.
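Because each RecordData element is identified by its channel, a consumer will typically index them by lead. The sketch below assumes that Channel is a child element of RecordData whose text holds the lead label, and that investigatorID and siteID are attributes of Record; the labels "I" and "II" are placeholders rather than verified DICOM codes.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the nesting and values are assumptions.
SAMPLE_RECORD = """
<Record investigatorID="inv-01" siteID="site-01">
  <AcquisitionDate>2002-04-15</AcquisitionDate>
  <AcquisitionTime>10:30:00</AcquisitionTime>
  <RecordData><Channel>I</Channel></RecordData>
  <RecordData><Channel>II</Channel></RecordData>
</Record>
"""


def record_data_by_channel(record):
    """Map each channel label to its RecordData element."""
    index = {}
    for record_data in record.findall("RecordData"):
        label = (record_data.findtext("Channel") or "").strip()
        if label in index:
            raise ValueError(f"duplicate channel label: {label!r}")
        index[label] = record_data
    return index


channels = record_data_by_channel(ET.fromstring(SAMPLE_RECORD))
print(sorted(channels))
```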
Based on the FDA-recommended PlotGroup format [14], Waveforms are represented by a series of values along two dimensions, X and Y (XValues and YValues). Annotations would typically be used to describe events specific to the corresponding channel. Each annotation defines a time point or interval, which can be used for performing the measurements. The Annotations element consists of a collection of PointNotation and WaveNotation elements. The Measurements element contains a list of Values (i.e. the measurements of each recorded channel). Each Values element may be associated with a label and a measurement unit.
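The following Python sketch shows how the waveform samples and measurements of one channel might be read. The encoding of XValues and YValues as whitespace-separated numbers, the label and unit attributes on Values, and all sample numbers are assumptions made for illustration; the schema may encode them differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with made-up sample values.
SAMPLE_RECORD_DATA = """
<RecordData>
  <Waveforms>
    <XValues>0.000 0.004 0.008 0.012</XValues>
    <YValues>0.05 0.90 -0.20 0.02</YValues>
  </Waveforms>
  <Measurements>
    <Values label="QRSduration" unit="ms">92</Values>
  </Measurements>
</RecordData>
"""


def read_samples(record_data):
    """Return the waveform as a list of (x, y) pairs."""
    xs = [float(v) for v in (record_data.findtext("Waveforms/XValues") or "").split()]
    ys = [float(v) for v in (record_data.findtext("Waveforms/YValues") or "").split()]
    if len(xs) != len(ys):
        raise ValueError("XValues and YValues differ in length")
    return list(zip(xs, ys))


def read_measurements(record_data):
    """Return (label, unit, value) for every Values element."""
    return [
        (values.get("label"), values.get("unit"), float(values.text))
        for values in record_data.iterfind("Measurements/Values")
    ]


record_data = ET.fromstring(SAMPLE_RECORD_DATA)
samples = read_samples(record_data)
measurements = read_measurements(record_data)
print(len(samples), measurements)
```

Checking that both value lists have the same length catches truncated records before any analysis is attempted.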
As mentioned earlier, the FDA, together with a number of other institutions, has developed and published an XML vocabulary [13] [14] to represent collected time-series data. However, there are some significant differences between the FDA proposal and ecgML. The FDA proposal is intended to represent collected biological data, including ECG, electroencephalogram (EEG), or other time-series data such as temperature, pressure and oxygen saturation. The main goal is to facilitate the submission of the biological data and to ensure the accuracy and consistency of the measurements made from them. It is important for the FDA to view the biological data in an appropriate way. Thus, the data model (specified in a DTD) includes some presentation information, including elements such as MinorTickInterval, MajorTickInterval and LogScale. On the other hand, ecgML is specific to ECG signals. There are some elements directly related to ECG waveforms, e.g. the elements Pwave, QRSwave and Twave. The purpose of ecgML is to develop an open and transparent way of representing, exchanging and mining ECG data. Therefore, ecgML not only includes some important components that may be used to perform knowledge discovery in ECG data (e.g. ClinicalProtocol, Diagnosis and Measurements) but also follows the principle of separating content and presentation information, which offers great advantages when ecgML is used in combination with inter-media transformations (see below).
A series of tools is being developed to assist users in exploiting ecgML-based applications. These include an XML-based ECG record generator, an ECG parser and an ECG viewer. The generator will automatically produce XML-based ECG records from existing ECG databases, e.g. the MIT-BIH database [2]. The ECG parser allows the user to read ECG records and access their contents and structure, whereas the ECG viewer provides an on-screen display of the corresponding waveform data, as shown in Figure 6. It shows all annotation information for the individual waveform. The hierarchical structure of the XML-based ECG record, including every element and attribute, is displayed on the left-hand side and can be expanded and collapsed at any level. The right-hand side shows an individual part of the ECG waveform chosen from the ecgML structure. The viewer graphically locates the boundaries (i.e. beginning, peak and end) of the P, QRS and T waveforms for each selected QRS complex.
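The hierarchical view offered by the viewer can be approximated in a few lines. This Python sketch is not the authors' parser or viewer; it only walks a hypothetical ecgML-like fragment and lists every element with its attribute names, indented by depth.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; names follow the text, nesting is assumed.
SAMPLE = (
    '<ECGRecord studyID="study-001">'
    '<PatientDemographics patientID="patient-001"><Name>Jane Doe</Name>'
    "</PatientDemographics>"
    "<Record><RecordData><Channel>I</Channel></RecordData></Record>"
    "</ECGRecord>"
)


def outline(element, depth=0):
    """Return one indented line per element, with attribute names marked by @."""
    attributes = "".join(f" @{name}" for name in element.attrib)
    lines = ["  " * depth + element.tag + attributes]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(SAMPLE))
print("\n".join(tree_lines))
```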
Based on the advantages of XML technologies, ecgML can provide a system-, application- and format-independent solution for the representation and exchange of ECG data. Moreover, a distinct separation of content and presentation (among other components such as links and semantics) offers a remarkable advantage over existing systems, where information is merged and intertwined with its representation format. Figure 7 exemplifies a scenario where the raw ECG data are kept in an ecgML data file and are therefore independent of any presentation information. Various XSL Transformations (XSLT) (stored as XSL files and applied on the fly, transparent to the user) convert the ecgML source into user- and/or application-specific data formats, such as Moving Picture Experts Group (MPEG) (audio), MatLab (text) and SVG/Portable Network Graphics (PNG) (graphics). The centralised storage of the ECG record and the dynamic creation of data representations avoid redundancy.
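The idea of deriving a presentation format from presentation-free data can be sketched as follows. Note that this example uses plain Python in place of XSLT and is not one of the stylesheets described above; the function name, the default canvas size and the scaling are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def waveform_to_svg(xs, ys, width=400, height=100):
    """Render (x, y) samples as an SVG polyline scaled to the canvas."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two samples")
    x_min, y_min = min(xs), min(ys)
    # Fall back to 1.0 to avoid dividing by zero for a flat signal.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    points = []
    for x, y in zip(xs, ys):
        px = (x - x_min) / x_span * width
        # SVG y coordinates grow downwards, so the signal is flipped.
        py = height - (y - y_min) / y_span * height
        points.append(f"{px:.2f},{py:.2f}")
    svg = ET.Element(
        "svg", {"xmlns": SVG_NS, "width": str(width), "height": str(height)}
    )
    ET.SubElement(
        svg, "polyline",
        {"points": " ".join(points), "fill": "none", "stroke": "black"},
    )
    return ET.tostring(svg, encoding="unicode")


svg_text = waveform_to_svg([0, 1, 2], [0, 1, 0])
print(svg_text)
```

All layout decisions (canvas size, stroke colour, scaling) live in the conversion step, so the stored samples stay free of presentation information.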
The data and metadata contained in an ecgML record may be useful for improving pattern recognition in ECG applications. They would also aid the implementation of automated decision support models such as case-based reasoning. The proposed ecgML may also be significant for problems such as future-proof storage, context-sensitive (textual) search of patterns in ECG data, and the native inclusion of ECG data in medical guidelines. Figure 8 illustrates the utilisation of map files to convert "raw" ecgML files into different tabular data, which can be imported into data mining systems for further analysis.
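A conversion to tabular data might look like the sketch below. The format of the map files is not described here, so a Python dictionary from column names to element paths stands in for one; the dictionary, the label value "QRSduration" and the numbers are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a map file: column name -> path inside RecordData.
COLUMN_MAP = {
    "channel": "Channel",
    "qrs_duration_ms": "Measurements/Values[@label='QRSduration']",
}

# Hypothetical fragment with made-up measurement values.
SAMPLE_RECORD = """
<Record>
  <RecordData>
    <Channel>I</Channel>
    <Measurements><Values label="QRSduration" unit="ms">92</Values></Measurements>
  </RecordData>
  <RecordData>
    <Channel>II</Channel>
    <Measurements><Values label="QRSduration" unit="ms">96</Values></Measurements>
  </RecordData>
</Record>
"""


def record_to_csv(record, column_map):
    """Write one CSV row per RecordData element, following the column map."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(column_map))
    writer.writeheader()
    for record_data in record.iterfind("RecordData"):
        writer.writerow({
            column: (record_data.findtext(path) or "").strip()
            for column, path in column_map.items()
        })
    return buffer.getvalue()


table = record_to_csv(ET.fromstring(SAMPLE_RECORD), COLUMN_MAP)
print(table)
```

Swapping in a different column map yields a different table from the same source file, which is the role the map files play in Figure 8.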
ecgML will enable the seamless integration of ECG data into Electronic Patient Records and medical guidelines. This protocol can support data exchange between different ECG acquisition and visualisation devices. The accompanying application comfortably supports medical tasks such as pattern recognition and identification of relevant wave markers. Separating content from presentation information has proven very successful: ECG data stored in ecgML can be delivered in customised output formats to suit different devices and applications. Thus, ecgML is a useful tool to facilitate the representation, exchange and interpretation of ECG information. Further research will address the following issues.
How does ecgML affect storage capacity?
Does on-the-fly compression (as used by Hypertext Transfer Protocol (HTTP) 1.1) make a difference in terms of transmission speed?
Is it feasible to use ecgML in applications such as 24 hour monitoring?
Does ecgML data contain all the significant information required for ECG analysis?
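The storage and compression questions can be approached empirically. The Python sketch below only shows how such a measurement could be set up with gzip, one of the content codings commonly used with HTTP 1.1; the document it compresses is synthetic, so the printed ratio says nothing about real ECG recordings.

```python
import gzip


def compression_ratio(xml_text):
    """Return compressed size divided by original size (UTF-8 bytes)."""
    raw = xml_text.encode("utf-8")
    return len(gzip.compress(raw)) / len(raw)


# Synthetic, highly repetitive sample values; not real ECG data.
synthetic_values = " ".join(f"{(i % 50) / 100:.2f}" for i in range(5000))
document = (
    "<RecordData><Waveforms><YValues>"
    + synthetic_values
    + "</YValues></Waveforms></RecordData>"
)

ratio = compression_ratio(document)
print(f"compressed size is {ratio:.1%} of the original")
```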
[1] A. Värni, B. Kemp, T. Penzel, A. Schlögl: Standards for biomedical signal databases. IEEE Engineering in Medicine and Biology 2001, 20(3): 33-37.
[2] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. Ch. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C. K. Peng, H. E. Stanley: PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation, 2000 (June 13), 101(23): e215-e220. [http://circ.ahajournals.org/cgi/content/full/101/23/e215].
[3] J. L. Willems, P. Arnaud, J. H. van Bemmel, R. Degani, P. W. Macfarlane, Chr. Zywietz: Common Standards for Quantitative Electrocardiography: Goals and Main Results. Methods of Information in Medicine, 1990, 29: 263-271.
[4] ENV 1064 Standard communications protocol for computer-assisted electrocardiography. European Committee for Standardisation (CEN), Brussels, Belgium, 1996.
[5] J. Sable: The HL7 RIM (Reference Information Model) [http://hmi.missouri.edu/Course_Materials/Residential_Informatics/semesters/W2000_Materials/401_hales/hl7_1.ppt]
[6] DICOM Suppl. 30, Waveform interchange, Nat. Elect. Manufacturers Assoc.: ARC-NEMA, Digital Imaging and Communications, NEMA, Washington D.C., 1999.
[7] Extensible Markup Language (XML) [http://www.w3.org/xml/].
[8] Synapses Homepage [http://www.cs.tcd.ie/synapses/public/].
[9] SynEx Homepage [http://www.gesi.it/synex/].
[10] B. Jung, J. Grimson: Synapses/SynEx goes XML, In Proceedings of the Medical Informatics Europe '99 Conference, 1999; Technology and Informatics, 68, IOS press, Amsterdam, 1999: 906-911.
[11] B. Jung, E. P. Andersen, J. Grimson: Using XML for Seamless Integration of Distributed Electronic Patient Records. In Proceedings of XML Scandinavia 2000 conference, Gothenburg, Sweden, May 2000.
[12] J. Grimson, G. Stephens, B. Jung, W. Grimson, D. Berry, S. Pardon: Sharing Health-Care Records over the Internet. IEEE Internet Computing, May/June 2001, 5(3): 49-58.
[13] FDA application: Proposed Standard for Exchange of Electrocardiographic and Other Time-Series Data [http://www.fda.gov/cder/regulatory/ersr/ECGdata.htm].
[14] FDA XML Data Format Design Specification [http://www.cdisc.org/discussions/EGC/FDA%20_XML_Data_Format_Design_Specification_DRAFT_C.pdf].
[15] I-Med Homepage [http://www.hnbe.com/healthweb/imedpub/].
American Society for Testing and Materials
Comité Européen de Normalisation Technical Committee 251
Common Standards for Quantitative Electrocardiography
Digital Imaging and Communications in Medicine
Document Type Definition
Electrocardiogram
electroencephalogram
Electronic Patient Records
European Society for Engineering and Medicine
Food and Drug Administration
Federated Healthcare Record
Health Level Seven
Hypertext Transfer Protocol
Institute of Electrical and Electronics Engineers
Massachusetts Institute of Technology and Beth Israel Hospital
Moving Picture Experts Group
Northern Ireland Bioengineering Centre
Portable Network Graphics
Standard Communications Protocol for Computer-assisted Electrocardiography
Standard Generalized Markup Language
Scalable Vector Graphics
Trinity College Dublin
World Wide Web Consortium
eXtensible Markup Language
eXtensible Stylesheet Language
XSL Transformations