Extensible Markup Language (XML) was first designed as a complete, platform-independent and system-independent environment for the delivery and authoring of information resources over the World Wide Web (hereinafter, “Web”). XML was intended to supplement and in some cases replace Hypertext Markup Language (HTML), which had been the prevalent method of authoring and referencing content over the Web.
XML is a set of technologies that define a universal data format for tree-based, hierarchically formed information. A number of new specifications extending its range and power, such as Extensible Stylesheet Language (XSL), Document Object Model (DOM), and XSL Transformations (XSLT), have been or are being developed. XML offers the advantages of platform independence and Web awareness, and many XML tools are open source and freely available. Thus XML technologies can provide a simple and low-cost solution for enterprise-wide access to clinical information, including medical reports.
Because XML is used to describe information as well as structure, it is particularly well suited as a data description language. One of XML's particular strengths is that it allows entire industries, academic disciplines, and professional organizations to develop sets of Document Type Definitions (DTDs) and Schemas that can serve to standardize the representation of information within those disciplines. Given a set of DTDs and Schemas, content material that is modeled in conformance with the DTDs and Schemas can be processed by applications that are developed for these DTDs and Schemas.
A further advantage of the use of XML is the wealth of tools that are available for the processing of XML-compatible data. Of particular significance, the “Extensible Stylesheet Language” (XSL) is a language for expressing stylesheets, and the “XSL Transformations” (XSLT) is a language for transforming XML documents into other documents, using stylesheets.
To facilitate a uniform understanding of an XML encoding of medical reports, it is necessary to define a DTD for the reports. A DTD is used to describe the permissible elements and attributes in an XML document, primarily in terms of structures and restrictions of “document-like” objects such as articles and books. Such a DTD has been derived from a Unified Modeling Language (UML) model of the Digital Imaging and Communications in Medicine (DICOM) Structured Reporting (SR) information model. The DICOM SR is based on a relational data technology, and has been standardized by the National Electrical Manufacturers Association (NEMA); see Supplement 23: Structured Reporting Storage SOP Classes to the DICOM Standard, published by the DICOM Standards Committee, 1300 N. 17th Street, Rosslyn, Va. 22209 USA, which is incorporated by reference herein.
The DICOM SR standard, and the SR Documentation Model upon which it is based, improves the expressiveness, precision, and comparability of documentation of diagnostic images and waveforms. DICOM SR supports the interchange of expressive compound reports in which the critical features shown by images and waveforms can be denoted unambiguously by the observer, indexed, and retrieved selectively by subsequent reviewers. Findings may be expressed by the observer as text, codes, and numeric measurements, or via location coordinates of specific regions of interest within images or waveforms, or references to comparison images, sound, waveforms, curves, and previous report information. The observational and historical findings recorded by the observer may include any evidence referenced as part of an interpretation procedure. Thus, DICOM SR supports not only the reporting of diagnostic observations, but the capability to document fully the evidence that evoked the observations. This capability provides significant new opportunities for large-scale collection of structured data for clinical research, training, and outcomes assessment as a routine by-product of diagnostic image and waveform interpretation, and facilitates the pooling of structured data for multi-center clinical trials and evaluations.
Methods and systems have been developed for transforming the DICOM SR specification into a UML model to facilitate an understanding of the DICOM SR by non-DICOM systems analysts and system designers (see copending U.S. patent application “UML MODEL AND XML REPRESENTATIONS OF DIGITAL IMAGING AND COMMUNICATIONS IN MEDICINE STRUCTURED REPORTS (DICOM SR)”, Ser. No. 09/686,401, filed 10 Oct. 2000 for Alfredo Tirado-Ramos, Jingkun Hu, and Yasser alSafadi, and incorporated by reference herein). A conversion system that converts DICOM SR information from the DICOM relational model into an XML representation has been created. By providing a mapping between DICOM SR and XML, the DICOM SR content material can be more easily processed by application programs that are DICOM-specific, such as medical analysis programs, as well as by application programs that are not DICOM-specific, such as routine clerical or data-management programs.
A medical report must satisfy a number of constraints contained in the DICOM SR specification. Constraints can take the form of specifying the maximum and minimum values for a given field or requiring a field to be present if some other field has certain values. Document Type Definitions (DTDs), as used in XML documents, unfortunately are extremely limited in their capability to specify these constraints conveniently. Constraints can be expressed with a general purpose programming language such as C or Java. However, since these languages are procedural in nature, code will have to be compiled, linked and executed in order to check the constraints. This departs from the declarative nature of an XML document.
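To make the contrast concrete, the following is a minimal sketch of how such constraints might be checked procedurally, here in Python rather than C or Java. The element name `item`, the attribute `valueType`, the child element `value`, and the range limits are all hypothetical and are not taken from the DICOM SR specification. The sketch checks one range constraint and one conditional-presence constraint.

```python
import xml.etree.ElementTree as ET

# Hypothetical report fragment; element and attribute names are illustrative
# only and are not taken from the DICOM SR specification.
REPORT = """
<report>
  <item valueType="NUM"><value>42</value></item>
  <item valueType="NUM"></item>
  <item valueType="TEXT"><text>No abnormality seen.</text></item>
</report>
"""


def check_report(xml_text, low=0.0, high=100.0):
    """Return a list of constraint violations found in the report."""
    errors = []
    root = ET.fromstring(xml_text)
    for index, item in enumerate(root.findall("item"), start=1):
        # Only numeric items are subject to the two constraints below.
        if item.get("valueType") != "NUM":
            continue
        value = item.find("value")
        # Conditional presence: a NUM item must carry a <value> child.
        if value is None:
            errors.append(f"item {index}: NUM item has no value")
            continue
        try:
            number = float(value.text)
        except (TypeError, ValueError):
            errors.append(f"item {index}: value is not numeric")
            continue
        # Range check: the value must lie between low and high (assumed limits).
        if not low <= number <= high:
            errors.append(f"item {index}: value {number} out of range")
    return errors


if __name__ == "__main__":
    for message in check_report(REPORT):
        print(message)
```

In this form the constraints are buried in control flow, so changing a constraint means changing and redeploying code rather than editing a declarative document.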
XML Schema, recently approved as a Recommendation by the World Wide Web Consortium (W3C), allows rich structure and data type definition (among others) in XML documents, providing more expressive power. “Rich structure” refers to an abundance of detail regarding the attributes and constraints of the fields encoded. Copending U.S. patent application Ser. No. 09/818,716, “DICOM XML DOCUMENT TYPE DEFINITION (DTD) AND SCHEMA GENERATOR”, filed 27 Mar. 2001 for Jingkun Hu and Kwok Pun Lee, discloses a system and method that facilitate the creation of XML Document Type Definitions (“DTDs”) and XML Schemas that correspond to the DICOM SR standard.
It is relatively straightforward to express constraints involving a single element of a DICOM information object definition (IOD) with XML Schema. For example, the maximum length of a string can be easily constrained. An example of how this can be done is explained later. However, the definition of an IOD also has a number of constraints that cannot be easily expressed with XML Schema, in particular those involving multiple elements in an IOD, such as a constraint that an element must be present if another element has a specified value.
Thus, there is a need for a way to express these constraints using the same XML syntax in a declarative manner, using tools such as Schematron, which was designed to extend the expressive power of XML Schema in specifying constraints. Schematron is a declarative assertion language using XML syntax, developed by Rick Jelliffe, a member of the W3C XML Schema Working Group. It consists of a set of rules that use expressions in XPath, another W3C Recommendation, to specify relationships between different elements. It is rule-based, in contrast to XML Schema, which is grammar-based. Schematron has radically different strengths from XML Schema and is in fact highly complementary to it.
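The rule-based idea can be illustrated with a small sketch. This is not Schematron itself: it is a simplified stand-in, written in Python, that keeps the rules as data (a context path, a test path, and a message) and evaluates them with the limited XPath subset supported by the standard `xml.etree.ElementTree` module. The element and attribute names (`item`, `valueType`, `value`, `text`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Rules kept as data rather than code: (context path, test path, message).
# In Schematron, the first rule would be written roughly as
#   <rule context="item[@valueType='NUM']">
#     <assert test="value">a NUM item must contain a value</assert>
#   </rule>
# All element and attribute names here are hypothetical.
RULES = [
    (".//item[@valueType='NUM']", "value", "a NUM item must contain a value"),
    (".//item[@valueType='TEXT']", "text", "a TEXT item must contain text"),
]

DOCUMENT = """
<report>
  <item valueType="NUM"><value>42</value></item>
  <item valueType="NUM"></item>
  <item valueType="TEXT"><text>No abnormality seen.</text></item>
</report>
"""


def validate(xml_text, rules=RULES):
    """Return the message of every assertion that fails for some context node."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, test, message in rules:
        # Every node selected by the context path must satisfy the test path.
        for node in root.findall(context):
            if node.find(test) is None:
                failures.append(message)
    return failures


if __name__ == "__main__":
    for message in validate(DOCUMENT):
        print(message)
```

Because the rules are plain data, a cross-element constraint such as "a value must be present when the type is NUM" can be added or changed without touching the checking code, which is the property the declarative approach is after.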
A set of Schematron rules is written to express constraints that cannot otherwise be specified with XML Schema. This set of rules is transformed automatically through a meta-stylesheet to produce an XSLT stylesheet which can then be run against a given XML document to ensure that the constraints are satisfied. This is a well-known procedure and tools are available to perform this step.