Today, clinical test data and similar data are generated using ORACLE or SQL databases and are updated daily. Such data, however, lacks openness, which makes it difficult to transfer the data and to expand a data system. Hence, the prevailing data format is gradually shifting to XML, which offers superior openness.
International Publication Pamphlet No. WO 2006-123448 discloses an information retrieval program for carrying out compression, encoding, and full-text retrieval of HTML format content.
If data having a complicated structure, such as clinical test data, is converted into XML data, the resulting XML data includes a large amount of tag information and has a file size from several times up to 20 times the original file size. When such an XML file is searched, the XML tag character strings are longer than the numerical values or character strings to be retrieved, which degrades retrieval performance.
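The size increase described above can be illustrated with a minimal sketch. The record, its field names, and the helper functions `to_xml` and `markup_overhead` are hypothetical and chosen only for illustration; they are not taken from the clinical test data of FIG. 56. The sketch compares the length of the serialized XML with the length of the values alone.

```python
import xml.etree.ElementTree as ET

# Hypothetical clinical test record; field names are assumptions for illustration.
record = {
    "patient_initial": "T.C",
    "body_weight": "60",
    "age": "60",
    "blood_sugar_level": "98",
}


def to_xml(rec, root_tag="clinical_test"):
    """Serialize a flat record as XML, one child element per field."""
    root = ET.Element(root_tag)
    for tag, value in rec.items():
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


def markup_overhead(rec):
    """Ratio of the XML text length to the total length of the values alone."""
    xml_text = to_xml(rec)
    payload = sum(len(value) for value in rec.values())
    return len(xml_text) / payload


xml_text = to_xml(record)
ratio = markup_overhead(record)
print(xml_text)
print(f"XML is {ratio:.1f} times the size of the raw values")
```

In this toy record each value is only two or three characters long, whereas every value is wrapped in a start tag and an end tag that are each far longer, so the tags dominate the file size.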
FIG. 56 is an explanatory diagram of XML data related to clinical test data. For example, when the initials “T.C” of a patient name are to be retrieved from XML data representing clinical test data, the XML start tag <patient_initial xml_title=> and the XML end tag </patient_initial> enclosing the initials must also be searched for. This tag search degrades retrieval performance.
Although clinical test data may include identical character strings, each occurrence can carry a different significance, such as pharmaceutical efficacy or a side effect, and that significance is identified by searching for the above XML tags. Searching for XML tags is therefore essential, yet it degrades retrieval performance.
Similarly, although clinical test data may include identical numerical values, each value may signify a different quantity, such as body weight, age, or blood-sugar level, and that quantity is likewise identified by searching for the above XML tags. Here too, the search for XML tags is essential and degrades retrieval performance.
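The dependence on tags can be shown with a short sketch. The XML fragment, its tag names, and the helper functions `tags_containing` and `value_of` are hypothetical and serve only to illustrate the point; the sketch uses the Python standard library parser rather than any particular retrieval program. The same value appears under two tags, so a search on the value alone is ambiguous and the tag must be consulted.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; tag names are assumptions for illustration only.
XML_TEXT = """<clinical_test>
  <patient_initial>T.C</patient_initial>
  <body_weight>60</body_weight>
  <age>60</age>
  <blood_sugar_level>98</blood_sugar_level>
</clinical_test>"""


def tags_containing(xml_text, value):
    """Return, in document order, the tags whose text equals the given value."""
    root = ET.fromstring(xml_text)
    return [
        element.tag
        for element in root.iter()
        if element.text is not None and element.text.strip() == value
    ]


def value_of(xml_text, tag):
    """Return the text of the first direct child with the given tag, or None."""
    element = ET.fromstring(xml_text).find(tag)
    return None if element is None else element.text


# The value "60" alone does not say whether it is a body weight or an age.
print(tags_containing(XML_TEXT, "60"))
# Only a lookup keyed on the tag resolves the meaning.
print(value_of(XML_TEXT, "age"))
```

Every lookup in this sketch has to match tag names as well as values, which is the extra work the passage above identifies as the cost of tag-dependent retrieval.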
As described above, XML tags are numerous and complicated in type, which increases the size of each data item. In particular, when multiple data formats are integrated to combine clinical test data into a single XML file, the number of XML tags increases and the file becomes enormous, which further degrades retrieval performance.
Further, because clinical test data is frequently added and deleted, maintenance of the integrated files consumes a huge amount of time. In addition, although information such as clinical test data is used for analysis, it also constitutes personal information, so access to the information by persons other than the analyst must be prevented.