1. Field of the Invention
The present invention generally relates to computer database systems. More particularly, the invention relates to methods for composing and processing a database query that includes generating event duration episodes from a collection of individual data samples.
2. Description of the Related Art
Computer databases are well-known systems used to store, maintain, and retrieve data. Generally, a database provides a collection of data that is organized in a manner to allow its contents to be accessed, managed, and updated. The most prevalent type of database used today is the relational database, which organizes data using relationships defined among a group of tables. For example, the DB2® family of relational database management system (RDBMS) products available from International Business Machines (IBM) provides a sophisticated commercial implementation of a relational database.
Tables in a relational database include one or more columns. Each column typically specifies a name and a data type (e.g., integer, float, string, etc.), and is used to store a common element of data. For example, a table storing data related to patients may reference each patient using a patient identification number stored in a “patient ID” column. Data from each row of such a table is related to the same patient, and table rows are generally referred to as “records.” Tables that share at least one element in common (e.g., the patient ID column) are said to be “related.” Additionally, tables without a common data element may be related through other tables that do share such elements.
A relational database query may specify which columns to retrieve data from, how to join columns from multiple tables to form a query result, and any conditions that must be satisfied for a particular data record to be included in a query result set. Current relational databases typically process queries composed in an exacting format specified by a query language. For example, the widely used query language SQL (short for Structured Query Language) is supported by virtually every database product available today. An SQL query is composed from one or more clauses set off using specific keywords. Composing a proper SQL query, however, requires a user to understand the structure and content of the relational database (i.e., a schema of tables and columns) as well as the complex syntax of the SQL query language. This complexity often makes it difficult for average users to compose relational database queries.
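By way of illustration only, the following minimal sketch shows the three elements of a query named above: the columns to retrieve, the join between tables, and the conditions a record must satisfy. It uses Python's standard sqlite3 module with an in-memory database; the table names, column names, and sample values are hypothetical and are not drawn from any particular database.

```python
import sqlite3


def run_example_query():
    """Build a tiny hypothetical schema and run one SELECT with a JOIN and a WHERE clause."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables: both share a patient_id column, so they are "related".
    conn.executescript("""
        CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE lab_results (
            patient_id INTEGER REFERENCES patients(patient_id),
            test_name  TEXT,
            result     REAL
        );
        INSERT INTO patients VALUES (1, 'Patient A'), (2, 'Patient B');
        INSERT INTO lab_results VALUES
            (1, 'hemoglobin', 16.0),
            (2, 'hemoglobin', 11.5);
    """)
    rows = conn.execute("""
        SELECT p.name, r.result                              -- columns to retrieve
        FROM patients AS p
        JOIN lab_results AS r ON r.patient_id = p.patient_id -- how tables are joined
        WHERE r.test_name = 'hemoglobin' AND r.result > 12   -- conditions for inclusion
        ORDER BY p.name
    """).fetchall()
    conn.close()
    return rows


print(run_example_query())
```

Even in this small case, the user must know both table names, the shared patient_id column, and the SQL keywords in order to compose the query.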
Database records typically capture a snapshot of a data value recorded for a particular point in time. For example, lab test samples may be recorded with the date and time when the tests are performed or when results are received. Thus, a database record may accurately reflect that a particular patient's hemoglobin test result was “16” on the 5th of December at 4:15 pm. Some events, however, have a duration period associated with them, or may continue over a period of time. For example, a patient's home address may be valid over a period of weeks, months, or years. The records in a database table may only capture this information using a sampling of the address taken at various points in time. For example, each time a patient visits a doctor's office, the patient is usually asked to confirm or change their address. If an address has changed, a new address may be recorded in the database. However, an ambiguity now exists. Time between office visits may span days, weeks, or even years. Once a patient reports a new address during an office visit, the clinic may not have any reliable information regarding when, in the interval between the previous visit and the current one, the change occurred: the patient's address may have changed the day after the previous visit, the day before the current visit, or any number of times in between.
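The ambiguity described above can be made concrete with a short sketch. It only locates the intervals in which a sampled value is known to have changed; it does not resolve the ambiguity, and it is not the method of the invention. The sample dates, state names, and the helper name change_windows are hypothetical.

```python
from datetime import date

# Hypothetical address samples for one patient: (visit date, state of residence).
samples = [
    (date(2003, 1, 10), "Minnesota"),
    (date(2003, 6, 2), "Minnesota"),
    (date(2005, 3, 15), "Iowa"),
]


def change_windows(samples):
    """Return (start, end, old_value, new_value) for each pair of consecutive
    samples whose values differ. The change happened somewhere in [start, end],
    but the samples alone cannot say where."""
    ordered = sorted(samples)  # order by visit date
    windows = []
    for (d0, v0), (d1, v1) in zip(ordered, ordered[1:]):
        if v0 != v1:
            windows.append((d0, d1, v0, v1))
    return windows


for start, end, old, new in change_windows(samples):
    print(f"{old} -> {new}: changed at some point between {start} and {end}")
```

Note that matching values at two consecutive visits do not prove continuous residence either, since the patient may have moved away and returned between samples; the sketch reports only the changes that the samples make visible.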
At the same time, knowing when (or for how long) an individual may have lived at a specific location may be useful for certain research questions. For example, causation is a major area of medical research, and determining the cause of a condition often requires analyzing when and for how long certain events occurred. Thus, an individual's home address may be useful to help determine or identify the causes of conditions that are aggravated by environmental factors. Accordingly, users may wish to compose a database query with a logically simple condition such as “state residence=Minnesota.” However, directly evaluating conditions such as this may not be possible, as the period (or episode) during which an individual lived at a specific address is usually only sampled at certain points in time, without any indication of when an individual may have moved from one address to another. As a result, composing a database query that will return records based on these types of intuitive conditions is often too complex for an average user. Moreover, even skilled users must switch their focus from analyzing a particular problem to figuring out how to compose a query that will retrieve the desired information from the database.
Accordingly, there is a need for a database query application that allows users to more easily compose a database query that includes conditions based on duration-based episodes, even though a database may only record a collection of snapshots, or samplings, that reflect the value of the episode captured for a particular point in time.