Chemical and pharmaceutical industries and chemical-related government agencies commonly maintain large chemical substance databases. These entities often provide structure-searching capabilities in association with such databases. Recently, these organizations have been standardizing their databases using relational database management systems (RDBMS) such as the Oracle Relational Database Management System by Oracle Corporation, World Headquarters, 500 Oracle Pkwy., Redwood Shores, Calif. 94065.
The advantages of integrating chemical structure information into an RDBMS include: a closer integration with other related chemical data, efficiency in both storage and retrieval of chemical structure data, and better access to the chemical structure data by other related applications.
Unfortunately, chemical information systems have traditionally been built using specialized database technology requiring, in many cases, hundreds of thousands of lines of custom computer code. Systems of this type are often difficult both to maintain and to adapt to changing hardware technologies. These maintenance problems, coupled with the lack of portability of these highly specialized systems, often lead to large investments of time and money in relatively short-lived systems.
The introduction of relational database technology provides an opportunity to transfer a large amount of the database management responsibility from the specialized database systems described above to a standard, widely accepted technology. However, relational technology has typically not been used as the basis for chemical information systems, because problems are inherent in any attempt to cast a chemical structure searching problem into structured query language (SQL), the standard language of relational databases. These problems include difficulty in storing and representing chemical structures in a database. No chemical information system has yet been implemented using only relational technology as its database component.
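As an illustration of the storage and representation problem noted above, the following minimal sketch shows one possible way to hold a chemical structure in relational tables and to search it with SQL. It is not the schema of any system discussed here: the table names (`compound`, `atom`, `bond`), the helper functions, the use of Python's `sqlite3` module, and the two sample compounds are all assumptions made purely for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical atom/bond schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE compound (
            compound_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE atom (
            compound_id INTEGER NOT NULL REFERENCES compound(compound_id),
            atom_no     INTEGER NOT NULL,
            element     TEXT NOT NULL,
            PRIMARY KEY (compound_id, atom_no)
        );
        CREATE TABLE bond (
            compound_id INTEGER NOT NULL REFERENCES compound(compound_id),
            atom_a      INTEGER NOT NULL,
            atom_b      INTEGER NOT NULL,
            bond_order  INTEGER NOT NULL
        );
        """
    )
    # Illustrative data, heavy atoms only: ethanol (C-C-O) and acetaldehyde (C-C=O).
    conn.executemany(
        "INSERT INTO compound VALUES (?, ?)",
        [(1, "ethanol"), (2, "acetaldehyde")],
    )
    conn.executemany(
        "INSERT INTO atom VALUES (?, ?, ?)",
        [(1, 1, "C"), (1, 2, "C"), (1, 3, "O"),
         (2, 1, "C"), (2, 2, "C"), (2, 3, "O")],
    )
    conn.executemany(
        "INSERT INTO bond VALUES (?, ?, ?, ?)",
        [(1, 1, 2, 1), (1, 2, 3, 1),
         (2, 1, 2, 1), (2, 2, 3, 2)],
    )
    return conn


def find_carbonyl_compounds(conn):
    """Return names of compounds containing a carbon-oxygen double bond."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.name
        FROM bond b
        JOIN atom a1 ON a1.compound_id = b.compound_id AND a1.atom_no = b.atom_a
        JOIN atom a2 ON a2.compound_id = b.compound_id AND a2.atom_no = b.atom_b
        JOIN compound c ON c.compound_id = b.compound_id
        WHERE b.bond_order = 2
          AND ((a1.element = 'C' AND a2.element = 'O')
            OR (a1.element = 'O' AND a2.element = 'C'))
        ORDER BY c.name
        """
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(find_carbonyl_compounds(build_demo_db()))
```

Even in this two-atom query, each atom of the search pattern requires its own join against the `atom` table. In a schema of this kind, a larger query structure would need one further join per atom and per bond, which suggests why structure searching is awkward to express directly in SQL.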
Several systems have attempted to achieve this goal but, as more fully explained below, none has provided a purely relational database management system that can search and retrieve chemical structure information easily and quickly.
For example, the Molecular Access System (MACCS) and the Integrated Scientific Information System (ISIS) were both created by Molecular Design Ltd., MDL Information Systems, 14600 Catalina Street, San Leandro, Calif. 94577. Each provides a stand-alone chemical information system in which chemical structures are stored as hierarchical structures. However, these systems require large amounts of custom code and are not maintained in a relational database. Accordingly, they do not have the advantages of relational technology listed above.
While these systems can be interfaced to a relational database management system such as the Oracle Relational Database Management System noted above, this must be done using additional custom code and software that converts the hierarchical structures to the relational tables needed for such a database. Therefore, it is difficult to incorporate the advantages of relational technology into the MACCS and ISIS systems. Moreover, the conversion software slows overall performance.
In summary, these systems do not provide the advantages and capabilities existing in the present invention.
The present invention overcomes the above-listed problems and additionally has the following advantages: (1) development and maintenance costs will be greatly reduced by using a commercial database package, so that development efforts can be directed more effectively toward aspects of system design, and improvements in the underlying database technology will be automatically transferred to the chemical information system; this shift of focus away from database development concentrates the development and maintenance efforts on improving the search strategy and the user interface, which are the highly visible aspects of the system; (2) interfacing with other information systems will be simplified, since relational databases are already used to store much of the non-structural chemical data used in research and commercial settings; and (3) portability will be much less of a design drawback, since the amount of custom programming is minimal and can easily be adapted to numerous types of technology, so that the portability responsibilities are mostly shouldered by the database manufacturer and not by the developer of the chemical storage system.