The present invention relates to data retrieval from a database system by an external process. More particularly, the invention relates to a method and system for retrieving data from a database system where the data retrieval from the database and data transfer to the external process are carried out concurrently by two separate cooperating processes.
The field of business applications of computer technology has seen many important changes over the last few years. With steadily growing computational power and data storage capacities of computer systems used for business data processing, the interest of the business community has shifted from relatively simple data capture and management systems towards more sophisticated business management systems. Early data management systems, as represented by databases of operational data managed by associated database management systems (DBMS), were mostly designed to support the day-to-day operation and business functions of small to medium businesses or low organizational levels of big enterprises. These early data management systems have evolved into enterprise resource planning (ERP) systems, combining databases across the whole enterprise and integrating at the enterprise level all facets and functions of the business, including planning, manufacturing, sales and marketing. Modern ERPs not only support decision making at the enterprise level, but also allow data sharing across organizational units, which greatly improves the information flow through the company.
Various software systems and packages supporting the ERP approach to business data management have been developed over the years. An example of such a software package is SAP R/3 System available from SAP AG (Germany) or its U.S. branch, SAP America, Inc. SAP (from Systems, Applications, and Products in Data Processing) is a modularized, table-driven system capable of working in both transaction and batch modes to perform specified functions, such as order processing, inventory control, financial accounting and planning, production planning and control, etc. SAP is designed to interface with terminals, printers, databases, or external communications facilities, such as IBM""s Virtual Telecommunications Access Method (VTAM).
Other systems and applications for business data management and analysis have also been developed over the last few years, such as the Business Intelligence suite of applications from Cognos Inc. (Canada). These applications provide various data analysis and visualization tools, including reporting tools, database navigation and exploration tools, data modeling tools, and data warehouse construction tools. They combine new ways of data analysis and presentation with methods for discovering hidden patterns and previously unrecognized relationships among the data, known as data mining.
Independently of the approach taken, all business data management and analysis tools rely for their operation on vast amounts of data stored in various databases. The organization of these databases usually depends on the system under which the databases have been created and are being maintained. Even though the relational model is almost universally followed for the data storage, some systems, in particular the SAP system, add an abstraction layer on top of the relational database, which layer provides additional meaning to the data and ensures data integrity and security, and insulating the user from the actual data container. Bypassing such a layer to access directly the data can normally be achieved, if at all, only under the threat of compromising the security and integrity of data and is almost never attempted.
Access to data stored in a database is normally achieved by using data querying facilities provided by the DBMS associated with the database, usually in the form of a query language processor. By sending a suitable data request or a series of such requests formulated in the query language supported by the DBMS, the requested data is retrieved by the DBMS and returned to the requester. A query language almost universally supported by current DBMSs is the SQL language, a formal database industry standard (the latest ANSI standard of the language known informally as SQL/99). SQL is a non-procedural data sublanguage concerned specifically with database objects and operations, such as data definition, manipulation, and control. Despite attempts to make it a computationally complete language, SQL remains at present mostly a special-purpose language for the creation and manipulation of data in relational databases.
Used as a query language for data retrieval from relational databases, SQL provides access to data stored under most but not all DBMSs presently on the market. In some database systems, access to data is possible only by using proprietary query languages other than SQL. An example of such a system is the earlier mentioned SAP system. SAP""s internal functions are performed using its own high level programming language, ABAP, a powerful and computationally complete fourth generation programming language with a tightly integrated data dictionary. By design, SAP provides only a carefully controlled ability to interface to modules written in languages other than ABAP, such as C++ language. Users may use ABAP (the latest version of which is ABAP4) to create their own specialized application functions and reports, for example to gain access to data stored in a SAP system. Interface tools are available in SAP to allow such application software to be interfaced to the SAP system. Of those, Remote Function Calls (RFC) service supported by the SAP system for both SAP to SAP and SAP to non-SAP communications is most frequently used. Even though the RFC service can be used to call a non-SAP function, such a call has certain limitations and disadvantages. In particular, when sending a data query to a SAP database from an external process via the RFC service, data cannot be returned to the external process at the same time as they are being retrieved from the SAP database, as any such attempt results in the loss of a database cursor, thus disrupting the data retrieval process. As a result, the external process has to wait until all the requested data is retrieved, before the data is returned to the process. Depending on the amount of data to be retrieved, this may cause substantial delays in completing the data query.
The proprietary nature of data storage and access makes data sharing difficult between different systems and applications. In the practice of business data management and analysis, however, such a need occurs frequently, for example, when the system under which data of interest is stored and maintained does not offer data management or analysis tools considered to be particularly advantageous to a data analysis task at hand, or when merging data from different systems is required. Using data stored under one system by another system or application capable of accomplishing a task at hand requires data retrieval from the first system and transferring the data to the second system or application prior to accomplishing the task. Such a data retrieval and transfer is particularly difficult if the first system does not support the query language in which data is requested by the second system or application. What is therefore needed is a system and method for a fast and efficient access to data stored in a database system supporting one query language, when data is requested by a system or application supporting another query language. The present invention provides such a system and method which are free of many prior art limitations.
In its broad aspect, the present invention provides a method and a system for fast and efficient retrieval of data from a database system accepting and processing data requests in one query language, when the data is requested in another query language by a process external to the database system. This is achieved by providing an interface module operatively coupled to the external process and the database system, which module upon receiving from the external process a data query in the first query language generates an equivalent data request in the second query language and sends the data request to the database system for processing and data retrieval. The interface module then receives the retrieved data from the database system and returns the data to the external process.
In a preferred embodiment, the interface module processes simultaneously multiple data queries and operates as four separate processes which communicate with one another to coordinate their actions. The first process is the highest level process responsible for receiving data queries in the first query language from the external process, generating corresponding data requests in the second query language supported by the database system, and returning the retrieved data to the external process. The first process also provides data processing functions supported by the first but not by the second query language. If such functions or operations are included in the data query received from the external process, the first process provides an appropriate processing of the data retrieved from the database system, prior to returning the data to the external process.
The data requests generated in the second query language by the first process are forwarded to a second, lower level process, which communicates these requests to the database system. If the amount of data retrieved does not exceed a predetermined amount, the second process receives the retrieved data from the database system and forwards it to the first process. If the amount of data exceeds the predetermined amount, data is compressed by the database system upon retrieval and written to files on the database system server. The compressed data is then retrieved from the files by a third process and sent to a fourth process for decompression. Data decompressed by the fourth process is then forwarded to the first process.
It is an important feature of the present invention that the four processes of the interface module, while communicating with one another to accomplish the overall goal, operate independently of one another and at least in part concurrently. For example, the fourth process may be decompressing data received from the third process and sending data to the first process, while the third process is receiving data from the database system and sending it to the fourth process. This concurrent operation is in part due to the fact that the flow of data between a data-forwarding and a data-accepting process of the interface module is taking place through a pair of alternating buffers, under control of read/write semaphores. As soon as one buffer is filled with data by the data-forwarding process, the buffer is read and its content processed by the data-accepting process, while at the same time the second buffer is being filled with data by the data-forwarding process. It is another important feature of the present invention that the retrieval of data from the database and the transfer of data from the database to the interface module is carried out concurrently by two separate processes. Combined with other features of the interface module, this results in a fast and efficient transfer of data from the database system to the external process.
Thus, according to one aspect of the invention, there is provided a method for retrieving data from a database system by an external process, said external process issuing data queries in a first query language, said database system accepting and processing data requests in a second query language, which method comprises the steps of: providing an interface module operatively coupled to the external process and the database system, said interface module generating an equivalent data request in the second query language when receiving a data query in the first query language; sending a data query in the first query language from the external process to the interface module; generating in the interface module at least one data request in the second query language, said data request being equivalent to the data query sent by the external process in the first query language; sending the generated data request from the interface module to the database system for processing and data retrieval; transferring the retrieved data from the database system to the interface module; and returning the retrieved data from the interface module to the external process, wherein the retrieval of data from the database system is carried out concurrently with the transfer of the retrieved data from the database system to the interface module.
According to another aspect, the invention provides a system for retrieving data from a database system by an external process, said system comprising: means for running the external process, said external process issuing data queries in a first query language; means for hosting the database system, said database system accepting and processing data requests in a second query language; means for running an interface module, said interface module operatively coupled to the external process and the database system, said module capable of: generating an equivalent data request in the second query language when receiving a data query in the first query language from the external process; sending the generated data request to the database system for processing and data retrieval; transferring the retrieved data from the database system; and returning the retrieved data to the external process, wherein the retrieval of data from the database system is carried out concurrently with the transfer of the retrieved data from the database system to the interface module.
Other advantages, objects and features of the present invention will be readily apparent to those skilled in the art from the following detailed description of preferred embodiments in conjunction with the accompanying drawings and claims.