1. Field of Use
The present invention relates to systems and methods for monitoring information accesses and more particullrly the usage of a data warehouse and the information contained therein.
2. Prior Art
Data warehouses are becoming more and more important to businesses. The term xe2x80x9cdata warehousexe2x80x9d is generally used to describe a database containing data that was gathered from a variety of sources (e.g. existing production databases). For more information regarding the nature of a data warehouse, reference may be made to the article entitled, xe2x80x9cData Warehousing: An Introductionxe2x80x9d by Grayce Booth which appeared in the May/June 1995 issue of the Bull S. A. technical journal entitled, xe2x80x9cTechnical Update.xe2x80x9d
Typically, the data warehouse is implemented as a large amount of data stored in a database with access to the data coming from hundreds of users executing commodity applications like Excel, running on personal computers (PCs). Here, an opportunity for a business exists to manage the data warehouse system. It is useful to the warehouse owner to have information and statistics about the usage of the warehouse and its data. Such information includes: (a) how many users are currently logged onto the system; (b) what is the pattern of access statistics; (c) what data is accessed most frequently, (d) what if any indexes could be added or dropped to improve access efficiency; (e) what if any unlawful access attempts have occurred; and (f) what query runs the longest. Some of this information can be obtained from the warehouse database system but each type of database gathers this information in a different proprietary manner. Therefore, there is an opportunity to be able to provide usage data in a standard fashion for all database types. Also, there is the ability to provide the information through standard system management tools based on standard protocols, such as the Simple Network Management Protocol (SNMP).
As well known in the art, the Open Data Base Connectivity (ODBC) application programming interface is a standard defined by Microsoft Corporation by which Windows based tools and applications may access different databases on many different server platforms. Many PC vendors have adopted ODBC. Without using ODBC, applications are required to use APIs specific to a database vendor for accessing data warehouse information. Using ODBC, an application may access any type of database. In addition, ODBC is used by application tools such as EXCEL such that specific code is not required for each database type being accessed.
Client/Server ODBC is a newer technique for implementing ODBC. The interface to ODBC for user applications remains on the PC but the bulk of the ODBC logic is moved to a server side implementation. All PC users execute their data requests through a common ODBC server. This arrangement provides a xe2x80x9cthinxe2x80x9d client requirement for the PC user of ODBC and makes the administration of ODBC possible from a single server. This single point of access through the ODBC server also provides the opportunity for administrating the data warehouse. All clients PCs that need to access the data warehouse come through the single point of access (Le. ODBC server).
Although ODBC provides a common PC based API, each relational database management system (RDBMS) vendor typically has implemented a unique interface for data access. To adapt tools based on ODBC to the interfaces used by various types of RDBMS, Microsoft Corporation specifies the development of a xe2x80x9cdriverxe2x80x9d. The driver transforms the ODBC API standard calls into RDBMS specific calls. The use of ODBC provides a layer of consistency above each of the APIs implemented by the RDBMS vendors. In the prior art, a separate ODBC driver was required for each type of RDBMS to be accessed. Additionally, each database vendor typically, requires a tailored communications link An improvement to this approach is to provide a single data access (DDA)ODBC driver to replace multiple customized ODBC drivers with a single implementation that can access multiple types of databases.
An example of the above type of system is the distributed data warehouse (DDW) middleware described in the article entitled, xe2x80x9cThe Distributed Data Warehouse Solutionxe2x80x9dby Kirk Mosher and Ken Rosensteel that also appeared in the above referenced May/June 1995 of the Technical Update Journal. This system utilizes a proprietary based infrastructure called DDW/NET that works in conjunction with the DDW/ODBC driver. DDW/NET enables connections to multiple computer architectures, operating systems, and network protocols. The DDW/NET software resides on each of the legacy and server systems that communicate over standard communications links and hides the details of networking from the upper layers of software on each system.
The above prior art system included several features to aid the administrator of the data warehouse. Such features included an SNMP agent that monitored the activity of the distributed data warehouse (DDW) processes and users of the data warehouse and a Usage Monitor facility that recorded SQL database queries issued by individual users. Each of these features required the use of an interface to the DDW Net on a UNIX based platform to help gather the required information. This approach required the use of proprietary interfaces that made it difficult to expand the types of databases used by the system. The data that was needed was not easily accessible from the DDW Net memory. DDW Net design was based on Ingres technology, that could not be easily enhanced. This prior art approach is described in the publication entitled DDW Administrator""s Guide, dated Apr. 25, 1997, copyright Bull S. A. and Bull HN Information Systems Inc. 1995, 1996, 1997, Order Number 86 A2 83FC Rev4.
Accordingly, it is a primary object of the present invention to provide a system and method for facilitating monitoring of data warehouse activity.
It is a further more specific object of the present invention to provide an interface arrangement that simplifies data warehouse monitoring through standard protocols.
The above objects are achieved in a preferred embodiment of the present invention that provides a special application programming interface (API) that provides interoperability between standard protocols utilized in conjunction with the monitoring and administration managing tool components of a data warehouse system. One protocol is the well known data connectivity protocol Open Database Connectivity (ODBC), that. defines a standard interface between applications and data sources. Another protocol is the well known network management protocol Simple Network Management Protocol (SNMP) that defines a standard interface between an agent component and a network management system.
In the preferred embodiment, the warehouse components include a local SNP agent component for gathering data pertaining to the activity of a distributed data warehouse (DDW) processes and the users of the DDW system and a usage monitor component for tracking statistics about the different types of SQL queries issued by individual system users. According to the present invention, the warehouse components further include ODBC server and driver components for operatively connecting to the DDW system target warehouse database for processing SQL queries submitted by warehouse knowledge workers. The ODBC server component also operatively couples to an SQL log that it uses to maintain entries pertaining to user SQL queries it receives from a number of ODBC client user systems. The usage monitor component operatively couples to the SQL log and performs the function of gathering data from the entries that it uses to populate tables of a usage monitor database that it maintains for providing usage statistics.
The SNMP agent component performs further monitoring functions. The component operatively couples to the ODBC server component through the special API that enable such components to have access to a variety of types of information received from the ODBC server through the ODBC protocol and reportistore such information in a MIB database of a further warehouse component that corresponds to a centrally located SNMP server component via the SNMP protocoL
In accordance with the present invention, the special API provides the following types of information: Server Listen Address and Number of Active Connections. For each active connection, the Connection ID, login time, number of messages sent, number of messages received, number of bytes received, last message and last message direction. Additionally, the API provides other configuration information relevant to the server such as network ports used and server name. The special API is used by the local SNMP agent component to gather real time information about data warehouse usage and reports that information to the centrally located manager server unit.
In accordance with the teachings of the present invention, the ODBC server component operatively records entries having a predetermined format (e.g. ASCII format) into two log files. Entries for every user login to the ODBC server component are recorded in the first log file. Every SQL statement sent to the ODBC server component and information identifying the user that issued the statement, the time of execution, the elapsed time of the etc. is recorded in the second log file. The usage monitor component periodically reads the second log file and writes the statistics about usage into the usage monitor database. By using the ODBC serve information, no software components need be inserted between the end users and the data warehouse to gather the usage information. Since the OD
The above objects and advantages of the present invention will be better understood from the following description when taken in conjunction with the accompanying drawings.