1. Field of the Invention
The present invention relates to a database management system that converts data from a plurality of data sources in a variety of different data formats into a common format which can be accessed and searched via a common database interface and, more particularly, to a database management system that provides authenticated access to the common database interface via a web server and which enables the user to search the data across several data sets.
2. Description of the Prior Art
In recent years, the use of large and complex data sets has grown dramatically. This usage explosion has required users of the data sets to experiment with better access and management methodologies for a diverse and dynamic user environment. For several years, Wharton Business School has been managing large financial data sets with the SAS® System for delivering financial information in an academic environment. Since late 1995, Wharton has provided access to large financial data sets from a variety of data vendors using the SAS® System and the World Wide Web. This system is known as the Wharton Research Data System (WRDS).
Large financial data sets have been used for financial research for many years. The financial data sets widely used at business schools include market research data (such as CRSP, Fama and Market Indices), corporate data (such as Compustat), and banking and insurance data (such as BEST and FDIC). Prior to development of WRDS, the data sets were stored on large VMS/VAX systems and users had to run FORTRAN programs to analyze or extract data. Desktop tools such as Systat and Excel were also available, but working with the data using these desktop tools required that the user be familiar with the formats of the data sets, FORTRAN programming, mainframe-to-PC file transfer techniques, the VMS operating system, and the data import format of the desktop software. Such systems were cumbersome, difficult to support, and slow. Moreover, any change in data format required updating the many programs written to index the data.
To avoid the limitations of specialized management programs, commercial database management systems such as FAME, DART, and Intelligent Query were developed. While these systems provided good data manipulation tools, they generally lacked strong analytical tools and were not suitable for time-series financial data. Also, extensive programming was required to convert the wide selection of data sets used in conventional database research.
Accordingly, WRDS was developed to use SAS® (and SAS/ASSIST®) to extract and analyze the data, to manage data sets centrally while providing network access to the complete series of data on UNIX systems, and to provide X-Window access to UNIX systems. SAS® provided a single, unified tool for data management and analysis and has proven to be much more efficient than conventional FORTRAN programming techniques. Moreover, because the same data tool was used for all data sets, users of WRDS could easily analyze data across different SAS® data sets.
Unfortunately, access to WRDS was limited by its VT100 interface for those accessing the data from a remote location. Thus, it was desired to connect WRDS to the Internet so that users could select the desired financial data via the Internet. However, since the data sets were proprietary and were generally purchased from vendors, the contents of the data sets could not be released to the general public via the Internet. As a result, two web servers were connected to WRDS: a World Wide Web server for serving the worldwide community, and an Intranet server for serving the Wharton community.
The Intranet could be accessed using conventional UNIX authentication techniques. However, authenticated access to WRDS via the Internet is much more problematic because, in the UNIX environment, user authentication for Internet access is very complex unless the Netscape default database authentication scheme is used. Since distributed computing systems typically have accounts on the respective machines while the Web servers are centrally managed, using Netscape's default authentication scheme will generally require the users to take out another account and to manage another password. A customized unified authentication scheme was thus developed to enable a Netscape server to query the distributed computers for verification; however, that customized authentication scheme required countless hours of programming against Netscape's application programming interface (NSAPI).
Accordingly, an improved authentication technique is desired that allows databases such as WRDS to be accessed via the Internet using an authentication code that can be easily verified without requiring an additional account management system or significant amounts of customized software.
Also, web browsers that submit search queries via the World Wide Web typically wait for the search process to finish and time out if the search is not completed within a set period of time, such as five minutes. However, searching large data sets using sophisticated data queries may take longer than the set time, in which case the system will time out without the user getting the requested data. The user is also prevented from using the web browser for other functions or from logging off the World Wide Web until the search process is completed and the results returned. An off-line method for completing search queries initiated via the World Wide Web would greatly facilitate the searching of large databases, such as WRDS, accessed via the World Wide Web.
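By way of illustration only, the general idea of an off-line search can be sketched as a submit-and-collect pattern: the query is handed to a background worker, a job identifier is returned at once so that the browser request can complete well within its timeout, and the results are collected later. The sketch below is a generic Python example and is not the method of the present invention; the names `OfflineQueryQueue`, `submit`, `poll`, `wait`, and `slow_search` are hypothetical, and the in-memory job table stands in for whatever persistent storage a real system would use.

```python
import threading
import time
import uuid
from typing import Callable, Dict, Optional


class OfflineQueryQueue:
    """Hypothetical in-memory job table: submit a query now, collect rows later."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._results: Dict[str, Optional[list]] = {}
        self._threads: Dict[str, threading.Thread] = {}

    def submit(self, run_query: Callable[[], list]) -> str:
        """Start the query in a background thread and return a job ID immediately."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._results[job_id] = None  # None marks a job that is still running

        def worker() -> None:
            # Error handling is omitted: a query that raises stays marked as running.
            rows = run_query()
            with self._lock:
                self._results[job_id] = rows

        thread = threading.Thread(target=worker, daemon=True)
        self._threads[job_id] = thread
        thread.start()
        return job_id

    def poll(self, job_id: str) -> Optional[list]:
        """Return the rows if the job has finished, otherwise None."""
        with self._lock:
            return self._results[job_id]

    def wait(self, job_id: str) -> Optional[list]:
        """Block until the worker thread for the job exits, then return its rows."""
        self._threads[job_id].join()
        return self.poll(job_id)


if __name__ == "__main__":
    def slow_search() -> list:
        time.sleep(0.5)  # stands in for a search that outlasts a browser timeout
        return [("placeholder row", 1)]

    queue = OfflineQueryQueue()
    job = queue.submit(slow_search)   # returns at once; the caller is free to move on
    print("submitted job", job)
    print("rows:", queue.wait(job))   # collected later, outside the original request
```

Because `submit` returns before the search runs to completion, the length of the search no longer depends on how long the browser is willing to wait.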
The present invention has been designed to meet these needs in the art.