1. Field of the Invention
The present invention generally relates to computer systems and computer databases, and more specifically to a software engine that may be used by various application programs as the primary interface for viewing, searching, and modifying information stored in databases, particularly relational databases which are used to maintain electronic catalogs.
2. Description of the Related Art
Computer systems use databases to maintain information for a wide variety of purposes. A database may generally be thought of as a file composed of one or more records, each containing fields, together with a set of operations used for searching, sorting, recombining, and other functions. Data in a typical relational database can be represented as a set of records (entries) spread out across multiple tables. Entries in a table are comprised of values for several data columns. The database has a definition for each table which dictates the format of records in the database, including the number of fields, specifications regarding the type of data that can be entered in each field (e.g., numeric or alphabetical), and the field names used. A database is physically stored on any conventional computer storage medium, such as a hard disk drive. A database server (a network node or workstation) may be dedicated to storing and providing access to a shared database.
Applications can be custom-written to access a database having a particular schema (the format of the database). Alternatively, a database engine (a program module or modules) is often used to provide access to a database management system. The database management system (DBMS) is the software interface between the database and the user. A database management system (or database manager) handles user requests for database actions, and allows for control of security and data integrity requirements.
One important use for databases is in electronic catalogs. With the explosive growth in the number of users of the Internet, particularly the World Wide Web (WWW), many retailers are taking advantage of the opportunity to present goods and services to consumers via this new purchasing medium (e-commerce). Online catalogs are often represented as a hierarchy or tree wherein each level of the hierarchy represents a slightly more specific classification of the particular products available in the catalog. For example, products from an automobile dealership might first be categorized according to the particular vendor (manufacturer), then according to a particular body style, and then according to a particular color. Information regarding all of the available cars/trucks is accessed through a database by the web server or application server, and is updated periodically. Users are able to search the catalog using limited search criteria based on the available fields.
As suggested above, and as illustrated in FIG. 1, a web application 2 may be customized to directly access a catalog database 4. In this situation, the web application is programmed based on the knowledge of the specific database structure of the catalog database, i.e., the number of total database files in the overall database 4, and the fields in each of the database files. Web application 2 can include a search function that formulates a request based on user inputs, and generates a query to catalog database 4 according to a static list of attributes.
As further suggested above, and as illustrated in FIG. 2, a web application 6 may be designed to access a catalog database 8 via a catalog engine 10. In this situation, web application 6 need not be programmed according to the specific database structure of catalog database 8. Rather, web application 6 is only required to interface with catalog engine 10, which is then responsible for interacting with the database. For example, a conventional catalog engine might be used to carry out search operations by generating a SQL query. A SQL query uses structured query language, a database language that has become the standard for most database products. Catalog engine 10 may be used by other applications as well, such as another web application 12 (or a Java applet, etc.).
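The role of a conventional catalog engine in generating a SQL query can be sketched as follows. This is only an illustration under stated assumptions, not the engine of FIG. 2: the function name `build_search_query`, the table name `catalog`, and the field list `ALLOWED_FIELDS` are hypothetical. The fixed field list also shows the kind of static coupling to the database structure that is discussed in this section.

```python
from typing import Dict, List, Tuple

# Hypothetical, fixed list of searchable attributes (an assumption for
# illustration). A conventional engine is tied to a list like this one.
ALLOWED_FIELDS = ("year", "mfr", "model", "color")


def build_search_query(
    criteria: Dict[str, object], table: str = "catalog"
) -> Tuple[str, List[object]]:
    """Translate user search criteria into a parameterized SQL query."""
    unknown = set(criteria) - set(ALLOWED_FIELDS)
    if unknown:
        raise ValueError(f"unsupported search fields: {sorted(unknown)}")

    clauses: List[str] = []
    params: List[object] = []
    # Iterate in the fixed field order so the generated SQL is deterministic.
    for field in ALLOWED_FIELDS:
        if field in criteria:
            clauses.append(f"{field} = ?")
            params.append(criteria[field])

    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

A web application would pass only the user's criteria, for example `build_search_query({"mfr": "Ford", "year": 2000})`, and would never need to know how the tables are laid out.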
One major problem with these electronic catalog systems relates to the fixed manner in which the application, or the catalog engine, interfaces with the catalog database. The structure (schema) of the database may need to be changed, in order to optimize it in a particular way, or to add new fields and capabilities. For example, FIG. 3 is a representation of the conversion of information pertaining to an automobile dealership's inventory, from a first schema with one table 14, to a second schema with three tables 16, 18 and 20. In the first schema, the single table 14 contains all the information (that is, all fields) pertaining to each record (car/truck). In this simplified example, a given record has seven fields: Year (of manufacture); Mfr. (manufacturer); Doors (the number of doors); Type (the automobile body design); Model (a brand name); Color; and VIN (vehicle identification number).
It can be seen that the structure of the single table 14 lends itself to certain inefficiencies. While sufficient when considering a single record, it becomes extremely inefficient when looking at all of the data as an aggregate. For example, much of the stored data is redundant. The first six entries of table 14 show how the corresponding six cars are all year 2000 Ford 4-doors. The first five entries are furthermore all Taurus models. It is possible to reduce this inefficiency by creating multiple tables which are interrelated, and which effectively reuse the redundant information. In the second schema, a Base table 16 is used to collectively identify all vehicles according to their year, manufacturer, and model. A Base ID (identification) value is then assigned to each record. This Base ID is used to correlate those records with entries in a Style table 18. In this example, the first four rows of Style table 18 all have a Base ID of zero which, according to Base table 16, corresponds to any vehicle that is a year 2000 Ford Taurus. Thus, the information that pertains to the fields in Base table 16 does not have to be repeatedly included in Style table 18, reducing the redundancy. Style table 18 adds information pertaining to the number of doors, vehicle type, and color, and then assigns a Style ID to each record. The Style ID is then used to access the corresponding VIN in VIN table 20. This correlation between the tables of the second schema is the essence of a relational database.
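The three-table arrangement can be illustrated with a small in-memory SQLite database. This is a minimal sketch only: the table and column names, the colors, the body type, and the VIN strings are made-up placeholders and are not the contents of FIG. 3. It shows how joining the VIN, Style, and Base tables recovers the seven-field record of the single-table schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE base  (base_id  INTEGER PRIMARY KEY, year INTEGER, mfr TEXT, model TEXT);
CREATE TABLE style (style_id INTEGER PRIMARY KEY,
                    base_id  INTEGER REFERENCES base(base_id),
                    doors INTEGER, body_type TEXT, color TEXT);
CREATE TABLE vin   (vin TEXT PRIMARY KEY,
                    style_id INTEGER REFERENCES style(style_id));
""")

# Placeholder data (assumed for illustration only).
# Year, manufacturer, and model are stored once, under Base ID 0.
conn.execute("INSERT INTO base VALUES (0, 2000, 'Ford', 'Taurus')")
conn.executemany(
    "INSERT INTO style VALUES (?, ?, ?, ?, ?)",
    [(0, 0, 4, "Sedan", "Red"), (1, 0, 4, "Sedan", "Blue")],
)
conn.executemany(
    "INSERT INTO vin VALUES (?, ?)",
    [("VIN-0001", 0), ("VIN-0002", 1), ("VIN-0003", 1)],
)


def flat_records(connection):
    """Rebuild the seven-field, single-table view by joining the three tables."""
    return connection.execute("""
        SELECT b.year, b.mfr, s.doors, s.body_type, b.model, s.color, v.vin
        FROM vin AS v
        JOIN style AS s ON s.style_id = v.style_id
        JOIN base  AS b ON b.base_id  = s.base_id
        ORDER BY v.vin
    """).fetchall()
```

Three vehicles are described here with one Base row and two Style rows, whereas the single-table schema would repeat the year, manufacturer, and model in every record.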
The static nature of prior art catalog applications and catalog engines makes it difficult to accommodate such changes in a database. In most cases, it will be necessary to rewrite the application or engine, which can be extremely laborious (or impossible for the end user that may not have access to the application's original code). Some prior art catalog engines can create different classifiers for different attributes, but these classifiers must conform to a fixed database format, and changes nearly always necessitate extensive import/export operations. These problems are particularly troublesome for volatile databases (whose structure might change often). It would, therefore, be desirable to devise a more flexible catalog engine that can easily support changes in the catalog database schema. It would be further advantageous if the improved catalog engine were highly scalable for very large database applications.
It is therefore one object of the present invention to provide an improved classification system.
It is another object of the present invention to provide such an improved classification system that easily accommodates changes in the underlying database schema.
It is yet another object of the present invention to provide an improved method of constructing an electronic catalog using such a classification system having schema independence, which imparts performance increases when the database is optimized for specific situations.
The foregoing objects are achieved in a method of managing a computer database (e.g., a database for an electronic catalog) generally comprising the steps of importing data into a database residing on a computer system, constructing a schema object to represent the database schema, and using an aggregate classifier to manipulate the database. The schema object may be constructed by defining a plurality of classifier definitions corresponding to specific columns and tables in the database. The classifiers may include: a "property" classifier which interacts with a single column on a single table; an "object" classifier which contains one or more of the "property" classifiers; a "split-object" classifier which makes more than one "object" classifier appear as a single classifier; a "join" classifier which identifies how multiple database objects are linked in a "split-object" classifier; and a "mapped property" classifier as a special form of the "split-object" classifier to manage data stored in a table of the database which serves as an index to another database table. Parameterized classifiers may also be defined which are templates for classifiers that are instantiated when associated parameters are provided.
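One way to picture the relationship among the "property", "object", "join", and "split-object" classifiers is the following sketch. It is an assumption-laden illustration, not the claimed implementation: every class name, field name, and the `select_sql` method are hypothetical, and the "mapped property" and parameterized classifiers are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PropertyClassifier:
    """Interacts with a single column on a single table."""
    name: str
    table: str
    column: str


@dataclass
class ObjectClassifier:
    """Contains one or more property classifiers for one table."""
    name: str
    table: str
    properties: List[PropertyClassifier]


@dataclass
class JoinClassifier:
    """Identifies how two database objects are linked."""
    left_table: str
    left_column: str
    right_table: str
    right_column: str


@dataclass
class SplitObjectClassifier:
    """Makes several object classifiers appear as a single classifier."""
    name: str
    objects: List[ObjectClassifier]
    joins: List[JoinClassifier]

    def select_sql(self) -> str:
        # Hypothetical helper: expose every property as one flat result set.
        columns = ", ".join(
            f"{p.table}.{p.column} AS {p.name}"
            for obj in self.objects
            for p in obj.properties
        )
        sql = f"SELECT {columns} FROM {self.objects[0].table}"
        for j in self.joins:
            sql += (
                f" JOIN {j.right_table} ON {j.left_table}.{j.left_column}"
                f" = {j.right_table}.{j.right_column}"
            )
        return sql


base = ObjectClassifier("base", "base", [
    PropertyClassifier("year", "base", "year"),
    PropertyClassifier("mfr", "base", "mfr"),
    PropertyClassifier("model", "base", "model"),
])
style = ObjectClassifier("style", "style", [
    PropertyClassifier("doors", "style", "doors"),
    PropertyClassifier("color", "style", "color"),
])
vehicle = SplitObjectClassifier(
    "vehicle", [base, style],
    [JoinClassifier("base", "base_id", "style", "base_id")],
)
```

In this sketch a caller works only with the `vehicle` classifier; which table holds each column is recorded in the definitions rather than in the calling application.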
The invention allows the classification system to modify the database structure and easily conform the classification engine to the modified structure without recompiling the engine or rewriting the catalog application. The engine is conformed to the new structure by constructing a second schema object for the modified database. The schema objects are preferably constructed from classifier definitions that are defined using a field-based language such as extensible markup language (XML). The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.
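The idea of conforming an engine to a modified schema by loading a second set of definitions can be sketched as follows. The XML element and attribute names (`schema`, `property`, `name`, `table`, `column`) and the function names are hypothetical and do not represent the definition format of the invention. The sketch shows only that the same code can serve two database structures when the structure is described in data.

```python
import xml.etree.ElementTree as ET

# Hypothetical definitions for a single-table schema.
FLAT_SCHEMA_XML = """
<schema>
  <property name="model" table="inventory" column="model"/>
  <property name="color" table="inventory" column="color"/>
</schema>
"""

# Hypothetical definitions for the same data after it is split across tables.
SPLIT_SCHEMA_XML = """
<schema>
  <property name="model" table="base" column="model"/>
  <property name="color" table="style" column="color"/>
</schema>
"""


def load_schema(xml_text):
    """Build a simple schema object: property name -> (table, column)."""
    root = ET.fromstring(xml_text.strip())
    return {
        prop.get("name"): (prop.get("table"), prop.get("column"))
        for prop in root.findall("property")
    }


def qualified_column(schema, property_name):
    """Resolve a property to its table-qualified column for the given schema."""
    table, column = schema[property_name]
    return f"{table}.{column}"


first_schema = load_schema(FLAT_SCHEMA_XML)
second_schema = load_schema(SPLIT_SCHEMA_XML)
```

Here `qualified_column(first_schema, "color")` resolves to the `inventory` table and `qualified_column(second_schema, "color")` resolves to the `style` table, with no change to the functions themselves.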