All of the material in this patent application is subject to copyright protection under the copyright laws of the United States and of other countries. As of the first effective filing date of the present application, this material is protected as unpublished material. However, permission to copy this material is hereby granted to the extent that the copyright owner has no objection to the facsimile reproduction by anyone of the patent documentation or patent disclosure, as it appears in the United States Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
Not Applicable
1. Field of the Invention
The invention relates to improvements in data definitions as applied to relational databases. More particularly, the invention relates to encapsulation methods of user data to eliminate database schema changes whenever the presentation of the user data changes.
2. The Prior Art
FIG. 1 illustrates a functional block diagram of a typical prior art telephone network for bulk calling using analog phone lines to a Public Switch Telephone Network (PSTN). Several Customer Telephones 102, are coupled to the PSTN 104. A plurality of Operator Stations 110 with a Server Database 108 connected to a Network 106 is shown. Results from customer directory assistance inquires which are retrieved from the Server Database 108 are displayed on the appropriate Operator Station 110. At each Operator Station 110, an Operator Telephone 112 is connected to the PSTN 104. The connection between the Operator Telephone 112 to the PSTN 104 is over analog lines 114. An Operator sitting at an Operator Station 110 uses the Operator Telephone 112 connected to the PSTN 104, to perform directory information search requests on the Server Database 108. In some embodiments, the number in the database is automatically dialed from the Server Database 108 for the Operator sitting at the Operator Station 110. The Operator Stations 110 comprise a workstation, a dumb terminal or a generic information processing unit running Windows, DOS, Linux, Macintosh or equivalent. The Server Database 108 comprise any of a number of machines similar to those mentioned for the Operator Station 110 but that responds to data queries from the Operator Station 110. On a periodic timetable, directory assistance utilization information from the Operator Stations 110 is gathered into a Relational Database 116. For the purposes of the disclosure, both the Operator Station 110, the Server Database 108, and the Relational Database 116 are to be known as xe2x80x9cinformation processing unitsxe2x80x9d. The Operator Station 110 comprising the aforementioned information processing units are used for a variety of purposes. These include database information storage, web based audio visual communication, system control, directory assistance utilization information storage and other generic processing functions. Relational Database 116 is used to report information about directory assistance utilization. This information is variable and subject to change. Requirements for the information to be collected and how it is reported vary by customer. A problem with the prior art is that it is difficult and expensive to change and to customize the information being collected and the reports that present the information.
FIG. 2 is a block diagram 200 of call interaction between a Caller 202 using Customer Telephone 102 and an Operator 204 using the Operator Station 110 and PSTN of FIG. 1. A call is an interaction between a Caller 202 and a directory assistance Operator 204. During the call, the Caller 202 can request one or more telephone number listings. Typically, the Operator 204 may perform one or more searches of Listing Database Tables 206 in Server Database 108 for each request. The call is terminated when the caller hangs up or is transferred to another operator.
FIG. 3 is a block diagram 300 of prior art call statistics generation after each operator call using the PSTN of FIG. 1. Call Statistics are generated after each call has terminated in FIG. 2 and are stored in a temporary repository on the Operator Station 110. When a data collection Instance occurs, the interim Call Statistics are gathered and summarized into a Call Statistic Record 302 which is comprised of Organizational Type Fields 308 and Call Data Type Fields 310, then sent to a Central Data Collector 304 which processes the data collections into a Relational Database Table 306 in Relational Database 116 so that they can be archived, analyzed, and displayed.
The Organizational Type Fields 308 in the Call Statistic Record 302 are generally composed of two subtypes: 1) Business Organizational Type Fields 400; and 2) Time Organizational Data Type Fields 414. FIG. 4A is a block diagram 400 illustrating an exemplary prior art hierarchy of Business Organizational Type Fields of the directory assistance call statistics generated in FIG. 3. In this example the Operator 402 is part of a physical or geographic entity or Unit 404, say Miami and the Unit 404 is part of a Complex 404 such as a regional telephone company, such as Bell Atlantic.
A prior art time hierarchy of Time Organizational Type Fields 414 for directory assistance call statistics generated in FIG. 3 are shown in FIG. 4B. Data Collection Instance 408, Session-Shift 410 and Year-Month-Day 412 fields are illustrated.
The Call Data Type Fields 310 in the Call Statistic Record 302 are generally composed of two subtypes: 1) Call Data Numeric Information Type Fields 500; and 2) Call Data Activities Type Fields 512. FIG. 5A is a block diagram illustrating exemplary prior art Call Data Numeric Information Type Fields of the directory assistance call statistics generated in FIG. 3. Call Data Numeric Information Type Fields 500 for a directory assistance application represent numeric information such as Count 502 and Duration 504.
The FIG. 5B is a block diagram illustrating exemplary prior art hierarchy of Call Data Activities Type Fields 512. Activities shown are Search 506 within Request 508 within Call 510.
FIG. 6 shows a prior art Call Statistic Record 302 generated during a data collection instance for an Operator Station 110. The Business Organization Type Fields 400 of the Complex 406, the Unit 404 and the Operator 402 are shown as inputs to the Organizational Type Fields 308. The Time Organizational Type Fields 414 of the Year-Month-Day 412, the Session-Shift 410, and the Data Collection Instance 408 are shown as inputs to the Organizational Type Fields 308. The Call Data Activities Type Fields 512 of the Call 510, the Request 508, and the Search 506 are combined with The Call Data Numeric Information Type Fields 500 of the Count 502 and the Duration 504 as inputs to the Call Data Type Fields 310 for Call Statistic Record 302.
The user interface at the Operator Station 110 is very flexible and customizable. The requirements for the directory assistance utilization information to be collected vary by customer and can change as the interface at the Operator Station 110 is modified. At each instance of data collection, the directory assistance utilization information reflects activities that were performed in that timeframe and thus are a subset of all possible activities.
The implementation of the directory assistance utilization information data collection has many shortcomings. The format of the directory assistance utilization information collected by the Operator Station 110 is a single fixed length record containing a series of text and numeric quantities. What these quantities represent cannot be determined by inspecting the record itself. The record format is also designed to contain values for all possible information that can be collected. Frequently each record generated at a data collection instance will contain zeros where the information did not exist. At each data collection instance, each quantity is stored into a predetermined position in the record by software on the Operator Station 110. Each time new information needs to be collected, both the software on the Operator Station 110 and the record format must be changed to add the new data.
The implementation of the Relational Database 116 for directory assistance utilization information is similar to the implementation of the directory assistance utilization information data collection on the Operator Workstation 110. The Central Data Collector 304 contains software that extracts each quantity from each predetermined position in the record and stores that quantity in a predetermined field name in the Relational Database 116 schema. Each time the record format is changed on the Operator Station 110, the software on the Central Data Collector 304 must change to accept the new format and extract the new data. The schema of the Relational Database 116 also needs to have new fields added into tables.
The presentation of the directory assistance utilization information stored in the Relational Database 116 in runtime SQL queries and stored reports suffers from additional flaws. The data stored in the schema represents raw information. Additional knowledge is required if a user wishes to do further processing of the information in runtime SQL queries. Data can easily be misrepresented. The stored reports are complex. Raw information is retrieved from the database and then manipulated in compiled program logic that differs for each report. Each time the schema of the Relational Database 116 needs to be modified to add new fields, extensive modifications to numerous stored reports by a programmer with knowledge of the new data being collected is required. In order to validate that the modifications to the stored reports are correct, a comprehensive set of test cases must be developed and then executed in a functioning test environment. This can require a significant amount of test equipment and resources.
In order to rollout changes to the record format, Relational Database 116 schema, and stored reports, a costly and time consuming migration process is required. This process is difficult to implement on a running customer system without sacrificing data integrity. During the migration, the customer system must be able to keep processing calls. The Operator Stations 110 must be migrated in phases, since a shutdown and restart is required to start using the new software. Since for a period of time, both the old and the new record formats will be used, the Relational Database 116 must be migrated to the new schema and stored reports first. It will need to be shutdown during the migration in order to save the old data, convert to the new schema, then convert the old data and store it into the new tables. While the Relational Database 116 is not operational, the Central Data Collector 304 must temporarily gather the data that is still arriving in the old format. When the Relational Database 116 migration is complete, before it can be restarted, the Central Data Collector 304 must be shutdown so that its software can be modified to accept both types of record formats. All data received in the old format needs to be converted to the new format with zeros to reflect new data that was missing in the old format. The data stored temporarily when the Relational Database 116 was shutdown must also be converted to the new format. The Central Data Collector 304 can then be restarted, and then the Relational Database 116 can be restarted. The potential for data loss exists while the Central Data Collector 304 is shutdown during its migration.
After all the Operator Stations 110 are migrated, the software on the Central Data Collector 304 can be modified to accept the new format only. Again the Central Data Collector 304 will require a shutdown and restart with the potential for data loss during that time. Data loss occurs if the Operator Workstations 110 send data to the Central Data Collector 304 while it is not able to receive it from its buffers, and the buffers become full.
Accordingly a need exists for a new data definition design that overcomes the problems in the prior art and moves away from data specific layout so that migration problems with hard coded database definitions can be minimized, and the capability to customize a base installation can be provided.
Briefly according to the present invention, a method for encapsulating the form and function of user data in a relational database that eliminates database schema changes whenever the presentation of the user data changes is disclosed. The method comprises receiving source data, associated source data definitions and report layouts consisting of report headings and report logic. The source data definitions and the report layouts are parsed into three or more user data types: (1) an organizational type fields identifying a data collection; (2) value type fields containing data values in the data collection; and (3) procedural type fields containing one or more form keywords and function keywords. After the source data is received and parsed, the method includes the steps of: creating one or more form keywords for classifying and describing the data values that can be collected and presented based upon the source data definitions and the report headings; creating one or more function keywords for identifying an arithmetic relationship for the data values in the data collection based upon the report logic; creating blank or NotApplicable form and function keywords to indicate when a form or function keyword in the procedural type field does not apply for some user data. Next, the form keywords and the function keywords are associated with the source data based upon a conversion mapping table that specifies which of the form keywords and which of the function keywords are assigned to the source data definitions so as to create user data. A user data database schema is created by assigning names and data types to each of the organizational type fields, the value type fields and the procedural type fields and the user data is stored in a relational database using the user database schema. Lastly, the user data is presented from the relational database so that standard database queries are used to present the data without the need for additional algebraic processing, whereby changes to the presentation of the user data do not require changes to the user database schema, but rather the changes to the presentation of the user data are based upon changes to one or more of the source data, the source data definitions, the conversion mapping table, the form keywords and the function keywords.
Significant savings can be achieved with a new Relational Database 116 data definition design that eliminates changing the schema to reflect new types of information. Stored reports that are redesigned to display data directly via SQL functions without compiled code special handling of data content, will not need to be changed as new types of data are added. Changes to the record format of the directory assistance utilization information data collection on the Operator Workstation 110 can then be implemented without requiring migrations to a new relational database and new stored reports. The addition of a conversion routine that converts the record collected from the Operator Station 110 into redesigned records can also eliminate the need to change the software on the Central Data Collector 304. By designing the conversion routine using a table driven approach, software changes to this routine can also be eliminated.