This invention relates to a large scale data warehousing system to support and supply an automatic real-time personalized intelligence network that actively delivers personalized and timely informational and transactional content from an OLAP-based system to individuals through use of a high-speed processing and output delivery system to email, pager, mobile phone, fax, telephone, personal digital assistants, wireless-access protocol (WAP) devices and other terminal devices. The invention stores, retrieves and maintains the large quantity of information needed to service that range of information delivery in the underlying service. Users may subscribe to various channels of content, and to specific services within each channel that are delivered when a predetermined condition occurs (e.g., based on a schedule, when an exception condition occurs, or in response to a specific initiation request).
Information is most useful when it is delivered to the right person at the right time. Delivery of the right information to the right person has been a problem that many businesses have attempted to solve over the years. Indeed, an entire industry of decision support technology exists to deliver information to members of a business based on massive amounts of data collected about the business. While many such systems exist, most are implemented for delivery of information to businesses and not to individuals. These systems also usually require that a user log in to the system to seek out information. If information of interest changes rapidly, users must continuously log on to the system to check for updated information.
Decision support systems have been developed to efficiently retrieve selected information from data warehouses. One type of decision support system is known as an on-line analytical processing (OLAP) system. In general, OLAP systems analyze the data from a number of different perspectives and support complex analyses against large input data sets. There are at least three different types of OLAP architectures: ROLAP, MOLAP, and HOLAP. ROLAP ("Relational On-Line Analytical Processing") systems are systems that use a dynamic server connected to a relational database system. Multidimensional OLAP ("MOLAP") utilizes a proprietary multidimensional database ("MDDB") to provide OLAP analyses. The main premise of this architecture is that data must be stored multi-dimensionally to be viewed multi-dimensionally. A HOLAP ("Hybrid On-Line Analytical Processing") system is a hybrid of these two. Each of these types of OLAP systems is typically a client-server system. The OLAP engine resides on the server side and a module is typically provided at the client side to enable users to input queries and report requests to the OLAP engine. Many current client-side modules are stand-alone software modules that are loaded on client-side computer systems. These systems require that a user learn how to operate the client-side software module in order to initiate queries and generate reports.
An OLAP product developed by MicroStrategy, known as MicroStrategy Broadcaster™, leverages this decision support technology for automatic delivery of reports based on database contents. MicroStrategy Broadcaster is an OLAP-based system that provides businesses and other users with the ability to set up "services" to which participants may subscribe. The service provides content based on data in a database, such as a data warehouse, and may be personalized to users' tastes. For example, while a service may be generated for stock in the warehouse of a company, different sales managers may only want to know the stock for a particular product line. Those sales managers may then personalize the report generated by MicroStrategy Broadcaster™ so that the report only includes information about the product line of interest.
Although some push technologies have been developed for automatically delivering content to users, most systems simply "dump" information about a particular subject without regard to users' particular preferences or interests. Some such technologies are available on the World Wide Web and the Internet.
The World Wide Web and the Internet have provided an avenue for information delivery, but current Web-based systems still fail to adequately deliver the right information at the right time. One of the major problems with the World Wide Web is the requirement to utilize a computer and web-browser to access its contents. Although penetration of computers throughout the world has increased, that penetration is far from making information readily available to everyone wherever they happen to be.
Moreover, most computer users connect to the Web through a land line. Most users therefore do not have access to Web content when they are away from a land line. Although technology is being developed to enable World Wide Web access through other mediums, such as web-enabled personal digital assistants, such technology requires users to purchase new equipment. Given the sparse penetration of personal digital assistants, this technology does not satisfy the need for delivery of timely information.
Another system in use today is an interactive telephone system that enables users to interactively request information through a computerized interface. These systems require that the user call in to a central number to access the system and request information by stepping through various options in predefined menu choices. Such information may include accessing account information, movie times, service requests, etc.
A problem with these systems is that the menu structure is typically fixed and is customized neither to a particular user's preferences nor to the information available to that user. Therefore, a user may have to wade through a host of inapplicable options to get to the one or two options applicable to that user. Further, a user may be interested in a particular report. With existing telephone call-in systems, that user has to input the same series of options each time they want to hear the results of that report. If the user desires to run that report frequently, the telephone input system described is a very time-consuming and wasteful method of accessing that information. Also, if a particular user is only interested in knowing whether a particular value or set of values in the report has changed over a predetermined period of time, the user would be required to initiate the report frequently and then scan through the new report to determine if the information has changed over the time period specified.
Further, reports may be extensive and may contain a large amount of information for a user to sort through each time a report is run. Therefore, the user may have to wait a long time for the report to be generated once they input the appropriate parameters for the report.
Therefore, existing systems do not provide a readily available medium for delivery of the right information at the right time or a system for delivering that information. These and other drawbacks exist with current systems.
The scale and range of the information offered to subscribers to a broadbased information service places corresponding demands on data warehousing and retrieval, which must be executed with reliability and efficiency in order to maintain realtime delivery.
This invention provides a system and method for collecting, storing and updating the data storage infrastructure used in generating and delivering message content to a collection of subscribers via e-mail, voice mail and other media on a high volume, high throughput basis. The underlying information service which the invention supports receives business, weather, news and other topical information from a variety of sources or feeds. The invention collects that data and warehouses it for realtime combination with other information for delivery over selected channels via distributed servers, such as by e-mail.
An illustrative architecture to deliver those services includes a data distribution repository which stores and manages the large quantity of information parsed from the information feeds, which may be directed to an OLAP-enabled database or databases. The data distribution repository communicates with a data distribution control system to perform mid-tier slicing across the information databases for information common to all subscribers. The data distribution repository transmits the base information to data distribution servers to process with subscriber filters for the addition of information particular to individuals, before transmission. The resulting high-throughput information service may deliver the resulting content, for example, via plain-text or HTML email messages, spreadsheet data, pages, telephone calls, mobile phone calls, fax transmissions and other formats. The rate of email delivery may for instance be at least 50-60 SMTP transmissions as described below and is scalable, while the types of information content communicated to the end subscribers are extensible.
To permit the reliable storage, retrieval and maintenance of that transmission architecture, the invention deploys the data distribution control server to continuously monitor the functioning of the data distribution repository. As described more fully below, the data distribution repository receives continuous, scheduled or event-triggered data feeds from a variety of topical information sources, such as financial news streams or government or commercial weather updates. The resulting data population is managed by the data distribution control server and other components to maintain the coherency and integrity of that information, for instance under fault conditions, interruptions of service and during the servicing of subscriber inquiries.
During subscriber inquiries in particular, any update process on the full data image of the data distribution repository is suspended so that data consistency is maintained. Data writeback, reversion to former states and other failsafe techniques may be used when an error is encountered during any phase of information retrieval and delivery, as described below. The currency of the personalized intelligence delivered to the subscriber base is maximized, and the reliability of the data service is enhanced.
This invention provides a system and method for providing a personal intelligence network that actively delivers highly personalized and timely information to individuals via e-mail, spreadsheet programs (over e-mail), pager, telephone, mobile phone, fax, personal digital assistants, HTML e-mail, WAP device and other formats. In this system, informational and transactional data may be loaded and formatted into a database system. The database system may then provide a plurality of "channels" wherein each channel may comprise information and transactional data about a particular field of interest, such as business, weather, sports, news, investments, traffic and others. Subscribers may then sign up to receive output from one or more services from one or more of the channels of information. A service should be understood to be formatted content that is sent to certain subscribers at a certain frequency or based on the occurrence of a predetermined event, such as an update to a database. For example, a service for a finance channel may be called "Market Update" that sends an email to subscribers every day at 5 p.m. with a summary of the market results for the day. That same service may be scheduled to run periodically throughout the day when new market information is loaded into the finance channel database. These are only two examples of the many types of services that may be processed by the system of the present invention.
A subscriber is any individual or entity that signs up to receive a service. A service may be delivered based on a schedule, an exception (such as an alert trigger condition) or upon initiation by an external system or person. A schedule is the frequency for which a service is sent to be processed (e.g., end-of-day (after 5 p.m.), intra-day (every hour between 10 a.m. and 5 p.m.), end-of-week (5 p.m. on Friday)). A style refers to the presentation of the output of a service (e.g., a different style exists for a pager versus an email output due to the device constraints). Each subscriber may also select to personalize the service content. Personalization may include preferences for types of content, information, etc. that the user desires to receive within the scope of a particular service. For example, for a service that sends an end-of-market report, the user may only want to see the portions of the report that deal with stocks in her individual portfolio. The service output may also include non-personalized content such as in the previous example, the Dow Jones Average for the day.
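The relationship between personalization and style described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Subscription` class, the `personalize` and `render` functions, and the 80-character pager limit are hypothetical names and values chosen for the example, not part of the described system.

```python
from dataclasses import dataclass, field

# Hypothetical per-device limits. The text only says that styles differ
# by device; the 80-character pager limit is an assumption.
STYLE_MAX_CHARS = {"email": None, "pager": 80}


@dataclass
class Subscription:
    subscriber: str
    device: str                                  # selects the style
    tickers: set = field(default_factory=set)    # personalization choices


def personalize(report, subscription):
    """Keep only the rows of the report that the subscriber asked for."""
    return {t: p for t, p in report.items() if t in subscription.tickers}


def render(rows, device):
    """Format the rows and apply the style limit of the target device."""
    text = ", ".join(f"{t} {p:.2f}" for t, p in sorted(rows.items()))
    limit = STYLE_MAX_CHARS[device]
    return text if limit is None else text[:limit]
```

In this sketch the same service content yields different output per subscriber (through `personalize`) and per device (through `render`), which mirrors the separation between personalization and style.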
The system may include local, national and international data to enable users to receive a wide variety of information and data. The system also enables affiliates to participate and include affiliate-specific information in the outputs generated from the system. An affiliate may comprise an entity that establishes a relationship with the host system to distribute content to its subscribers or customers. For example, an Internet service provider, such as Earthlink, may offer its customers the option of receiving information and may then include Earthlink specific information in the content distributed to its subscribers.
According to the present invention, one or more channels of personalized intelligence information are accessed and distributed to subscribers to one or more services provided for each channel. Subscribers may sign up to receive one or more services for each of the one or more channels through a web interface system that identifies each of the available types of information that the user may access. The subscription interface may also be a mobile phone, a land-line phone, or any other method of subscribing. The subscriber may input personalization options through the web interface so that the service output generated for that subscriber matches what the user wants to receive. The subscriber information may be stored in a subscription database whose contents are periodically provided to the channel databases. The subscription information for each subscriber to a service handled by a channel database may be stored for the service for later processing and generation of service output by the system.
The channel databases are populated with information and other data content through one or more data load systems. The data load systems may receive information through continuous feed systems, such as satellite and land line feeds, or through periodic feed systems, such as an FTP data feed system. The data load system cleanses and categorizes the data and then stores the data in the appropriate channel database for later processing.
Further, a data distribution system is provided that processes services using the information in the channel databases. The data distribution system may comprise a data distribution control system and one or more data distribution servers. The data distribution control system controls the operations of a plurality of data distribution servers to balance the load and generate greater output in an efficient manner. Each data distribution server system may comprise a server control system and a plurality of message generator systems (each of which passes generated messages to a mail formatting system and mail forwarding system). The server control system further breaks down the jobs assigned for each of the plurality of message generator systems. That way, a multiple-tiered processing system is provided to distribute the processing load throughout the system.
A nerve center is provided to control the overall operation of the system. Specifically, the nerve center tracks updates to the channel database and the data load system and controls operation of the data distribution system. The nerve center monitors services to determine whether the data necessary is available in the channel database before the service is processed in that database. Before any service is processed by a data distribution system, the nerve center is notified and grants approval. The nerve center is also responsible for monitoring for system performance to avoid errors and faults and has the capability to redirect work within the system to overcome errors or faults with any particular component of the system.
As an example of the present system, a Finance channel may be provided that has information about investments. A separate channel database may be established that contains information for the Finance channel. Within the Finance channel, a plurality of different services may be created, such as Market Update, Stock Portfolio Update, Low P/E in Sector, Biggest Gainers, Biggest Losers, etc. When the service is set up, the predetermined condition for when a service is to be processed may be specified. For this service, the schedule may be hourly (or shorter as the user may specify, even every minute or less if the subscriber were so inclined). Subscribers to the Finance channel may then sign up to receive information from the Finance channel and specifically, may sign up to receive one or more of the services. For each service, the subscriber may also personalize the service, such as requesting a Stock Portfolio Update only for that subscriber's individual stocks or signing up for Biggest Gainers only for a particular industry sector. Also, the subscriber may only want to receive updates every three hours instead of hourly.
Alert services for the finance channel may also be provided. In this embodiment, a subscriber may select to be notified immediately after his stock portfolio experiences a predetermined amount of change, such as 10%, for example.
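A minimal sketch of such an alert condition follows. The function names are hypothetical, and treating a drop and a rise of the same magnitude alike is an assumption, since the text only speaks of "a predetermined amount of change".

```python
def portfolio_change(previous_value, current_value):
    """Fractional change of the portfolio relative to its previous value."""
    if previous_value <= 0:
        raise ValueError("previous_value must be positive")
    return (current_value - previous_value) / previous_value


def should_alert(previous_value, current_value, threshold=0.10):
    """True when the absolute change reaches the threshold (assumed 10%)."""
    change = portfolio_change(previous_value, current_value)
    return abs(change) >= threshold
```

An alert service could evaluate `should_alert` each time new prices are loaded and trigger delivery only when it returns true, rather than on a fixed schedule.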
The services, predetermined processing conditions, and subscribers thereto are then also stored in the Finance channel database. Once a service has been added to a channel database, the nerve center is informed so that it can monitor the service and make sure that it is executed. The nerve center may either assign a service to a specific data distribution system for that data distribution system to process every time, or may place the service in a scheduling queue to be assigned to a data distribution system when the predetermined condition for processing occurs.
When that occurs, the nerve center first checks to ensure that the data to be used for that service has been loaded into the channel database. For example, if the service is a Market Update, the nerve center checks the Finance channel to make sure that the end-of-the-day market information has been loaded into the Finance channel before beginning to process the service. If not, the service may be delayed until the data is available. When the data becomes available, the nerve center then tasks the data distribution system with processing a particular service.
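The readiness check described above can be sketched as a small gate that delays a service until its data has been loaded. The `NerveCenter` class below and its method names are hypothetical, and identifying data by a `(channel, dataset)` pair is an assumption made for the example.

```python
from collections import deque


class NerveCenter:
    """Toy model of the data-availability check before a service runs."""

    def __init__(self):
        self.loaded = set()       # (channel, dataset) pairs already loaded
        self.delayed = deque()    # services waiting for their data
        self.dispatched = []      # services handed to data distribution

    def request(self, service, channel, dataset):
        """Dispatch the service if its data is loaded, otherwise delay it."""
        if (channel, dataset) in self.loaded:
            self.dispatched.append(service)
            return True
        self.delayed.append((service, channel, dataset))
        return False

    def data_loaded(self, channel, dataset):
        """Record a completed load and release any services waiting on it."""
        self.loaded.add((channel, dataset))
        still_waiting = deque()
        for service, ch, ds in self.delayed:
            if (ch, ds) in self.loaded:
                self.dispatched.append(service)
            else:
                still_waiting.append((service, ch, ds))
        self.delayed = still_waiting
```

For instance, a request for Market Update made before the end-of-day data arrives is held in `delayed` and is moved to `dispatched` once `data_loaded` is called for that data.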
The data distribution system may then process a service in several ways. A service may comprise a collection of sub-services, one or more of which are to be processed prior to one or more others. In such event, each of a plurality of different data distribution servers processes a separate sub-service. Each sub-service may be further broken down into jobs for each of the message generators managed by a server control system within the data distribution server.
If a service relates to only a single task, then the service may be assigned to a data distribution system. The data distribution control system may break the service into a plurality of jobs assigned to a plurality of different data distribution servers. A server control system for the data distribution server may further break each of the jobs assigned into a plurality of batches, each batch being handled by one of a plurality of message generators. The message generators process each item within a batch and generate the appropriate messages to be output through a message mail formatting system and mail forwarding system to subscribers.
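The two-level breakdown, from service to per-server jobs and from jobs to per-generator batches, can be sketched as follows. The even, contiguous split is an assumption made for illustration; the text does not state how work is apportioned, and the function names are hypothetical.

```python
def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def plan_service(subscribers, n_servers, n_generators):
    """Return plan[server][generator], a batch of subscribers for each generator."""
    # First tier: the data distribution control system assigns one job per server.
    jobs = split_evenly(subscribers, n_servers)
    # Second tier: each server control system splits its job into batches.
    return [split_evenly(job, n_generators) for job in jobs]
```

With ten subscribers, two servers and three message generators per server, each server receives five subscribers and divides them into batches of two, two and one.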
Additionally, to increase throughput, non-personalized content from a service may be processed separately, rather than processing that for each of a plurality of subscribers. The non-personalized content may then be provided to the message generator to include in the messages for each personalized output for the subscribers.
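A minimal sketch of this optimization follows: the shared content is produced once and then combined with each subscriber's personalized portion. The function `build_messages` and its callback parameters are hypothetical names, and joining the two parts with a newline is an assumption.

```python
def build_messages(shared_content, subscribers, personalize):
    """Build one message per subscriber, computing shared content only once.

    shared_content: zero-argument callable returning the non-personalized text.
    personalize: callable mapping a subscriber name to that subscriber's text.
    """
    shared_text = shared_content()   # evaluated once, not once per subscriber
    return {name: f"{shared_text}\n{personalize(name)}" for name in subscribers}
```

If the shared portion is expensive to generate, such as a market-wide summary, the saving grows with the number of subscribers, since only the personalized portion is computed per message.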
These techniques enable a high throughput output system for processing a plurality of outputs to a large number of subscribers. Other objects and advantages exist for the present invention.