Computerized personal information retrieval systems exist for identifying and recording segments of radio or television broadcasts that contain topics that a user desires to record. The desired segments are usually identified based upon keywords input by the user. In a typical application, a computer system operates in the background to monitor the content of information from a source such as the Internet. The content selection is guided by the keywords provided by the user. When a match is found between the keywords and the content of the monitored information, the information is stored for later replay and viewing by the user. Although the downloaded information may include links to audio or video clips that can also be downloaded by the user, the selection of the information for storage is based primarily on the frequency at which the keywords provided by the user appear in the text of the broadcast materials.
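The frequency-based selection described above can be illustrated with a minimal sketch. It is not taken from any particular retrieval system. The function names, the per-word frequency measure, and the 0.05 threshold are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def keyword_frequency(text, keywords):
    """Return keyword occurrences per word of text (0.0 for empty text)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    hits = sum(counts[k.lower()] for k in keywords)
    return hits / len(words)


def select_for_storage(transcripts, keywords, threshold=0.05):
    """Keep transcripts whose keyword frequency meets the assumed threshold."""
    return [t for t in transcripts
            if keyword_frequency(t, keywords) >= threshold]


# Hypothetical monitored text and user keywords.
transcripts = [
    "The election results came in as voters left the polls",
    "Rain is expected across the region tomorrow",
]
stored = select_for_storage(transcripts, ["election", "voters"])
```

In this sketch only the first transcript is stored, because it is the only one in which the user's keywords appear.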
A computerized personal information retrieval system that allows users to select and retrieve portions of radio or television programs for later playback generally meets three primary requirements. First, it provides a system and method for parsing an incoming video signal into its visual, audio, and textual components. Second, it provides a system and method for analyzing the content of the audio and/or textual components of the broadcast signal with respect to user input criteria and for segmenting the components based upon content. Third, it provides a system and method for integrating and storing program segments that match the user's requirements for later replay by the user.
A system that meets these requirements is described in U.S. patent application Ser. No. 09/006,657 filed Jan. 13, 1998 by Dimitrova (a co-inventor of the present invention) entitled “Multimedia Computer System with Story Segmentation Capability and Operating Program Therefor.” U.S. patent application Ser. No. 09/006,657 is hereby incorporated by reference within this patent application for all purposes as if fully set forth herein.
U.S. patent application Ser. No. 09/006,657 describes a system and method that provides a set of models for recognizing a sequence of symbols, a matching model that identifies a desired selection criterion, and a methodology for selecting and retrieving one or more video story segments or sequences based upon the selection criterion.
U.S. patent application Ser. No. 09/006,657 does not specifically address the problem that results when the methodology for segmenting the broadcast information into independent stories is primarily centered on visual content rather than on video segmentation enhanced with audio and textual analysis. Currently, when video-segmented stories are classified according to keywords, the analysis assumes that the detection of specified keywords at a required frequency in the segment indicates that the whole segment can be categorized by a single set of keywords. In actuality, there is a high probability that the frequency of appearance of specific keywords may change with time across a broadcast segment even when the video criteria for story segmentation are satisfied.
Therefore, video segmentation may result in one or more segments that each contain multiple stories yet may be classified according to a single set of keywords. The keywords that are selected to classify a segment with multiple stories may or may not be applicable to each story within the segment.
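The problem can be made concrete with a small sketch that compares whole-segment keyword classification with classification over successive windows of the segment text. It is an illustration under stated assumptions, not the method of the referenced application. The function names, the fixed two-line window, the keyword vocabulary, and the sample transcript are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def top_keywords(lines, vocabulary, n=2):
    """Return the n most frequent vocabulary words in the given lines."""
    counts = Counter(
        word for line in lines for word in tokenize(line)
        if word in vocabulary
    )
    return [word for word, _ in counts.most_common(n)]


def windowed_keywords(lines, vocabulary, window=2, n=2):
    """Classify each consecutive window of lines separately."""
    return [
        top_keywords(lines[i:i + window], vocabulary, n)
        for i in range(0, len(lines), window)
    ]


# Hypothetical transcript of one video segment that holds two stories.
segment = [
    "election officials counted every vote in the election",
    "the election vote margin stayed narrow after the vote",
    "a storm brought heavy rain to the coast",
    "rain from the storm flooded several roads",
]
vocabulary = {"election", "vote", "storm", "rain"}

whole_segment = top_keywords(segment, vocabulary)
per_window = windowed_keywords(segment, vocabulary)
```

Here the whole-segment keywords describe only the election story, while the second window yields the weather keywords. A single keyword set assigned to the entire segment therefore does not apply to the second story.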
There is therefore a need in the art for an improved system and method for identifying and segmenting broadcast information. In particular, there is a need in the art for an improved system and method for identifying and segmenting broadcast information according to keywords. More particularly, there is a need in the art for an improved system and method for automated classification of the text of individual story segments that occur within a broadcast text over a period of time.