The present invention relates generally to computer systems, and will be specifically disclosed as a method and apparatus for generating semantic tokens.
The virtual explosion of technical advances in microelectronics, digital computers and software has changed the face of modern society. In fact, these technological advances have become so important and pervasive that this explosion is sometimes referred to as "the information revolution." Through telephone lines, cables, satellite communications and the like, information and resources are ever increasingly being accessed and shared.
Some attempts have been made for computers and software to interpret and understand the content of data. One such attempt is sometimes referred to as linguistic morphology, which in general terms involves applying computational language mechanisms to text. For instance, a two- or three-page report could be summarized to produce an outline of topics or an abstract using linguistic morphological techniques.
Another attempt for computers to understand content is to attach a header or description to the data, such as a PICS (Platform for Internet Content Selection) header. PICS is used to tag data so as to provide metadata about the content of the data. For instance, a PICS header can be used to indicate whether content is violent, pornographic, or the like. PICS typically requires the cognitive input of a human to determine the content of the metadata.
Several search engines often used with the Internet, such as ALTAVISTA and EXCITE, provide relevancy determinations. For instance, when searching for information on the Internet, the search engine will list the Internet sites in order of apparent relevance, and in some instances provide a numerical indication as to the relevance. Typically, a relevancy determination is a function of the number or proximity of "hits" from the search query in the site.
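By way of illustration only, a relevancy determination of this general kind can be sketched as follows. The function name relevance_score and the particular weighting (one point per hit, plus a bonus of 1/gap for neighboring hits on different query terms) are assumptions made for this example and do not describe the actual ranking used by any search engine.

```python
import re


def relevance_score(query, text):
    """Toy relevance: number of hits plus a bonus for hits that lie close together.

    Hypothetical weighting chosen for illustration; not any real engine's formula.
    """
    terms = {t.lower() for t in query.split()}
    words = re.findall(r"[a-z0-9]+", text.lower())

    # Record the position of every word in the text that matches a query term.
    positions = [(i, w) for i, w in enumerate(words) if w in terms]
    hits = len(positions)

    # Proximity bonus: consecutive hits on different terms add 1/gap,
    # so adjacent terms contribute more than widely separated ones.
    bonus = 0.0
    for (i, a), (j, b) in zip(positions, positions[1:]):
        if a != b:
            bonus += 1.0 / (j - i)
    return hits + bonus


near = relevance_score("semantic tokens", "semantic tokens are generated")
far = relevance_score("semantic tokens", "semantic analysis may later yield tokens")
```

Both sample texts contain the same number of hits, but the first scores higher because the query terms are adjacent.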
One aspect of the present invention is a computer system. A network has a plurality of principals. A content stream in the network is associated with at least one principal. The content stream has a plurality of phrases. A marking tool has access to the content stream and is adapted to mark phrases in the content stream. A monitoring agent has access to the content stream and is operative to extract the markings. A token creation module is operative to create tokens based on the extracted markings.
Another aspect of the invention is a method in a computer system for generating tokens. A content stream having a plurality of phrases is accessed. One or more phrases in the content stream are marked. The marked one or more phrases are then extracted and processed to determine semantic information. One or more tokens are created based on the semantic information.
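By way of illustration only, the marking, extraction and token creation steps summarized above can be sketched as follows. Everything specific in the sketch is an assumption made for the example and is not taken from the disclosure: the "[[" and "]]" marking syntax, the Token fields, the function names, and the use of lower-cased keywords as a stand-in for the semantic information.

```python
import re
from dataclasses import dataclass

# Hypothetical marking syntax; the summary does not specify one.
MARK_OPEN, MARK_CLOSE = "[[", "]]"


@dataclass(frozen=True)
class Token:
    """Hypothetical token layout: who the stream belongs to and what was marked."""
    principal: str
    phrase: str
    keywords: tuple


def mark_phrases(content, phrases):
    """Marking tool: wrap each selected phrase in the content stream with markers."""
    for phrase in phrases:
        content = content.replace(phrase, f"{MARK_OPEN}{phrase}{MARK_CLOSE}")
    return content


def extract_markings(content):
    """Monitoring agent: extract the marked phrases from the content stream."""
    return re.findall(r"\[\[(.+?)\]\]", content)


def create_tokens(principal, marked_phrases):
    """Token creation module: build one token per extracted marking.

    Lower-cased keywords are a placeholder for the semantic processing step.
    """
    tokens = []
    for phrase in marked_phrases:
        keywords = tuple(w.lower() for w in re.findall(r"[A-Za-z0-9]+", phrase))
        tokens.append(Token(principal, phrase, keywords))
    return tokens


stream = "The quarterly report covers network security and token generation."
marked = mark_phrases(stream, ["network security", "token generation"])
tokens = create_tokens("alice", extract_markings(marked))
```

Here "alice" stands for a principal associated with the content stream, and each resulting token records the marked phrase together with its placeholder semantic information.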
Still other aspects of the present invention will become apparent to those skilled in the art from the following description of a preferred embodiment, which is, by way of illustration, one of the best modes contemplated for carrying out the invention. As will be realized, the invention is capable of other different and obvious aspects, all without departing from the invention. Accordingly, the drawings and descriptions are illustrative in nature and not restrictive.