Personal agents are computer programs that act on behalf of individuals, especially to perform routine, tedious, but not particularly difficult or novel tasks. The coordination, scheduling, and information gathering tasks of professional work generally require communication among individuals. These tasks are often carried out using the telephone and are prime candidates for the support of such a personal agent.
The telephone is a convenient tool for communication, not only because of its relatively low cost but also because of the almost universal availability of telephone service. Telephone communication permits the natural conversational structure inherent in face-to-face communication to take place over long distances. However, even routine communications by telephone may suffer when a called party is unavailable. When that occurs, answering machines, voice messaging systems, and even a human operator provide a means for leaving a message; however, a frequent problem is that of "telephone tag," in which two parties keep trading messages asking the other to call. Another problem that often arises in attempting to reach a called party is the time spent on hold or in navigating telephone menu systems.
Electronic mail, generally known as e-mail, provides an alternative way of communicating over long distances. E-mail does not suffer from the "tag" problem because that form of communication does not require the recipient to be in a position to observe the message at the time the e-mail message is transmitted; one may retrieve and read e-mail at any time after the message is sent, as long as the message remains electronically stored at the recipient's end. However, unlike the telephone system, e-mail is far from universally available, and use of e-mail typically requires access through a computer that is relatively expensive in comparison to a telephone; e-mail access in some environments may also require interconnection of computers through an expensive local area network. Further, e-mail does not generally maintain the conversational structure inherent in person-to-person communications; follow-up questioning may be cumbersome and generally requires additional exchange of e-mail messages. While there is e-mail technology that permits auto responses, the technology appears to be limited to capabilities such as return receipt, automatic transmission of canned messages, and automatic subscribing activities over the Internet in response to a formatted request. There is little, if any, analysis and reporting based upon messaging content.
Similarly, the telemarketing field makes little attempt to analyze the content of responses received but rather is geared toward analysis of call response patterns for the purposes of determining the allocation of resources to maximize success in making outgoing calls or in handling incoming calls. Once a call is connected it is then turned over to a live operator or, perhaps, to an interactive voice response system.
Related communication delay problems arise even with relatively simple information retrieval requests. The following example of a typical information seeking dialogue is illustrative.
Suppose it is desired to find out the price of a certain portable CD player at various stores. A person calling a store might be answered by a clerk who asks what is wanted. After the caller responds that she seeks the price of the CD player, she will likely be asked to hold while the clerk locates another employee with more information--information likely to be available on a database. Eventually, someone in the proper department will pick up the phone and ask again what is wanted. The caller will repeat the request and, perhaps after more waiting, may get an answer.
Another store may place the caller in what is known as an interactive voice response (IVR) system--typically a menu-driven system in which a caller sequentially selects various options by pressing a button on the telephone keypad in response to a set of choices. Eventually, after pressing a series of buttons, the caller may be placed on hold waiting for the next available representative according to the menu selections. Once reached, the representative might consult a database to provide the requested information. Similarly, in using an IVR system to get information about an item, the user constructs the item's description incrementally by responding to a series of menus and prompts. For example, a store having an IVR system for delivering information about items that it carries may have a main menu that tells a caller "For VCR players, press 1. For TVs, press 2. For audio components, press 3 . . . ". In response to a selection, e.g., "3" for audio components, another menu might give options like "For integrated systems, press 1. For receivers, press 2. For CD players, press 3 . . . ". Traversing a sequence of menus eventually leads callers to the items they are interested in.
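The menu traversal described above can be illustrated with a minimal sketch. The nested-dictionary layout, the name MENU, and the function find_key_sequence are hypothetical and chosen only for illustration; the menu labels are those of the example, and an actual IVR system would present its menus as audio prompts rather than as a data structure available to the caller.

```python
from typing import Optional

# Hypothetical menu tree for the store example. Each key is a keypad digit;
# each value is (label, submenu), where submenu is None for a final item.
MENU = {
    "1": ("VCR players", None),
    "2": ("TVs", None),
    "3": ("audio components", {
        "1": ("integrated systems", None),
        "2": ("receivers", None),
        "3": ("CD players", None),
    }),
}


def find_key_sequence(menu: dict, target: str) -> Optional[list]:
    """Return the keypad digits that lead to `target`, or None if absent."""
    for digit, (label, submenu) in menu.items():
        if label == target:
            return [digit]
        if submenu is not None:
            rest = find_key_sequence(submenu, target)
            if rest is not None:
                return [digit] + rest
    return None
```

Under these assumptions, reaching "CD players" requires pressing "3" at the main menu and "3" again at the audio components menu, which is the incremental construction of the item description that the caller performs by hand.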
The conceptual simplicity of a caller's task--"I just want to find out the price of the Brand X Model A portable CD player"--and the routine and tedious nature of the interaction suggest that it is a good candidate for automation by a personal agent. However, the details of the interaction are unpredictable. An agent must determine whether it is engaging a person or an IVR system, when a question is asked, when it is put on hold or transferred, etc.
Furthermore, engaging in this type of interaction using an automated process appears to require the capability of speech recognition and language understanding in an unconstrained environment; that is, the speech from the information source would not necessarily be limited to a set of responses from an expected recognition grammar, such as "yes" or "no", or the days of the week, or the time of day. Prompt-constrained speech recognition processes are known to be employed successfully where the expected speech is limited to words uttered in response to a message, e.g., recognition of "Monday" or "Tuesday" spoken in response to a prompt asking for a day of the week. However, enabling an automated response to speech that is not constrained by an expected recognition grammar such as those listed above would require speech recognition capabilities that are beyond the current state of the art.
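The distinction between constrained and unconstrained recognition may be illustrated by the following minimal sketch. It assumes, purely for illustration, that an utterance has already been converted to a text hypothesis; the names GRAMMARS and recognize are hypothetical, and an actual recognizer would operate on audio rather than on text. The sketch accepts an utterance only when exactly one word belongs to the grammar expected for the current prompt, and otherwise reports no result.

```python
from typing import Optional

# Hypothetical recognition grammars, one per kind of prompt.
GRAMMARS = {
    "yes_no": {"yes", "no"},
    "day_of_week": {
        "monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday",
    },
}


def recognize(utterance: str, grammar_name: str) -> Optional[str]:
    """Return the single in-grammar word in `utterance`, else None."""
    allowed = GRAMMARS[grammar_name]
    # Normalize case and strip simple punctuation before splitting into words.
    cleaned = utterance.lower().replace(",", " ").replace(".", " ")
    matches = [word for word in cleaned.split() if word in allowed]
    # Accept only an unambiguous answer; anything else is out of grammar.
    return matches[0] if len(matches) == 1 else None
```

A reply such as "uh, Tuesday." to a prompt asking for a day of the week falls within the grammar, whereas a reply such as "Please hold while I check" does not, which is the kind of unconstrained speech an agent would encounter when calling an arbitrary information source.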
There appears to be some Internet-based personal agent technology having rudimentary capabilities. For example, there is a reference to "Clearlake Personal Agents" at World Wide Web site http://www.guideware.com which appears to be a design tool for designing a software agent to "perform, coordinate and track complex processes over time" over the Internet. Similarly, a reference to a software product called "PersonaL-Agent" is found at http://www.pls.com (under /products/agnt1.html) which appears to perform the task of periodically retrieving information from full-text databases such as news feeds or posted text. However, such agent technology is not audio- or voice-based, and is not implemented in a telephone network environment. Internet-resident agents share the disadvantages of e-mail, such as requiring the use of a relatively expensive personal computer to establish an electronic connection to a less than universally accessible network--in this case, to the Internet, which is less accessible than e-mail.
One telephone-related system called Wildfire appears to handle some rudimentary telephone chores, such as call screening, routing and announcement, voice dialing, call scheduling and reminding, voice mail integration, paging and call conferencing. While the Wildfire system appears to have the advantage of allowing its functionality to be accessible from any telephone or mobile phone, Wildfire does not offer automatic message building and delivery, the capability of analyzing and reporting messaging results back to the sender, or information retrieval capability.
What is desired is a way to utilize the advantages of the telephone system while providing a way to automate some of the routine communication tasks of scheduling, coordinating, gathering information and reporting so as to reduce the time engaged in "telephone tag" and other unproductive delays. Also desired is a way of engaging in automated information retrieval from sources reachable by telephone.