The present invention generally relates to the area of networks for providing telephony and data resources and more particularly to methods and mechanisms for providing access to networked resources via either voice or electronic data communications.
The overwhelming majority of access to computer resources today from remote locations has been via remote electronic data communications. There are many forms of such access including for example modems or digital subscriber lines. Remote users communicate with, and access the resources of, a local system via a personal computer or computer appliance, such as for example a palm-sized scaled-down version of a personal computer.
Applications typically support connected computers having graphical user interfaces. However, similar interface functionality is not supported for end-devices having voice user interfaces. As a result, a user's access to the functionality of a particular application or resource is dictated by the manner in which the user accesses the computer system upon which the application or resource resides.
Businesses typically have two systems accessed remotely on a regular basis by their users. A local area network handles data communications, and a private branch exchange (PBX) system handles voice communications. The local area network provides access by users to file and computer applications/servers thereby enabling a user to carry out computer applications on a computer from a remote location. The PBX system enables users to retrieve and respond to voice messages left for the users on the PBX voice mail system. The PBX also enables a remote user to call multiple persons served by the PBX with a single call.
Businesses also maintain two separate and distinct sets of physical communications lines to their places of business. A first set of lines provides communication links between a public switched telephone network (PSTN) and a private branch exchange (PBX) system including phones and other telephony equipment. A set of PSTN lines terminates at a business site at a PBX connected to the business's internal phone lines. A second set of lines provides links between external data networks and internal local area networks (LANs) for the businesses. Examples of such lines are T1, E1, ISDN, PRI, and BRI.
In recognition of the potential efficiencies arising from converging two physically and operationally distinct networks into a single network, the network technology industry has sought to define and implement a single, converged, network meeting the demands for all types of communications including voice, facsimile, data, etc. As a result, a new telephony/data transmission paradigm is emerging. The new paradigm is based upon a packet-based, switched, multi-media network. Data and voice, while treated differently at the endpoints by distinct applications, share a common transport mechanism.
Convergence presents the opportunity for the creation of applications including communication interfaces that not only support computer-generated commands, but also voice commands from a remote user. It also presents the opportunity to enhance the variety and flexibility of uses for PBX systems.
One aspect of computer systems accessed remotely via voice commands is the implementation of security measures. Voice interfaces present the opportunity for users to connect to a network from virtually any location. Presently, security mechanisms for restricted access systems accessed via telephone typically rely upon users to enter a number on a touch-tone phone to limit access. However, this method is highly susceptible to eavesdropping. Also, the users are often required to enter a long sequence of numbers that can easily be forgotten. A voice-controlled computer system will require speech recognition functionality. Speech recognition programs and associated "training" databases (used to train the software to recognize voice commands from a user) do not guarantee that another user's speech will not invoke protected operations on the computer system. Thus, if the computer system is to be secure, then additional speaker recognition/authentication procedures must be included in the system.
The use of speaker recognition/authentication processes to protect resources in a computer system is known. Such systems have weaknesses that enable imposters to gain access to the computer system. The simplest voice authentication scheme requires a user to speak a password, and the authentication system verifies the user by comparing the spoken password to an existing copy of the password. An obvious weakness of this authentication procedure is that the security system cannot distinguish whether the source of the vocalized password is the user or merely an electronically recorded copy of the user's voice.
One solution to the well-known "electronically-recorded" password problem is to request the user to utter the password multiple times. The multiple utterances, in addition to being compared to the digitally stored vocal password at the computer system site, are compared to one another to verify that they are sufficiently different from one another, and therefore that a recording of the password is not being replayed multiple times by an imposter seeking to gain remote access to protected computer resources. Of course, the imposter can circumvent this safeguard by making multiple recordings of the password spoken multiple times by an authorized user. Furthermore, copies of a single original spoken password can be altered and then stored to create variations from the original.
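The multiple-utterance check described above can be sketched as follows. This is only an illustrative sketch, not the mechanism of any particular prior system: it assumes each utterance has already been reduced to a numeric feature vector, and the names `distance`, `check_utterances`, `match_threshold` and `min_variation`, as well as the threshold values, are hypothetical.

```python
import math
from itertools import combinations


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def check_utterances(template, utterances, match_threshold=1.0, min_variation=0.05):
    """Apply the two comparisons described in the text.

    template        -- stored feature vector for the enrolled password (assumed)
    utterances      -- feature vectors for each repetition spoken at login
    match_threshold -- hypothetical maximum distance from the stored password
    min_variation   -- hypothetical minimum distance between two repetitions
    """
    # Each utterance must be close enough to the stored vocal password.
    if any(distance(template, u) > match_threshold for u in utterances):
        return "reject: no match"
    # Repetitions that are nearly identical suggest one recording replayed.
    if any(distance(a, b) < min_variation for a, b in combinations(utterances, 2)):
        return "reject: possible replay"
    return "accept"


template = [1.0, 2.0, 3.0]
live = [[1.1, 2.0, 3.0], [1.0, 2.2, 3.0]]      # natural variation
replayed = [[1.1, 2.0, 3.0], [1.1, 2.0, 3.0]]  # identical copies
print(check_utterances(template, live))
print(check_utterances(template, replayed))
```

As the text notes, a check of this kind is defeated by several distinct recordings, or by altered copies of one recording, because those also differ from one another.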
What is needed is a speaker authentication scheme wherein imposters cannot use a recording of the user's voice to render a valid password to gain access to protected computer resources. There exist a number of systems that attempt to overcome the shortcomings of voice-based authentication schemes. Such authentication mechanisms include smart cards, secure IDs, and retina scanners. However, these mechanisms require special hardware at the site from which a user calls.
In accordance with another aspect of a converged wide-area network interface to a computer system, there is an interest to exploit a system wherein telephony and digital data systems share programs and data. Voice-based computer access, described above, is one such effort to exploit converged technology. Once authenticated, a user may access computer resources via voice commands rather than issuing commands by means of a remote computer (e.g., a laptop computer). The user may access a number of applications integrated into the converged local network including databases, file servers, Interactive Voice Response (IVR) servers, call centers, voice mail, PBX hubs/endnodes, and conference bridges.
With regard to the last of the listed potential applications, it is noted that conference bridges are generally implemented today in two ways. One way is to purchase a conference bridge with a certain capacity. It is then used as a fixed resource like a physical conference room. If a conference bridge has 24 ports it can support one 24-user conference call. It could also support three eight-port conference calls.
Extending the size of a conference via external conference bridging is a challenge to coordinators of a conference. A second conference phone number has to be forwarded to each of the participants who is to be bridged into the conference via the external bridge. Then the external conference bridge calls in to the internal conference bridge. Alternatively, callers could call a number that is received by the PBX handling the conference which in turn forwards the call to an external conference bridge. However, each forwarded call uses two trunks in the PBX system.
Another option is to subscribe to a conference bureau. A bureau is a service that supplies an external conference bridge (and a number to call into the bridge). The bureau typically charges a customer based upon the number of users and the duration of the use of the bridge (e.g., per user-minute). External bridges allow for more dynamic meetings; however, the cost of utilizing external bridges on a regular basis is substantial.
The present invention seeks to exploit the convergence paradigm and/or the ability to communicate with a wide spectrum of end-terminals to give users access to the resources of both converged and non-converged networks via voice and/or electronically generated commands. For example, an electronic personal assistant (ePA) generalizes and abstracts communications channels, data and resources provided through a converged computer/telephony system interface such that the data and resources are readily accessed through a variety of interface formats including a voice interface or data interface. A set of applications provides dual interfaces for rendering services and data based upon the manner in which a user accesses the data. An electronic personal assistant in accordance with an embodiment of the invention provides voice/data access to web pages, email, file shares, etc.
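By way of illustration only, the dual-interface idea can be sketched as a single resource rendered differently depending on how the user reached it. The `EmailSummary` type, the `render` function and the channel names `"voice"` and `"data"` are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class EmailSummary:
    """Hypothetical resource exposed through both interfaces."""
    sender: str
    subject: str


def render(message, channel):
    """Render the same resource for a voice caller or a data client."""
    if channel == "voice":
        # Text intended to be handed to a speech synthesizer.
        return f"Message from {message.sender}. Subject: {message.subject}."
    if channel == "data":
        # Structured form intended for a computer or web client.
        return {"sender": message.sender, "subject": message.subject}
    raise ValueError(f"unsupported channel: {channel}")


msg = EmailSummary(sender="Pat", subject="Quarterly report")
print(render(msg, "voice"))
print(render(msg, "data"))
```

The point of the sketch is that the application logic and the underlying data are shared, and only the final rendering step depends on the access channel.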
The electronic personal assistant enables a user to transmit voice commands to a voice-based resource server to provide information accessible to the resource server. In accordance with an aspect of an embodiment of the invention, a user is authenticated by receiving vocal responses from the user to one or more requests variably selected and issued by a speaker recognition-based authentication facility, thereby ensuring that every time a user logs into the network there is a unique challenge-response exchange to gain access to the network resources. A spoken response is compared to one or more stored voice samples previously provided by the user during an enrollment procedure. If the spoken response is sufficiently close to the one or more stored voice samples, then the user is authenticated as a domain user or logged onto the local system. The voice-based authentication facility enables a user to log in to a computer without the aid of a keyboard, smart card or similar device, which makes it suitable for a kiosk environment. Thereafter, an application proxy is created. The application proxy acts on behalf of the authenticated authorized user.
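The enrollment and variable-challenge steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: voice samples are assumed to be reduced to numeric feature vectors, and the `VoiceAuthenticator` class, its method names and the `threshold` value are hypothetical. The sketch picks a challenge at random from the enrolled prompts and does not by itself guarantee that a challenge is never repeated; creation of the application proxy is not modeled.

```python
import math
import secrets


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VoiceAuthenticator:
    """Hypothetical speaker-recognition facility with variable challenges."""

    def __init__(self, threshold=0.5):
        self.enrolled = {}          # user -> {prompt: stored feature vector}
        self.threshold = threshold  # hypothetical "sufficiently close" bound

    def enroll(self, user, prompt, sample):
        """Store a voice sample provided during the enrollment procedure."""
        self.enrolled.setdefault(user, {})[prompt] = sample

    def issue_challenge(self, user):
        """Variably select one of the user's enrolled prompts."""
        return secrets.choice(sorted(self.enrolled[user]))

    def verify(self, user, prompt, response):
        """Accept only if the response is close to the stored sample."""
        stored = self.enrolled.get(user, {}).get(prompt)
        if stored is None:
            return False
        return distance(stored, response) <= self.threshold


auth = VoiceAuthenticator()
auth.enroll("alice", "blue seven", [0.1, 0.2])
auth.enroll("alice", "green four", [0.9, 0.8])
challenge = auth.issue_challenge("alice")
print("Please say:", challenge)
```

Because the prompt is chosen at login time, a recording of any single earlier response is only useful to an imposter if the same prompt happens to be selected again.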
In accordance with particular aspects of the specific embodiments of the invention, a set of remotely accessed voice applications is provided. One such application comprises a personal interactive multimedia response (IMR). Each user configures a personal IMR system. In a converged network environment, the user is provided access to the IMR through a personal computer interface, web interface, instant message, e-mail, as well as a voice user interface over a telephone connection.
A configurable distributed conference bridge is another potential application incorporated within the converged network architecture model of the present invention. The distributed conference bridge enables local conference resources to be utilized and incorporates external service bureau conference bridge resources when needed to supplement the internal conference bridge resources of a system. Creating the bridged conference may or may not require user intervention. The dynamically configurable extensible conference bridge application supports standard voice conference calls, multimedia conference calls, and blended conference calls. As a consequence, a customer need not provision in-house conference bridge resources, switch resources, or trunks for a worst-case scenario, and the conference bridge may be used on a more ad hoc basis since it can dynamically grow to meet the demands of the conference.
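The overflow behavior described above can be illustrated with a short sketch that fills local conference bridge ports first and places only the remainder on an external bureau bridge. The function name `allocate_ports` and its parameters are hypothetical, and the sketch deliberately ignores call set-up, signaling and trunk usage.

```python
def allocate_ports(participants, local_free_ports):
    """Split conference participants between local and external bridges.

    Local ports are used first; only the overflow is placed on the
    external service bureau bridge. Returns (local, external).
    """
    if participants < 0 or local_free_ports < 0:
        raise ValueError("counts must be non-negative")
    local = min(participants, local_free_ports)
    external = participants - local
    return local, external


# A 24-port in-house bridge asked to host a 30-party conference.
local, external = allocate_ports(30, 24)
print(local, external)  # 24 6
```

In this example only the six overflow participants would incur external bridge charges, rather than the in-house bridge being sized for thirty.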
In accordance with an aspect of a preferred embodiment of the conference bridge application, in addition to manual call set-up with regard to the overflow connections to the external bridge, the conference bridge application supports automatic redirection of head-end conference phone numbers utilizing remote call forward, QSIG, PINT, and/or in-band signaling.