1. The Field of the Invention
The present invention relates to accessing data and, more particularly, to securing audio-based access to application data.
2. Background and Relevant Art
Computer systems and related technology affect many aspects of society. Indeed, the computer system's ability to process information has transformed the way we live and work. Computer systems now commonly perform a host of tasks (e.g., word processing, scheduling, and database management) that prior to the advent of the computer system were performed manually. More recently, computer systems have been coupled to one another and to other electronic devices to form both wired and wireless computer networks over which the computer systems and other electronic devices can transfer electronic data. As a result, many tasks performed at a computer system (e.g., voice communication, accessing electronic mail, controlling home electronics, Web browsing, and printing documents) include the exchange of electronic messages between a number of computer systems and/or other electronic devices via wired and/or wireless computer networks.
Networks have in fact become so prolific that a simple network-enabled computing system may communicate with any one of millions of other computing systems spread throughout the globe over a conglomeration of networks often referred to as the “Internet”. Such computing systems may include desktop, laptop, or tablet personal computers; Personal Digital Assistants (PDAs); telephones; or any other computer or device capable of communicating over a digital network.
In particular, telephony applications provide audio-based access to application data and often do not require access to a computer system. For example, using only a standard telephone, a user can dial into a telephony application and access application data (e.g., bank account information or the status of an order). Interfacing with the application data is initiated using various audio-based commands. For example, a user can submit spoken words through the microphone and Dual Tone Multi-Frequency (“DTMF”) tones through the key pad.
Telephony applications can include decoder modules that decode DTMF tones into computer-usable digital data. For example, telephony applications can decode the sum of sine wave tones at 697 Hz and 1477 Hz into data representative of the key pad number 3. Telephony applications can also decode the sum of sine wave tones at other known frequencies into data representative of other corresponding key pad numbers and symbols (1, 2, 4-9, 0, *, and #). Based on the design of the telephony application, the representative data may be interpreted as an actual number or symbol or alternately may have some other meaning. For example, data representing the key pad number 6 can be used to indicate the letters M, N, or O or may be indicative of a specific command.
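As a rough illustration of the decoding described above, the following Python sketch maps a detected pair of tone frequencies to a key pad symbol. It assumes the standard DTMF frequency grid (row tones at 697, 770, 852, and 941 Hz; column tones at 1209, 1336, and 1477 Hz) and assumes that some upstream detector has already measured the two frequencies. The names `decode_dtmf` and `KEY_LETTERS` and the 20 Hz tolerance are hypothetical and are not taken from any particular telephony application.

```python
# Standard DTMF grid: each key is the sum of one row tone and one column tone.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")

# Letters conventionally printed on the numeric keys (illustrative mapping).
KEY_LETTERS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}


def _nearest_index(candidates, measured_hz, tolerance_hz):
    """Return the index of the candidate closest to measured_hz, or None
    if even the closest candidate is farther away than tolerance_hz."""
    best = min(range(len(candidates)),
               key=lambda i: abs(candidates[i] - measured_hz))
    if abs(candidates[best] - measured_hz) > tolerance_hz:
        return None
    return best


def decode_dtmf(low_hz, high_hz, tolerance_hz=20.0):
    """Map a (row tone, column tone) pair to a key pad symbol.

    tolerance_hz is an assumed value chosen for illustration only.
    Returns None when either tone does not match the grid.
    """
    row = _nearest_index(ROW_HZ, low_hz, tolerance_hz)
    col = _nearest_index(COL_HZ, high_hz, tolerance_hz)
    if row is None or col is None:
        return None
    return KEYPAD[row][col]
```

Under these assumptions, `decode_dtmf(697, 1477)` yields the symbol for key 3, and an application could look up `KEY_LETTERS["6"]` when it chooses to treat key 6 as a letter rather than a digit.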
Telephony applications can also include speech recognition modules that convert spoken words into computer-usable digital data and text-to-speech modules that convert computer-usable digital data into spoken words. At a telephone, a transducer (e.g., a microphone) converts spoken words into corresponding analog signals. The analog signals are transferred over, for example, a Public Switched Telephone Network (“PSTN”) to the telephony application.
Speech recognition modules receive the analog signals and convert the analog signals into corresponding computer-usable digital data. The speech recognition modules then compare the corresponding computer-usable digital data to stored digital data to identify or at least hypothesize on what was originally spoken into the microphone. The telephony applications can interpret the identified or hypothesized spoken words as a command. For example, identification of the word “checking” can be interpreted as a command to access a checking account.
In response to audio-based commands (DTMF tones and/or spoken words), a telephony application can return application data to a user. For example, in response to a query for a checking account balance, text-to-speech modules can convert stored digital data (the account balance) into a corresponding analog signal representing a checking account balance. The telephony application can send the analog signal to the telephone. The telephone receives the analog signal and a speaker converts the analog signal into spoken words, such as, for example, “your account balance is three-hundred twenty-four dollars and fifty-nine cents.” Thus, telephony applications generally make application data more accessible.
Similar to other types of application data access, telephony applications often require that a user authenticate before access to application data is provided. Unfortunately, since the input interfaces for telephony applications are limited to voice and the key pad, the type and length of authentication data that can be used is severely limited. Further, telephony applications are often utilized in public locations. Thus, it may be inappropriate to rely on voice input (e.g., spoken passwords) for authentication, since voice input could be overheard.
Accordingly, many telephony applications rely on numeric PINs entered using a telephone key pad as a primary method of authentication. However, since the input space is limited to 0-9, the complexity of passwords based on the input space is also correspondingly limited. Further, many users desire a PIN that is easy to remember and thus may not be willing to compensate for the limited input space by using longer passwords. For example, a typical user PIN consists of four digits and thus provides only 10,000 (10^4) possible different combinations. Accordingly, telephony applications are frequently subject to brute-force password attacks. For example, a malicious user may dial into a telephony application and enter possible combinations (either randomly or serially) from 0000 to 9999 to attempt to authenticate, until access is granted. In the event of a failure (e.g., too many incorrect PINs), the malicious user simply hangs up and dials in again.
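The arithmetic behind the limited input space can be sketched in a few lines of Python. The function names below are hypothetical, and the expected-guess figure assumes that the PIN is chosen uniformly at random and that combinations are tried serially starting at 0000.

```python
def pin_keyspace(num_digits, alphabet_size=10):
    """Number of distinct PINs of the given length over the digits 0-9."""
    return alphabet_size ** num_digits


def expected_serial_guesses(num_digits):
    """Average number of guesses needed when trying every combination in
    order, assuming the target PIN is uniformly random (an assumption)."""
    n = pin_keyspace(num_digits)
    # The k-th PIN in order takes k guesses; the mean of 1..n is (n + 1) / 2.
    return (n + 1) / 2
```

For a four-digit PIN, `pin_keyspace(4)` evaluates to 10000 and `expected_serial_guesses(4)` to 5000.5, so on average about half of the combinations need to be tried.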
Further, while telephony applications are designed to provide application data access to telephone users, general-purpose computer systems can be configured to simulate telephone functionality. For example, a malicious user can configure a computer system with a modem to automatically and repeatedly dial into a telephony application and enter every possible combination of numbers for a specified input space, until access is granted. These automated brute-force attacks can make even longer passwords based on the 0-9 input space vulnerable.
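To give a sense of why automation undermines even longer numeric passwords, the following sketch estimates how long it would take to exhaust an input space. The per-attempt time is purely an assumed parameter supplied by the caller, and `worst_case_hours` is a hypothetical helper, not a measurement of any real telephony application.

```python
def worst_case_hours(num_digits, seconds_per_attempt):
    """Hours needed to try every numeric PIN of the given length.

    seconds_per_attempt is an assumed average that would include dialing,
    prompts, and redialing after a forced hang-up.
    """
    attempts = 10 ** num_digits
    return attempts * seconds_per_attempt / 3600.0
```

With an assumed 10 seconds per attempt, a four-digit space is exhausted in about 28 hours and a six-digit space in roughly 116 days of unattended dialing.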
When application data is accessible from computer systems or telephones having limited physical access, such as, for example, in office environments, it may be appropriate to disable an account after a specified number of failed authentication attempts (e.g., three). However, when application data is accessible from public computer systems or telephones, disabling accounts may be inappropriate. For example, a malicious user can use a publicly accessible telephone or computer system to repeatedly enter an incorrect PIN on purpose to disable a legitimate user's account (a type of “denial of service” attack). Thus, the legitimate user is then prevented from accessing the application data and may be required to obtain a new PIN (which is often delivered using ground-based delivery mechanisms) to gain access. Therefore, systems, methods, and computer program products that facilitate securing audio-based access to application data would be advantageous.
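The account-disabling policy and its denial-of-service weakness can be illustrated with a minimal Python sketch. The class name `LockoutAccount`, the example PIN, and the default threshold of three failures (taken from the example above) are illustrative assumptions and do not describe any particular system.

```python
class LockoutAccount:
    """Toy account that is disabled after too many consecutive failures."""

    def __init__(self, pin, max_failures=3):
        self._pin = pin
        self._max_failures = max_failures
        self._failures = 0
        self.disabled = False

    def authenticate(self, entered_pin):
        """Return True only if the account is enabled and the PIN matches."""
        if self.disabled:
            return False
        if entered_pin == self._pin:
            self._failures = 0  # a success resets the failure counter
            return True
        self._failures += 1
        if self._failures >= self._max_failures:
            self.disabled = True
        return False


# A malicious caller deliberately enters a wrong PIN three times ...
account = LockoutAccount(pin="4821")
for _ in range(3):
    account.authenticate("0000")

# ... after which even the correct PIN is rejected.
legitimate_ok = account.authenticate("4821")
```

In this sketch the attacker never learns the PIN, yet `legitimate_ok` ends up `False` because the account has been disabled, which is the denial-of-service outcome described above.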