Uninhabited air vehicles (UAVs) are specifically designed to operate without an onboard operator. UAVs come in multiple vehicle types and various sizes and are intended for diverse roles. The adoption of UAVs for a variety of current and future missions depends upon increasing their safety and performance.
UAVs now operate in complex scenarios using sophisticated technology. These complexities are expected to increase as their roles become more diverse. UAV safety requirements include FAA standards, collision avoidance, and situational awareness. Communication is of critical importance for each of these requirements and for increasing UAV performance. UAVs must communicate with other vehicles, with remote human-operated control sites, and with ground sites. Safe interaction with these parties is a critical requirement for wide deployment of UAVs.
Air Traffic Control (ATC) is currently an analog, wireless, voice-based (and sometimes text-based) communication process that UAVs must successfully navigate. This requires that any autonomous air vehicle appear to ATC as a human-piloted vehicle. The UAV must listen and respond in natural human language. Because UAVs have been unsuccessful at these tasks, they have been deployed only in areas far from commercial air traffic for safety reasons.
Voice processing systems will allow UAVs to fulfill missions in a safe and efficient manner. They have become popular for simple, non-critical interactions. Most commonly, these systems have been used on telephone networks to acquire and dispense information for callers. In such a system, a caller is able to designate a party to be called, which activates the automatic retrieval of a pre-registered telephone number.
Voice processing systems have recently been implemented in more diverse and sophisticated areas, including automobile navigation systems. These particular systems are able to interpret human vocal input directed towards the efficient navigation of a motor vehicle. A navigation-limited vocabulary is utilized to respond to navigation-limited input. The navigation of the vehicle is dynamically linked to a global positioning system to coordinate the location of the vehicle with a stored map.
A voice processing system samples and quantizes sound waves to generate digital data representing the amplitude and frequency of the waves. The sound waves are input through a microphone. The amplitude of the analog voltage signal is measured by an analog-to-digital converter and converted to representative binary strings, which are stored in a memory. A control system relates incoming voice data to stored voice data. A digital-to-analog converter transforms the binary strings back to an analog signal that may be output through a speaker.
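The sampling and quantization stage described above can be sketched as follows. This is a minimal illustration, not an implementation from the text: the sample rate, bit depth, and test tone are illustrative assumptions.

```python
import math

SAMPLE_RATE = 8000      # samples per second (assumed telephone-quality rate)
BIT_DEPTH = 8           # bits per stored sample (assumed)
LEVELS = 2 ** BIT_DEPTH

def quantize(signal, num_levels=LEVELS):
    """Map each analog amplitude in [-1.0, 1.0] to one of num_levels levels."""
    quantized = []
    for amplitude in signal:
        # Scale the amplitude to [0, num_levels - 1] and round to the nearest level.
        level = round((amplitude + 1.0) / 2.0 * (num_levels - 1))
        quantized.append(level)
    return quantized

# A 440 Hz tone sampled at discrete instants stands in for microphone input.
analog = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(32)]
digital = quantize(analog)

# Each stored sample is a short binary string, as described in the text.
binary_strings = [format(level, "08b") for level in digital]
print(binary_strings[:4])
```

The digital-to-analog direction simply reverses the scaling, mapping each stored level back to an amplitude for output through a speaker.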
These systems are typically composed of two units: voice recognition and voice synthesis. The primary difficulty with current voice recognition units is the need for large-capacity databases and sophisticated algorithms for discriminating and parsing the incoming vocal data. These systems rely on analog signals that are input through a microphone and transformed to digital signals by an analog-to-digital converter. The system then analyzes the digital signal, recognizes the data, and in response automatically retrieves information that has been stored in the system memory.
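The recognition-and-lookup step can be illustrated with a toy template matcher. This is a hedged sketch only: the stored templates, the command words, and the absolute-difference distance metric are all illustrative assumptions, far simpler than the sophisticated algorithms the text refers to.

```python
# Hypothetical stored templates: digitized feature vectors for known words.
STORED_TEMPLATES = {
    "climb": [10, 52, 90, 52, 10],
    "descend": [90, 52, 10, 52, 90],
}

def distance(a, b):
    """Sum of absolute sample-by-sample differences (a crude similarity measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def recognize(samples):
    """Return the stored word whose template is closest to the incoming data."""
    return min(STORED_TEMPLATES, key=lambda w: distance(samples, STORED_TEMPLATES[w]))

print(recognize([12, 50, 88, 55, 9]))  # closest to "climb"
```

Real recognizers replace the raw-sample comparison with spectral features and probabilistic models, but the structure — compare incoming digital data against a stored database and retrieve the best match — is the same.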
Large databases of sounds, words and word combinations are required to anticipate the many possible inputs. This is especially true when voice recognition is employed with human interaction. Sophisticated algorithms are required to discern the intended input from noise, interference or unintended utterances.
Speech synthesis, or voice generation, also requires large databases and sophisticated algorithms. Voice data is stored in a database and retrieved when appropriate. A digital-to-analog converter is utilized to transform the digital data from the memory to an analog signal resembling a human voice, made audible through a speaker. Wavetable synthesizers generate sound by processing sound waveforms stored in a wavetable memory. Again, anticipating accurate communication with humans requires storing a large number of waveforms.
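The wavetable technique can be sketched in a few lines: a single stored waveform cycle is read repeatedly, and the read step size controls the output pitch. The table size, contents, and step value below are illustrative assumptions; a real synthesizer would store many such tables for different voice sounds.

```python
import math

# One cycle of a stored waveform (a single "wavetable" entry), here a sine.
TABLE_SIZE = 64
wavetable = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def synthesize(table, num_samples, step=1.0):
    """Read through the stored table repeatedly; `step` controls output pitch."""
    out, phase = [], 0.0
    for _ in range(num_samples):
        out.append(table[int(phase) % len(table)])  # wrap around the table
        phase += step
    return out

# Stepping by 2.0 reads the table twice as fast, raising the pitch one octave.
samples = synthesize(wavetable, 256, step=2.0)
```

The resulting sample stream would then pass through the digital-to-analog converter described above to drive a speaker.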
To increase the accuracy of voice processing systems, larger databases, faster processing, and honed extraction algorithms may be used. Each of these solutions is limited. Thus, one must find a balance between these elements to achieve satisfactory results for particular situations.
Voice processing systems will be more widely used if they perform better. More complex tasks may be automated with improved voice processing. UAVs may be more widely deployed and utilized for more complex missions with accurate and efficient voice processing.
Text processing is also needed for successful UAV deployment. Commands may arrive in the form of digital text that must be parsed and its meaning deciphered. The UAV should also be able to output appropriate responses or initiate a dialog via text output.
Much of text processing is similar to voice processing. Incoming words and phrases are compared to stored words and phrases in a database. Generally, less signal processing is required, but interpretation remains a difficult task.
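The comparison of incoming text against stored phrases can be sketched as a normalized dictionary lookup. The command phrases and responses below are hypothetical, and the normalization (lowercasing, collapsing whitespace) is an assumed minimal cleanup step.

```python
# Hypothetical table of stored command phrases and their responses.
COMMAND_RESPONSES = {
    "hold position": "holding position",
    "climb to altitude": "climbing",
}

def interpret(text):
    """Normalize incoming text and look it up against the stored phrases."""
    normalized = " ".join(text.lower().split())  # lowercase, collapse whitespace
    return COMMAND_RESPONSES.get(normalized, "say again")

print(interpret("  Hold   Position "))  # → "holding position"
```

Exact lookup like this is where the hard interpretation problem begins rather than ends: any paraphrase of a stored command falls through to the fallback, which is what motivates the natural language processing discussed next.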
Useful voice and text processing for a UAV also benefits from natural language processing (NLP) systems. Natural language processing comprises automated methods of discerning the meaning of arbitrary language expressions such as phrases and sentences. It depends on algorithms that determine the definition of a word within the context of the phrase or sentence in which it appears. By determining definitions within context, phrase and sentence meanings are determined.
Several methods of implementing dialog management exist for NLP. One method specifies context by encoding the dialogs predicted to take place in a grammar that leaves certain variables undefined. These variables are known as slots. The slots are filled in when the surrounding context is matched during execution of the dialog. In the present state of the art, dialogs defined by a grammar are specified long before they are intended to be used and do not change throughout the lifetime of the product. Although tools exist to help automate and test grammar construction, human intervention is often required to fine-tune a grammar for use by humans.
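Slot filling can be sketched by treating the grammar as a pattern with named holes. The grammar string, slot names, and the ATC-style utterance below are hypothetical; a production grammar formalism would be far richer than a single regular expression.

```python
import re

# Hypothetical grammar: angle-bracketed variables are the undefined "slots".
GRAMMAR = "cleared to <altitude> feet heading <heading>"

def fill_slots(grammar, utterance):
    """Match the utterance against the grammar; fill slots from the context."""
    # Turn each <slot> into a named capturing group, e.g. (?P<altitude>\S+).
    pattern = re.sub(r"<(\w+)>", r"(?P<\1>\\S+)", grammar)
    match = re.fullmatch(pattern, utterance)
    return match.groupdict() if match else None

slots = fill_slots(GRAMMAR, "cleared to 5000 feet heading 270")
# slots == {"altitude": "5000", "heading": "270"}
```

The fixed grammar here mirrors the limitation the text notes: the dialog structure is frozen at specification time, and only the slot values vary at runtime.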
Another NLP method utilizes dialog state information to determine dialog meaning. States are predetermined, and the actions or responses are predefined for each transition. A state table is developed containing the present state of the dialog and the responses appropriate for each state. The dialog manager refers to this table to generate proper responses.
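The state-table approach can be sketched as a lookup keyed on the present state and the incoming utterance, yielding both a predefined response and the next state. The states, utterances, and responses below are hypothetical examples, not drawn from any actual ATC phraseology standard.

```python
# Hypothetical state table: (present_state, input) -> (response, next_state).
STATE_TABLE = {
    ("idle", "request clearance"): ("standby", "awaiting"),
    ("awaiting", "cleared for takeoff"): ("cleared for takeoff, roger", "rolling"),
}

def respond(state, utterance):
    """Look up the predefined response for the present dialog state."""
    # Unrecognized input leaves the dialog in its current state.
    response, next_state = STATE_TABLE.get((state, utterance), ("say again", state))
    return response, next_state

state = "idle"
reply, state = respond(state, "request clearance")     # reply == "standby"
reply, state = respond(state, "cleared for takeoff")   # state advances to "rolling"
```

Because every transition is enumerated in advance, this method shares the rigidity of the grammar-based approach: the table, like the grammar, is fixed before deployment.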
Natural language processing abilities will enable a UAV to function more seamlessly within its active environment. This includes more accurate communication between the UAV and ATC.