This invention relates to analysis of the human voice as an aid in detecting, diagnosing and treating psychiatric disorders and particularly in detecting suicidal predisposition.
The prior art references known to applicant and believed most relevant to the patentability of this invention are U.S. Pat. Nos. 3,278,685; 3,855,416; 3,855,417; 3,855,418; 3,971,034; 4,093,821; 4,139,732 and 4,142,067 and the following publications: “Teaching the Perception of Expressive Aspects of Vocal Communication”, appearing at pages 107 through 115 of the August 1967 issue of the American Journal of Psychiatry, and “Infra-content Channels of Vocal Communication” appearing as Chapter 29 of Disorders of Communication, published in 1964 by the Association for Research in Nervous and Mental Disease. Two additional publications known to applicant, which applicant does not concede to be prior art with respect to this invention, are “Speech and Disturbances Affect” appearing as Chapter 17 in Speech Evaluation In Psychiatry, published in 1981 by Grune and Stratton, Inc. and page 8 of a recent publication entitled Medical Bulletin.
Of the patent literature, the '416, '418 and '034 patents are believed the most relevant to patentability of this invention.
'416 and '418 are directed towards lie detection by detecting emotional stress in speech by analyzing characteristics of the speech waveform. These patents are believed limited to analysis of waveforms produced upon utterance of the words “yes” and “no”; the analysis includes detecting aperiodic amplitude modulation within a preselected frequency envelope and thereafter weighting the detected amplitude modulations with a detected peak amplitude. The weighted function is displayed and compared to preselected criteria, after which the yes/no response that produced the analyzed signal is flagged as indicative of emotional stress and, therefore, possibly indicative of an untruthful answer by the subject under interrogation.
'418 teaches isolation and counting of the aperiodic amplitude modulations within the envelope and then displaying the count of the aperiodic modulations for each utterance rendered. From this an observer determines the level of emotional stress associated with a yes/no response and, therefore, whether the yes/no response was presumably truthful.
'034 is concerned with stress detection and records an utterance on a visible medium in order to identify frequency components indicative of stress. An infrasonic signal in the 8-12 Hz frequency range, which is below the audible range, is analyzed. Frequency shifts in this infrasonic signal are considered to be stress indicators. Stress is allegedly detected independently of the linguistic content of the utterance.
The other patents are believed to be less relevant. '417 teaches filtering the human voice to provide a single frequency region signal, preferably in the region of the fundamental pitch of either the male or the female voice. A second frequency region of the speech signal, preferably a higher frequency region, is also filtered and rectified. Peak energy values from the envelopes of the two frequency regions are stored and compared in order to determine the stress state of the patient.
'685 detects slope reversals and zero crossings of amplitude-time curves produced from utterances. '685 notes that such slope reversals and zero crossings may be used to analyze presence or absence of stress or to detect or distinguish among different words.
'821 relates to speech analysis in which pitch or frequency changes are analyzed to determine the emotional state of the speaker. A first formant frequency band, extending from the fundamental frequency to about 1,000 Hz, is analyzed to find knolls or flat spots in an FM demodulated signal of the speaker. Small differences in frequency between short adjacent knolls are taught to be indicative of depression or stress whereas large differences in frequency between adjacent knolls are indicative of looseness or relaxation.
'732 utilizes a signal from a laryngograph which is partially clipped and rectified to produce a signal which can be smoothed with a very small time constant to give a good indication of a voice. The laryngograph produces larynx closure signals without interfering with the speech of the speaker; these are used to help deaf people learn to speak.
'067 is a continuation-in-part of '821 and teaches that a small amount of frequency modulation in a speaker's voice is indicative of mild stress while a normal level of frequency modulation indicates no stress. Appropriate lights (green, yellow and red) are turned on in response to the sensed state of stress as indicated by frequency modulation, or lack thereof, in the speaker's voice.
In the non-patent literature conceded to be prior art, “Teaching the Perception of Expressive Aspects of Vocal Communication” discloses that the human voice can be described in terms of its temporal intensity and frequency characteristics, both of which convey information concerning the speaker. The article suggests converting verbal signals to visual analogs for analysis. The speech signal is filtered, and pressured speech, depression and mania are alleged to be indicated by the visible filtered representation of the voice.
“Infra-content Channels of Vocal Communication” teaches that speech intensity is a function of emotional state where the emotional state is defined by whether the speaker is giving a truthful or untruthful response to a stimulus.
These references, whether taken individually or in combination, do not suggest detecting suicidal predisposition in accordance with this invention.
A principal object of this invention is to provide a method for detecting suicidal predisposition by analyzing the voice.
Another object of this invention is to provide a method for detecting suicidal predisposition independently of linguistic content by analyzing the voice.
Yet another object of this invention is to provide two different methods, which may be practiced independently or together, for detecting suicidal predisposition by analyzing a speech signal, where the analysis is independent of the linguistic content of the speech.
This invention provides a method for detecting human suicidal predisposition using a vocal utterance, which is independent of linguistic content of the utterance.
In one embodiment the invention may begin with converting the utterance into an electrical signal having time varying amplitude and frequency representative of the utterance.
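For a digital implementation, the conversion step might be sketched as decoding a recorded utterance into a sequence of amplitude samples. The sketch below is only an illustration under stated assumptions: the recording is assumed to be a 16-bit PCM WAV file, and the function name read_wav_mono is hypothetical and does not come from this specification.

```python
import struct
import wave


def read_wav_mono(path):
    """Return (samples, sample_rate) from a 16-bit PCM WAV file.

    Samples are floats in [-1.0, 1.0). Multi-channel audio is averaged
    to a single channel. The 16-bit PCM format is an assumption of this
    sketch, not a requirement stated in the specification.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # Interpret the byte string as little-endian signed 16-bit integers.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)

    samples = []
    for i in range(0, len(values) - channels + 1, channels):
        frame = values[i:i + channels]
        # Average the channels and scale to the range [-1.0, 1.0).
        samples.append(sum(frame) / (channels * 32768.0))
    return samples, sample_rate
```

The returned list holds the time varying amplitude of the utterance, and the sample rate fixes the time axis against which that amplitude is later examined.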
Since a reasonably pure voice signal from the person of interest is required for analysis, filtering may be necessary and/or desirable. Components of the signal may be filtered above and below preselected frequencies to obtain a signal within preselected frequency boundaries. Non-repetitive components having amplitude above some average amplitude of the signal may be filtered out of the signal. Repetitive signal components having frequency outside frequency bandwidth of the signal may be filtered out of the signal.
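The band-limiting step described above might be sketched digitally as a high-pass stage followed by a low-pass stage. This is a minimal illustration, not the filtering apparatus of the invention: the first-order filter design and the function names are assumptions, and the specification does not state particular cutoff frequencies, so any values passed in are the user's choice.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate):
    """First-order high-pass: attenuates components below cutoff_hz."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out


def low_pass(samples, cutoff_hz, sample_rate):
    """First-order low-pass: attenuates components above cutoff_hz."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    prev_y = 0.0
    for x in samples:
        prev_y = prev_y + alpha * (x - prev_y)
        out.append(prev_y)
    return out


def band_limit(samples, low_hz, high_hz, sample_rate):
    """Keep the signal between low_hz and high_hz (hypothetical bounds)."""
    return low_pass(high_pass(samples, low_hz, sample_rate),
                    high_hz, sample_rate)
```

First-order stages roll off gently, so components just outside the band are reduced rather than removed; a steeper filter could be substituted where sharper frequency boundaries are wanted.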
Once a reasonably pure voice signal from the person of interest has been obtained, the person is then identified as suicidally predisposed if signal amplitude exhibits a substantially non-instantaneous decay to zero upon conclusion of the utterance. Alternatively, or complementally, the person is then identified as suicidally predisposed if signal amplitude modulation is low during the utterance.
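The two signal features named above, the decay of amplitude at the end of the utterance and the degree of amplitude modulation during it, might be quantified as in the following sketch. It is offered only as an illustration under stated assumptions: the specification gives no numerical criteria, so the 20 ms window, the 50% and 5% envelope levels, the use of the coefficient of variation as a modulation measure, and all function names are hypothetical. The sketch measures the features and does not itself classify a speaker.

```python
import math


def envelope(samples, sample_rate, window_s=0.02):
    """RMS amplitude envelope, one value per non-overlapping window."""
    n = max(1, int(window_s * sample_rate))
    env = []
    for start in range(0, len(samples) - n + 1, n):
        frame = samples[start:start + n]
        env.append(math.sqrt(sum(x * x for x in frame) / n))
    return env


def decay_time(env, window_s=0.02, high=0.5, low=0.05):
    """Seconds taken for the envelope to fall from high*peak to low*peak
    at the end of the utterance. Returns None if the envelope is silent
    or never falls to low*peak. The levels are assumed values."""
    peak = max(env, default=0.0)
    if peak <= 0.0:
        return None
    # Last window in which the voice is still at or above high*peak.
    last_high = max(i for i, v in enumerate(env) if v >= high * peak)
    # First later window in which it has fallen to low*peak or below.
    for j in range(last_high + 1, len(env)):
        if env[j] <= low * peak:
            return (j - last_high) * window_s
    return None


def modulation_depth(env, floor=0.1):
    """Coefficient of variation of the envelope over voiced windows
    (those at or above floor*peak). Lower values mean less modulation."""
    peak = max(env, default=0.0)
    if peak <= 0.0:
        return 0.0
    voiced = [v for v in env if v >= floor * peak]
    mean = sum(voiced) / len(voiced)
    var = sum((v - mean) ** 2 for v in voiced) / len(voiced)
    return math.sqrt(var) / mean
```

With these assumed measures, an abrupt ending yields a decay time of about one window, a gradual fade yields a longer decay time, and a steady tone yields a modulation depth near zero. Any threshold separating "low" from normal modulation, or instantaneous from non-instantaneous decay, would have to be established separately.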
The invention may further include eliciting a vocal utterance and recording the utterance or otherwise converting the utterance into a digital or analog electrical signal. Signal analysis can be done in either analog or digital format. Identification of the human as suicidally predisposed, if signal amplitude exhibits substantially non-instantaneous decay to zero upon conclusion of the utterance or if signal amplitude modulation is low, may be performed by preparing a display of the amplitude varying signal and visually examining the display for substantially vertical drop of the amplitude varying signal to zero upon conclusion of the utterance or for amplitude frequency modulation being low or for both.
Upon identifying the individual as being suicidally predisposed on the basis of the utterance, the individual may be restrained and/or medicated, depending on the judgment of the attending physician or other medical personnel.