This invention generally relates to a teaching device to assist music students in recognizing and producing accurate pitch, timbre (tonal quality), and timing (meter) on their musical instruments, and more particularly to an electronic apparatus that quantifies the musical performance of a student and provides visual feedback comparing it to a musical reference.
A student of music, for purposes of description, is anyone who is trying to play a musical instrument. The invention disclosed herewith discusses musical instruments that produce a tone of detectable pitch. This includes the human voice, violin, and flute and excludes most percussive instruments (e.g., snare drum and tambourine). A tone has aural parameters that include pitch, amplitude, duration, and timbre. When used in the context of "audible tone reference", tone can include any combination of pitched and unpitched sound sources (e.g., a band with a percussion section).
A basic ability required of a student of music is to produce and sustain a musical tone of defined pitch and good timbre. This task is easy on an instrument like a piano, which mechanically quantizes pitch and constrains timbre. Singers, however, must dynamically adjust their vocal muscles to control pitch and timbre based on their aural perceptions. Similarly, violinists must adjust their bowing and fingering based on their aural perceptions.
The importance of these aural perceptions is demonstrated in the difficulty deaf children have learning to speak. If the internal discernment of pitch and timbre is not developed in an individual, some external feedback is necessary. In their paper titled "Computer-Aided Speech Training for the Deaf" (Journal of Speech and Hearing Disorders, February 1976, Vol. 41, No. 1), R. S. Nickerson, D. N. Kalikow, and K. N. Stevens report on a computer-based system that uses visual displays of speech parameters (e.g., pitch, amplitude, and spectrum) to aid speech training for the deaf.
In music instruction, a student's aural perceptions are typically developed through collaboration with a music teacher who points out, by verbal comment and audible example, the pitch, timbral, and timing errors of the student. Teaching musical skills is complicated by the fact that sound, unlike a painting, cannot be seen directly and exists only while it is being played. Audio tape recorders allow students to review their performances, but do not provide any analysis.
A system of entertainment that offers learn-by-example instruction is the karaoke system popularized in Japan. A karaoke system (literally Japanese for "hollow orchestra") consists of a pre-recorded audio source, a microphone, audio mixer, amplifier, and speaker. The audio source material, typically a compact or laser disk (LaserKaraoke®, Pioneer LDCA, Inc., 2265 East 220th Street, Long Beach, Calif. 90810), is specially prepared with musical accompaniment on one channel and a solo vocal reference on the other. The musical accompaniment can be any combination of musical instruments that provides tonal support for the singer. The accompaniment is usually a band or orchestra but could simply be a piano, other vocalists, or a guitar. The reference channel is typically the solo voice of a trained singer, or a solo instrument like a clarinet or monophonic synthesizer. The karaoke system allows the singer to independently adjust the volume of his voice, the accompaniment, and the reference solo voice. Typically, students practice singing with the reference solo voice and accompaniment. After they have learned the words and are comfortable singing the melody, they turn off the reference solo voice and sing, unassisted, with the accompaniment. More elaborate karaoke systems use a laser disk or CD&G compact disk (a format that encodes graphic images with audio) to display song lyrics on a video monitor, with each word changing color as it is sung (analogous to "the bouncing ball" technique). Karaoke systems do not evaluate the singer's performance, and hence students must rely on their own musical perceptions for guidance.
Electronic devices exist which visually indicate the instantaneous absolute pitch and error of a tone source (e.g., Sabine ST-1000 Chromatic Auto Tuner, Korg DT-2 Digital Tuner, Arion HU 8400 Chromatic Tuner). Mercer U.S. Pat. No. 4,273,023 discloses a device that displays the instantaneous absolute pitch of a musical instrument with an array of LEDs arranged on a musical staff but can only display the pitch of one tone source at a time. Tumblin U.S. Pat. No. 4,321,853 discloses a system that measures the instantaneous pitch of a musical instrument relative to an electronically generated reference tone and displays the difference (the pitch error) on a column of lights. Neither of these systems provides a time history of pitch, nor does either provide any quantitative indication of timbre or amplitude.
The system of Nickerson et al. displays a time history of pitch, duration, and timbre but is not well suited for musical instruction. The system uses a miniature accelerometer applied to the throat with adhesive tape to measure the pitch of the student's voice. Since the students are deaf, no consideration for aural reference is made. The data collected are presented in the context of speech, not music, and no provisions are made for pitch tracking of musical instruments.
Producing an accurate static tone is a good start for a music student; however, music is the dynamic organization of sound over time. An accomplished musician needs additional skills to produce a sequence of tones (playing a melody), match a sequence of tones (playing a melody in key), produce a tone relative to a reference tone (playing an interval), produce a sequence of tones relative to a sequence of reference tones (playing in harmony), produce tones in a broad range of pitches (range), quickly vary the pitch and amplitude (vibrato and tremolo), produce tones at specific times and durations (playing in meter), and produce tones of good timbre (tone quality).
Neither Mercer nor Tumblin has the display necessary to show a time history. Mercer has two pitch trackers but lacks any memory means to store the pitch data. Tumblin has music exercise data stored but has only one pitch tracker and does not store the pitch data. Tumblin uses music exercise data that must be specifically prepared for his invention. This requires the production, marketing, and distribution of music exercise data.
Pitch tracking is the dynamic determination of the fundamental frequency of an applied audio signal. Much work has been done developing the art of pitch tracking for speech recognition. Niedzwiecki and Mikiel (1976) (Hess, Wolfgang "Pitch Determination of Speech Signals" Volume 3 of Springer Series in Information Sciences. Springer-Verlag, New York, page 175) report on a pitch tracker using a tunable low-pass filter whose cutoff is dynamically adjusted by the amplitude of the output signal. If a signal is present at the output, the cutoff frequency is lowered until the amplitude of the output goes down. Ideally the adaptive operation of this system would dynamically maintain the cutoff frequency of the filter slightly above the fundamental frequency of the applied audio signal. In addition to the reported problem of tracking performance being dependent on input signal level, it has been found through experiment that the output signal may produce noisy tracking results due to its small signal-to-noise ratio.
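The adaptive-cutoff idea described above can be illustrated with a minimal sketch. It is not the circuit reported by Niedzwiecki and Mikiel; the four cascaded one-pole stages, the peak envelope follower, the threshold, the step size, and the names `track_cutoff` and `sine` are all assumptions chosen only to make the feedback loop concrete.

```python
import math

def track_cutoff(samples, sample_rate, threshold=0.5, step=0.001,
                 f_min=50.0, f_max=2000.0, stages=4, env_decay=0.995):
    """Adapt a low-pass cutoff so the filtered output hovers near `threshold`.

    Returns the mean cutoff (Hz) over the second half of the input.
    All parameter values are illustrative assumptions.
    """
    cutoff = f_max
    state = [0.0] * stages          # one state variable per one-pole stage
    envelope = 0.0
    history = []
    for x in samples:
        # One-pole low-pass coefficient for the current cutoff frequency.
        alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff / sample_rate)
        y = x
        for s in range(stages):
            state[s] += alpha * (y - state[s])
            y = state[s]
        # Peak envelope follower on the filter output.
        envelope = max(abs(y), envelope * env_decay)
        # Signal present at the output: lower the cutoff; otherwise raise it.
        if envelope > threshold:
            cutoff *= 1.0 - step
        else:
            cutoff *= 1.0 + step
        cutoff = min(f_max, max(f_min, cutoff))
        history.append(cutoff)
    tail = history[len(history) // 2:]
    return sum(tail) / len(tail)

def sine(freq_hz, amplitude, sample_rate=8000, seconds=1.0):
    """Generate a test tone (hypothetical helper for this sketch)."""
    n = int(sample_rate * seconds)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

cutoff_a3 = track_cutoff(sine(220.0, 1.0), 8000)
cutoff_a4 = track_cutoff(sine(440.0, 1.0), 8000)
cutoff_a3_quiet = track_cutoff(sine(220.0, 0.7), 8000)
print(round(cutoff_a3), round(cutoff_a4), round(cutoff_a3_quiet))
```

In this sketch the settled cutoff rises with the pitch of the input, but it also rises when the same 220 Hz tone is applied at a lower amplitude, which illustrates the reported dependence of tracking performance on input signal level.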
The systems of Mercer and Tumblin rely on pitch trackers that require one and only one peak per pitch cycle and an amplitude envelope that does not fluctuate rapidly. For example, when upper harmonics (overtones) of a resonant low-pitched male voice are reinforced as they fall within the frequency range of formants (the natural resonance frequencies of the vocal tract), multiple peaks can occur.
A musically trained listener can detect pitch errors as small as 0.3%, a deviation of about one cycle per second for an A4 (440 Hz). The accuracy and stability of a pitch tracker are therefore very important in a music training system.
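The size of such an error can be worked out directly. The short sketch below converts a relative pitch error into a deviation in hertz and in cents (hundredths of an equal-tempered semitone); the function name `pitch_deviation` is hypothetical, and the cents figure is an added illustration rather than a value given in the text.

```python
import math

def pitch_deviation(reference_hz, percent_error):
    """Return (deviation in Hz, deviation in cents) for a relative pitch error."""
    ratio = 1.0 + percent_error / 100.0
    deviation_hz = reference_hz * (ratio - 1.0)
    deviation_cents = 1200.0 * math.log2(ratio)   # 1200 cents per octave
    return deviation_hz, deviation_cents

hz, cents = pitch_deviation(440.0, 0.3)
print(f"{hz:.2f} Hz, {cents:.2f} cents")   # prints "1.32 Hz, 5.19 cents"
```

A 0.3% error at A4 is thus a little over one cycle per second, or roughly one twentieth of a semitone.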
Timbre refers to the tonal quality of a musical instrument, the combination of overtones that gives each instrument its unique sound. The "nasal" quality of a voice and the "scratchy" sound of a violin are both references to timbre. Fourier analysis is one technique to quantify timbre by measuring the energy in the component frequencies of a sound source. The analysis, however, requires numerous computations and is time-consuming. Nickerson et al. use a bank of 19 filters to determine the spectral content of the deaf student's voice. An analog electronic implementation of such a filter bank would require many parts that occupy circuit board space, which is undesirable in a portable unit, and would have an impact on manufacturing time and cost. A digital implementation would require signal processing capabilities with associated speed requirements and cost. Both approaches produce an abundance of data that must be further processed in order to be interpreted. A preferred analysis technique would require few components, have a low cost, and produce results that are easy to interpret.
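As an illustration of the Fourier approach mentioned above, the following sketch correlates a sampled tone against each harmonic of a known fundamental and reports the share of energy at each one. It assumes the fundamental is already known and that the analysis window holds a whole number of cycles; the function name `harmonic_energies` and the test signal are hypothetical and are not taken from any of the cited systems.

```python
import cmath
import math

def harmonic_energies(samples, sample_rate, fundamental_hz, num_harmonics):
    """Return the normalized energy at each of the first `num_harmonics`
    harmonics, using one single-frequency DFT correlation per harmonic."""
    n = len(samples)
    energies = []
    for k in range(1, num_harmonics + 1):
        freq = k * fundamental_hz
        acc = 0j
        for i, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * freq * i / sample_rate)
        energies.append(abs(acc / n) ** 2)
    total = sum(energies)
    return [e / total for e in energies] if total > 0 else energies

# Test tone: 200 Hz fundamental plus a second harmonic at half the amplitude,
# sampled at 8000 Hz for exactly 10 cycles of the fundamental.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / rate)
        + 0.5 * math.sin(2 * math.pi * 400 * i / rate)
        for i in range(400)]
profile = harmonic_energies(tone, rate, 200.0, 3)
print([round(p, 3) for p in profile])
```

Even this reduced form performs one complex multiply-accumulate per sample per harmonic, which gives a sense of the computational load the text refers to.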
It is helpful for a student of music to see several notes in advance in order to plan the playing technique necessary to shape musical phrases. None of the music systems mentioned displays tones in advance of their being heard.
Of the numerous musical instruments a student might want to learn, singing is often the most psychologically difficult for those adults who were told as children that they could not sing. These adults are often reluctant to attempt singing in front of others for fear of judgement. Singing is a skill, like reading, that needs to be developed by instruction and practice. Individual instruction is often necessary, for each student's errors and progress are unique. Typically, vocal instruction requires finding a music teacher, arranging a visitation schedule, paying for the classes, and maintaining regular attendance. These factors can discourage potential music students from pursuing instruction. An ideal music instructor would be available anytime and anywhere, have infinite patience, be consistently accurate and non-judgmental, be shareable among several people at no additional cost, provide instruction on any of a thousand popular songs, show exactly where a student's errors are, and comply with the interests and pace of each individual student.
It can be seen, therefore, that a need exists for a music training apparatus that can provide a student with an accurate temporal visual record of aural parameters of their musical performance and of a musical reference.