Musical instruments are manipulated by a musician-user to articulate and produce notes of discrete tones having constant pitch. For example, a guitar has strings and a finger board with frets that divide the guitar strings into discrete lengths. By pressing a string against a fret with a finger, a musician can select equally spaced pitches. Each musical note need not be articulated separately. Some notes may be played merely by changing the pitch of a previously articulated note, and pitch may frequently be altered to glide continuously between notes.
To more fully appreciate the present invention, it is helpful to define several terms commonly used by professional musicians. “Pitch” is the subjective sensation produced by a periodic vibration having constant frequency, e.g., what may be produced by a picked (or plucked) guitar string that is under tension. The sensation of pitch is logarithmically related to the frequency of vibration. In music, discrete pitches are referred to as “notes”.
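The logarithmic relation between pitch and frequency may be illustrated with a short sketch. It assumes twelve-tone equal temperament, which the text above does not specify, and the function names are hypothetical.

```python
import math


def interval_in_semitones(f_low_hz: float, f_high_hz: float) -> float:
    """Signed interval between two frequencies, in semitones.

    Assumes twelve-tone equal temperament (12 semitones per octave);
    the tuning system is an assumption of this sketch.
    """
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def transpose(freq_hz: float, semitones: float) -> float:
    """Frequency reached by moving `semitones` away from `freq_hz`."""
    return freq_hz * 2.0 ** (semitones / 12.0)
```

Under this assumption, each doubling of frequency is heard as the same interval (an octave), so equal steps in pitch correspond to equal frequency ratios rather than equal frequency differences.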
The “tone” or “timbre” of a musical note includes several components, and gives the note a distinctive character. The same note played on an organ will have a different timbre than if played on a piano. Indeed, the same note played on two pianos can exhibit different nuances of timbre, and will sound different. The timbre of an instrument playing a note is closely related to the shape of the periodic wave creating the note, which is referred to as waveshape. As such, one aspect of timbre is the manner in which the note waveshape changes with time. Timbre is often analyzed in terms of the instantaneous Fourier series components of a tone. Ordinarily these components change over time, corresponding to the change in waveshape. There are also non-harmonic components of timbre including, for example, breath noise admixed with the simple, round tone of a flute.
A listener can distinguish different instrumental timbres, such as a note from an oboe and a note from a violin. A critical aspect enabling the differentiation is the complex change that occurs within the first few milliseconds of the onset of the note (known as the “attack” time). This aspect is related to how notes are produced (or “articulated”) on different instruments. For example, a violinist articulates notes by bowing, to produce a scraping sound and a sense of the note growing from nothing, whereas a saxophonist tongues notes.
Tone synthesizers use control envelopes to simulate the interactions of pitch, volume and timbre. Control envelopes may be characterized by four associated parameters, namely attack, decay, sustain and release.
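A control envelope of this kind may be sketched as a piecewise function of time. The linear segment shapes, the default values, and the function name below are assumptions chosen for illustration; actual synthesizers commonly use other curve shapes.

```python
def adsr_level(t, note_off_time, attack=0.01, decay=0.1, sustain=0.7, release=0.2):
    """Envelope level in [0, 1] at time t (seconds after note-on).

    attack, decay and release are durations in seconds; sustain is a level.
    Linear segments are an assumption of this sketch.
    """
    if t < 0:
        return 0.0

    def held_level(tt):
        # Level while the key is still held down.
        if tt < attack:
            return tt / attack                      # rise from 0 to 1
        if tt < attack + decay:
            return 1.0 - (1.0 - sustain) * (tt - attack) / decay
        return sustain                              # hold until note-off

    if t < note_off_time:
        return held_level(t)

    # Release: fall linearly to zero from whatever level was reached.
    start = held_level(note_off_time)
    elapsed = t - note_off_time
    if elapsed >= release:
        return 0.0
    return start * (1.0 - elapsed / release)
```

For example, `adsr_level(0.5, 1.0)` falls in the sustain segment and returns the sustain level, while any time later than `note_off_time + release` returns zero.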
The “distance” in pitch between two notes is termed the musical “interval”. In the familiar “do-re-mi” scale, each syllable represents an interval or distance from the first note of the scale. The creation of many musical effects involves altering the pitch of a previously articulated note, and may require production in a manner different than the articulation of musical notes. Musical effects can include slurs, hammer-ons, pull-offs, blues inflections, glissandos, portamento, and vibrato. Such gestures are usually performed to move from one note of a scale to an adjacent note, or in between, the movement being referred to as the distance of a gesture.
“Blues inflection” is the musical term for a voice-like pitch bend gesture, and usually includes “blue note” pitches that are located between pitches found in standard musical scales. Most popular American music incorporates blues-type note inflections in some way, and thus a mechanism for bending notes is necessary for an electronic synthesizer used to play such music.
Professional grade musical synthesizers routinely provide some sort of pitch bend mechanism for performing continuous pitch changes, such as over a major second interval as might be produced by a guitar player by bending strings. The most common type of pitch altering device is the pitch wheel. The pitch wheel may be a user-operated biased rotary wheel with a center detent, or a rotary wheel with a dead zone in the middle. Other control mechanisms for altering pitch include spring-biased levers and joysticks. Unfortunately, such mechanisms do not generally produce lifelike pitch inflections.
On a guitar, blues inflection is performed by deflecting a guitar string sideways (laterally) along the fret after the string has been picked (i.e., caused to vibrate) by the musician. This lateral movement increases the tension on the vibrating string, and hence increases the pitch. Similarly, a skilled saxophonist may bend pitch using air pressure or lip pressure on the playing reed.
A “slide” is performed on a guitar by sliding the finger up or down the finger-board (longitudinally), after an initial note is picked. A similar effect, referred to as “glissando”, may be produced on a piano by dragging a thumb back and forth across the keys while pressing down, to cause each key to sound quickly in succession. The result is a series of additional discrete sound pitches. On a harp, such a glissando may be performed by dragging the hand across the strings, so that each sounds in succession. A similar gesture on a guitar is called a “strum”, one difference being that the notes which sound on a guitar are each spaced by several semitones, making a chord. On a harp, chords may be produced by using pedals to retune the strings.
When the hand is then dragged across the strings as when performing a glissando, the result is called an “arpeggio”. Arpeggios usually traverse a wide range of dozens of notes, while a guitar strum is always six notes or fewer. A guitar chord may also be “plucked” as may a chord on a harp, while a chord on a piano is “struck”. Arpeggios may also be played on a piano by striking the notes of a chord in succession while alternately displacing each hand to cover a wide range of notes. However, this produces a significantly different effect than a harp arpeggio because it requires discrete motions of the fingers used to strike each note.
Some musical notes may be played by slurring a previously articulated note. “Slurring” means a musician does not produce every note anew but may instead continue from one note to the next without re-attacking, thus changing only the pitch. For example, on a saxophone, notes are slurred by opening or closing additional valves on the body of the instrument, while continuously blowing. This changes the pitch and creates a different note, but without a tongued articulation. On a violin, slurring involves placing fingers in front of or behind other fingers on the fingerboard while continuing to bow, to shorten or lengthen the effective vibrating length of the string. This changes the pitch and creates a different note, but without a bowed articulation. Guitarists slur notes much as violinists do, but the gesture of placing a first finger on a fret in front of a second finger is called a “hammer-on”. The gesture of placing a second finger behind the first and then releasing the first is called a “pull-off”. Playing whole phrases by articulating only the first note and slurring the remaining notes is called “legato”.
In real life, gestures may be combined, either sequentially or simultaneously. Thus a skilled guitarist can perform a hammer-on followed by a glissando, or simultaneously achieve blues inflection and vibrato. A number of hammer-ons in quick succession produces a “trill”. A series of connected gestures traversing a large interval in a single direction is called a “run”. A combination of gestures involving one or more changes of direction may be referred to as a “lick” or a “riff”. These gestures are so named because each is perceivable as an individual musical event. Their separate elements become fused because there is an overarching time shape to them. This time shape or trajectory may result from a continuous change in the times between expected onset of the component gestures.
In addition, guitarists and other musicians may perform gestures within gestures. That is, they may alter the apparent course of a gesture as it is performed by making slight variations. Such subtleties may distinguish one player's style from another, or even a great performance from a poor one.
Electronic musical instruments such as synthesizers are known in the art, and often include various sound generating routines that are executed by a digital signal processor, or the like. Such synthesizers are realized using digital oscillators to produce tones using cyclic time vectors and wavetables. Typically a mechanism is provided for selecting discrete pitches, such as a piano-like keyboard, and at least one mechanism for altering pitch continuously. Historically, such pitch altering or bending mechanisms have been primarily mechanical in nature, and in addition to pitch wheels and joysticks mentioned above, included levers, ribbons, and various modifications to piano keyboards. But such prior art pitch altering mechanisms permit only a single type of gesture, and the sounds they produce do not include very expressive gestures. In general, they do not permit combining gestures, or alteration of the gesture as it is performed.
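The use of a cyclic time vector (phase accumulator) to read a wavetable may be sketched as follows. The table size, the truncating (non-interpolating) lookup, and all names are assumptions of this illustration and are not drawn from any particular synthesizer.

```python
import math


def make_sine_table(size=1024):
    """One cycle of a sine wave stored as a list of samples."""
    return [math.sin(2.0 * math.pi * i / size) for i in range(size)]


def render(table, freq_hz, sample_rate, num_samples):
    """Read `table` cyclically at a rate that yields `freq_hz`.

    The phase advances by a fixed increment per output sample and wraps
    at the table length; lookup truncates the phase to an integer index
    (no interpolation, an assumption made for brevity).
    """
    phase = 0.0
    increment = freq_hz * len(table) / sample_rate
    out = []
    for _ in range(num_samples):
        out.append(table[int(phase)])
        phase = (phase + increment) % len(table)
    return out
```

Changing `increment` from one sample to the next is one way a continuous pitch alteration could be applied to such an oscillator.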
In addition, no means of strumming chords is provided. A strum may be simulated on a piano-like keyboard by staggering the notes of a chord as they are struck. However, this does not produce a realistic sounding strum because it requires coordinating several discrete gestures by the player's fingers. A guitar strum or harp glissando is executed by a single motion of the arm.
Gesture mapping gloves and touch responsive membranes have been used to try to provide more flexible pitch alteration. But even these more complex mechanisms produce sounds that are less expressive than desired because acoustic instruments (and the human voice) require some exertion to produce musically useful results. It is also known to provide exertion-requiring mechanisms using force-feedback based on linear motors or high-tension springs. But such mechanisms are expensive, bulky, and difficult to use, and require special circuitry to interface with electronic sound-production systems. They also suffer from the same limitations as standard mechanical operators.
At best, the prior art has attempted to bend pitch in a single direction. Thus with respect to producing the sound of a plucked guitar string, one can attempt to emulate a guitarist's lateral deflection (blues inflection) of a plucked string, or perhaps a longitudinal deflection (glissando), but not both lateral and longitudinal gestures. Interestingly, the prior art sometimes suggests that pitch is a non-critical parameter with which to impart expression in one direction. By implication, the prior art would regard as futile attempts to bend pitch in more than one direction.
It is known in the prior art to assist the simulation of gestures using electronics. Thus, some synthesizers create slurs using a so-called “mono mode” operating state in which only one note may be played at a time. If a second note is played before releasing the previous note, the second note continues the first note with only a change in pitch. Further, the envelopes used to create the first note continue rather than beginning anew from the attack. If a note is played after all previous notes have been released, the new note is re-attacked. Unfortunately, mono mode does not create effective slurs because on real musical instruments pitch changes that create slurs do not occur suddenly, but have a characteristic pitch change curve. In U.S. Pat. No. 5,216,189, to Kato (1993), this problem was somewhat addressed using preset curves to create the slur, where the above-described fingering scheme was retained.
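The mono mode fingering scheme described above may be sketched as a small state machine. The class and method names are hypothetical, and the behavior on key release (falling back to the most recently pressed key that is still held) is an assumption not stated in the text.

```python
class MonoVoice:
    """Single-voice note handler: overlapping notes slur, others re-attack."""

    def __init__(self):
        self.held = []      # keys currently down, oldest first
        self.pitch = None   # key number currently sounding, or None
        self.attacks = 0    # number of times the envelopes were restarted

    def note_on(self, key):
        overlapping = bool(self.held)   # is a previous key still down?
        self.held.append(key)
        self.pitch = key                # pitch jumps immediately (no curve)
        if overlapping:
            return "slur"               # envelopes continue, pitch changes
        self.attacks += 1
        return "attack"                 # envelopes restart from the attack

    def note_off(self, key):
        if key in self.held:
            self.held.remove(key)
        # Assumption: fall back to the most recent key still held.
        self.pitch = self.held[-1] if self.held else None
```

The immediate assignment to `pitch` in `note_on` corresponds to the sudden pitch change that the passage above identifies as unrealistic.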
“Portamento” or “pitch glide” is another gesture, in which a continuous gliding movement from one musical tone to another is produced without rearticulation. For example, a trombonist moves the trombone slide in and out while continuing to blow; a violinist slides one finger up or down the fingerboard while continuing to bow. (Technically, the gesture that creates portamento on a violin creates a glissando on a guitar because a guitar has frets that cause discrete pitches to be produced as each fret is crossed, whereas pitch varies continuously for a violin.)
Many attempts have been made in the prior art to create portamento on electronic sound synthesizers, typically using a pre-programmed function. In some synthesizers, portamento effect circuitry is activated to automatically cause each note to slide to the next note over a period of time. In some implementations, so-called fingered portamento is created by playing the second note before releasing the first, analogously to slurring. Functions used to create such synthesized portamento or pitch glide typically are characterized by an exponential curve. While such curves produce a recognizable effect peculiar to synthesizers, they do not realistically duplicate natural sounding portamentos performed on actual musical instruments.
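A pre-programmed exponential glide of the kind described above may be sketched as a one-pole approach toward the target pitch. The time constant, the update interval, and the function name are assumptions chosen for illustration.

```python
import math


def exponential_glide(start, target, time_constant, dt, steps):
    """Pitch values (e.g. in semitones) gliding from start toward target.

    Each step closes a fixed fraction of the remaining distance, which
    yields an exponential curve. time_constant and dt are in seconds.
    """
    coeff = math.exp(-dt / time_constant)
    pitch = start
    curve = []
    for _ in range(steps):
        pitch = target + (pitch - target) * coeff
        curve.append(pitch)
    return curve
```

Because the curve depends only on the two endpoint pitches and a fixed time constant, every glide of a given interval has the same shape, which is consistent with the recognizable but synthetic effect noted above.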
One device called the Oberheim Strummer was designed to produce realistic guitar strums for use with electronic instruments. This device had prerecorded strum templates activated by pressing a button. While such an implementation produces realistic single strums, they always sound the same. The speed of the strum cannot be altered, nor can it be varied as it is performed.
Another device known in the art is referred to as an arpeggiator. This works, as the name suggests, by activating the notes of a chord held down on a piano-like keyboard rapidly in succession. The notes are evenly spaced and therefore do not have the overarching time characteristic that gives real arpeggios their lifelike perceptual quality. The result is an automated computer music effect.
Some synthesizers produce portamento by actually performing the gesture using a ribbon controller comprising a position-sensing membrane whose length may be a few inches to three feet or so. While such configurations can produce relatively realistic portamentos, the portamento is of a single characteristic type. U.S. Pat. No. 5,241,126 to Usa et al. (1993) discloses the use of transfer functions transitioning from ribbon position to pitch, to try to produce a higher quality portamento. Often the transfer function has a stair-like step characteristic, that upon performance seeks to produce a glissando-like gesture.
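A transfer function from ribbon position to pitch, in both a continuous and a stair-like form, may be sketched as follows. The note range, the normalized position scale, and the function name are assumptions of this illustration and do not reproduce the functions of the cited patent.

```python
import math


def ribbon_to_pitch(position, low_note=48, high_note=72, stepped=False):
    """Map a normalized ribbon position in [0, 1] to a pitch in semitones.

    With stepped=False the mapping is linear (portamento-like); with
    stepped=True the result is quantized down to whole semitones, giving
    a stair-like, glissando-like characteristic.
    """
    position = max(0.0, min(1.0, position))   # clamp out-of-range input
    pitch = low_note + position * (high_note - low_note)
    return float(math.floor(pitch)) if stepped else pitch
```

In either form the curve is fixed in advance, so the shape of the resulting gesture does not change from one performance to the next.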
“Vibrato” is the undulating variation that creates tension when sustaining a single tone for a period of time, and may be produced by most bowed-string and woodwind classical instruments. On prior art synthesizers, vibrato may be implemented using a cyclic low frequency waveform that automatically varies (or frequency modulates) the pitch of a note. The amount of frequency modulation produced may be controlled with a continuous modulation wheel, and may be developed gradually using a control envelope or some other mechanism, such as a pedal or a pressure pad. Unfortunately the vibrato effect produced inevitably sounds automatic, and not lifelike.
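Vibrato produced by a cyclic low frequency waveform may be sketched as a sinusoid added to the note pitch, with a linear ramp standing in for the gradual development of depth. The rate, depth, ramp duration, and function name are illustrative assumptions.

```python
import math


def vibrato_pitch(base_note, t, rate_hz=5.5, depth_semitones=0.3, onset_s=0.5):
    """Pitch in semitones at time t (seconds after note-on, t >= 0).

    A sine-wave low frequency oscillator modulates the pitch; its depth
    ramps linearly from zero to full over onset_s seconds.
    """
    if onset_s > 0:
        ramp = max(0.0, min(1.0, t / onset_s))
    else:
        ramp = 1.0
    lfo = math.sin(2.0 * math.pi * rate_hz * t)
    return base_note + ramp * depth_semitones * lfo
```

The oscillator runs at a fixed rate regardless of the musical context, which is one reason such vibrato may sound automatic.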
Some prior art synthesizers include tables of pre-drawn transfer function curves, but do not provide a means of altering the pre-drawn curves during performance. For instance, U.S. Pat. No. 5,241,126 to Usa et al. (1993), referred to above, seeks to bridge the gap between mechanical operators and electronic tone generation with a mechanism that uses a continuous operator along which gestures can be mapped. Unfortunately, Usa's disclosed mechanical operator is sophisticated mechanically and thus expensive to manufacture. Significantly, Usa fails to adequately specify means of simulating a variety of realistic gesture performance techniques. The few gesture maps specified are simplistic and representative of easily modeled gestures, e.g. using step functions to perform scales, arpeggios, or glissandos. Unfortunately, the output from Usa's device is not MIDI-compatible, and cannot be exported for use by other MIDI-compatible systems.
MIDI refers to a standard data communications protocol for electronic music equipment. In the MIDI specification, bytes of data are coded to be recognized as representing musical functions such as note on and note off of specific keys on a piano keyboard. Certain bytes are coded and specified to represent ranges of digital numbers for use modifying continuously variable parameters of musical tones. Ranges of numbers are specified for pitch bend and overall volume. All MIDI synthesizers recognize these numbers, and most recognize additional numbers specified for general use, that may be used to modify synthesizer functions controlling various aspects of timbre.
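The byte coding of two such messages may be sketched as follows. The status bytes (0x90 for note on, 0xE0 for pitch bend) and the 14-bit pitch bend range centered at 8192 follow the MIDI 1.0 specification; the function names and the normalized bend input are assumptions of this illustration.

```python
def note_on_message(channel, key, velocity):
    """Three-byte MIDI note-on message for channel 0-15."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    return bytes([0x90 | channel, key & 0x7F, velocity & 0x7F])


def pitch_bend_message(channel, bend):
    """Three-byte MIDI pitch bend message.

    bend is a normalized amount in [-1.0, 1.0] (an assumption of this
    sketch); it is mapped to the 14-bit range 0..16383 with 8192 as the
    center, then split into two 7-bit data bytes, least significant first.
    """
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    bend = max(-1.0, min(1.0, bend))
    span = 8191 if bend >= 0 else 8192
    value = 8192 + int(round(bend * span))
    lsb = value & 0x7F
    msb = (value >> 7) & 0x7F
    return bytes([0xE0 | channel, lsb, msb])
```

The musical interval represented by a full-scale bend is set in the receiving synthesizer, so the same message may produce different pitch changes on different instruments.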
Having discussed the nature of various musical effects that are desired from a musical synthesizer, it is now useful to examine how such synthesizers are configured. FIG. 1 is a generic block diagram of a prior art musical synthesizer 10, and is similar to the commercially available Ensoniq model SD-1. Synthesizer 10 includes a discrete pitch generator 20 that outputs chosen discrete pitches in response to the output from a user input signal, typically provided by a piano keyboard 30. Each discrete pitch is presented by generator 20 as a digital number.
Synthesizer 10 further includes an interface module 40 that contains menu-selectable programming parameters, and is responsive to user-input values from controls 50, 50′ (respectively “OP1” and “OP2”), which may include wheels, foot pedals, a pressure pad, or the like. It is a function of module 40 to modify tone synthesis parameters for use by tone synthesizer 60. At best, however, what results is an unrealistic pitch bend sound that is quite unlike the pitch inflection actually produced by a musician using a traditional musical instrument.
Module 40 further includes a scaler 70 whose output is a scaled digital number proportional to the output from the first operator control 50. This user-determined digital number is then input to a look-up table 80 that typically performs a waveshaping and amplitude normalizing function. The result from look-up table 80 is then combined by an adder 90 with a digital number representing the output from the second user control 50′.
Adder 90 outputs a digital number representing the combined effects of operator controls 50 and 50′. The partial sum output by adder 90 is then combined by a second adder 100 with the digital number representing the output of the discrete pitch generator 20. It will be appreciated that adder 100 may represent any means of combining discrete pitch data from discrete pitch generator 20, and continuous pitch data from adder 90, including merging of MIDI data within tone generator 60. The grand output provided by adder 100 is a digital number or numbers that can control the output of tone generator 60. As shown in FIG. 1, synthesizer system 10 may instead provide the output from the first adder 90 as an input to the tone generator 60, without digital summation using adder 100. In either embodiment, the output from tone synthesizer 60 is input to an audio reproduction system, e.g., loudspeaker 110.
Unfortunately the musical effects provided by prior art synthesizer 10 are not especially realistic. At best there is a crude attempt to bend pitch. For example the scaler 70 cannot effectively control musical intervals because the parameters specified are not scale-tone divisions. Further, synthesizer 10 cannot provide effectively controlled musical intervals because scaler 70 precedes (rather than follows) look-up table 80. At best, scaler 70 can affect, but in an unpredictable fashion, the extent of waveshaping by the lookup table 80. Further, there is no means provided of modifying gesture trajectory in real time.
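The signal path of FIG. 1, as described above, may be sketched as a single function. The numeric ranges, the identity placeholder table, the semitone units, and all names are assumptions of this illustration; the description gives only the order of the blocks.

```python
def prior_art_pitch(discrete_pitch, op1, op2, scale=1.0, table=None):
    """Combine keyboard pitch with two operator controls, FIG. 1 style.

    op1 and op2 are normalized control values in [0, 1] (assumed range).
    The scaler acts BEFORE the look-up table, as in the prior art layout.
    """
    if table is None:
        # Placeholder waveshaping table: a straight line from 0 to 1.
        table = [i / 127.0 for i in range(128)]

    scaled = max(0.0, min(1.0, op1 * scale))       # scaler 70
    index = int(round(scaled * (len(table) - 1)))
    shaped = table[index]                          # look-up table 80
    continuous = shaped + op2                      # adder 90
    return discrete_pitch + continuous             # adder 100
```

Because `scale` changes how far into the table the first control can reach, rather than the size of the table output, a reduced `scale` with a curved table alters the shape of the bend as well as its extent. This may help illustrate why placing the scaler ahead of the table does not give direct control over the musical interval.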
There is a need for a gesture synthesizer permitting realistic simulations of various musical gestures in any sequence for an electronic musical instrument, including digital synthesizers. Preferably such a gesture synthesizer should realistically alter pitch, tone volume, and tone timbre. Further, such a gesture synthesizer should be modular in use and in fabrication, and should be implemented using conventional components and techniques. Finally, pitch bend and other gestures provided by such a gesture synthesizer should be MIDI-compatible and MIDI-exportable.
The present invention discloses such a gesture synthesizer.
The present invention provides a musical synthesizer with a gesture synthesizer that modifies musical gestures. Applicant has recognized that conventional tone synthesizers reproduce the sensation of tone created by an acoustic instrument by modeling human aural perception processes, and representing these processes parametrically. In contrast to the prior art, however, the present invention recognizes that a missing musical parameter or parameters may be defined, representing a perceptual quality or qualities, related to the trajectory of a gesture and to the continuous variation of its trajectory. Gestures are controlled using feedback loops between the eyes and ears of a performer and their muscles. Additionally, receptors in muscles transmit information about position, speed, and acceleration of gestures back to the higher brain centers involved in muscle activation and control. Such processes allow performers to formulate and continuously modify gesture trajectories in very precise ways.
Expressive real time performance involves microstructural gesture variations. Musical events occur in time sequentially as rhythm and meter, and simultaneously as chords and harmonies. As music is traditionally notated, these aspects of time are represented graphically in two corresponding dimensions. Sequential events are represented horizontally, while simultaneous events are represented vertically. In traditional music notation, emotional expression is suggested with conventional terms and graphic symbols, but is for the most part left to the performer's interpretation. Such interpretation may be said to represent a third aspect of time, involving perceived possible outcomes of the various gesture trajectories used to perform the music. That is, each gesture follows a more or less predictable course, as does, for example, a fly ball. However, gestures are not limited to a single path, but are typically modified continuously. This creates a plane of possible future trajectories, subject to the physical limitations of the instrument and the musician's muscles. To create interest in the music, the musician manipulates these possibilities. The listener's expectations are continuously resolved, modified, or denied, conveying various sensibilities, such as repose or surprise.
A musician using an actual musical instrument mentally conceives a gesture that may be said to have a “virtual trajectory”. Virtual trajectory includes a preplanned distance, time, and curvature, or acceleration-deceleration characteristic. It also includes time dependent variables that affect the final position of a projected motion. Virtual trajectory may be overcome by external forces, so the final outcome may be different.
To thus effect the performance of gestures, a musician flexes his or her muscles in a continuously controlled manner. Consequently, applicant has discovered that more realistic sound may be produced by providing the synthesizer with a perceptual model of muscles and their activation. Such models allow for continuous modification within a parametrically defined space. They may include static elements of muscles and their loads, such as friction and elasticity, as well as time dependent elements such as viscosity and inertia. Also included may be a controlling element as well as an activating one, representative of the roles of activation and control played by muscle pairs, and an additional direction dependent element likewise representative of the bifurcation of muscle pairs.
For instance, analogous to the manner in which a guitarist exerts muscular force against a tensioned string load, the gesture synthesizer models human muscle movements including visco-elastic properties of muscle pairs and the elasticity of a simulated load. A keyboard input device selects discrete pitches in conventional fashion and may be operated with one hand, while the additional user control operator devices permit user-activation and control of desired gestures with the other hand. The controllers or operators activate simulated gestures used to modify pitch and other aspects of musical tones.
In a first embodiment, the gesture synthesizer provides a dynamic model of human muscles based upon Hill's velocity-force equation to describe the non-linear properties of muscle motion when interacting with loads. A second embodiment models the cyclic oscillation produced by opposing effects of two force sources representing the cyclic oppositional action of muscle systems. A third embodiment provides the gesture synthesizer with an emulation of the response of muscles to internal electrical impulses. A fourth embodiment provides a mechanism representing and altering the virtual trajectory of gestures that appear to originate with the higher brain functions governing muscle control. A fifth embodiment provides a mechanism modelling visco-elastic properties of muscle pairs as well as frictional, inertial and elastic properties of simulated loads.
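For orientation, the commonly cited textbook form of Hill's velocity-force relation is (F + a)(v + b) = (F0 + a)b, where F is the load force, v the shortening velocity, F0 the maximum isometric force, and a and b are constants. The sketch below solves that form for velocity. The constant values and the function name are assumptions, and nothing here is asserted to be the form or parameterization used by the first embodiment.

```python
def hill_velocity(force, f_max=1.0, a=0.25, b=0.25):
    """Shortening velocity for a given load, from Hill's hyperbolic relation.

    Solves (force + a) * (v + b) == (f_max + a) * b for v, which gives
    v = b * (f_max - force) / (force + a).
    f_max, a and b are illustrative values, not taken from the source.
    """
    if not 0.0 <= force <= f_max:
        raise ValueError("force must lie between 0 and f_max")
    return b * (f_max - force) / (force + a)
```

The relation is non-linear: velocity is greatest with no load, falls steeply as load is first applied, and reaches zero when the load equals the maximum force.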
A gesture synthesizer according to the present invention permits performance of several gestures simultaneously or sequentially in any order, and permits at least one gesture simulation parameter to be modified in real time during performance. Preferably a parametric menu-driven interface permits user-specification of gesture types by selection of synthesis functions and non-real time modification of programming arguments.
The gesture synthesizer according to the present invention preferably is implemented with a modular architecture in which data from user controllers or independent time operators (e.g., clock sources) is specified as control data or modulation data. Control data is processed by a series of gesture synthesis modules, whereas modulation data from user modulation operators may interact with and modify control data in real-time. Data from one gesture synthesis module may be used as modulation data, and as such may interact with and modify control data in another module. In another embodiment, a generalized architecture may be implemented incorporating the functions of several modules.
User controller output values provided to the gesture synthesizer may be delayed, scaled, subjected to hysteresis, subjected to salience (e.g. curvature) modulation, used to generate and modify data from a clock, subjected to filtering, or any combination thereof, before being input to variable shape look-up tables. The look-up table outputs may themselves be subjected to filtering, used to generate and modify clock data, or subjected to salience modulation, hysteresis, scaling, delay, or any combination thereof before being combined in an output adder. The output from another user controller can vary the first and/or second scaling, delay, modulation, clock data generation, filtering or shaping in real-time to vary the effect of the gesture. A synthesizer according to the present invention provides controller values suitable to activate notes, or modify tone pitch, volume, timbre, or any combination thereof, and that are MIDI-compatible and MIDI-exportable.
Data generated by such a gesture synthesizer may also be used to control and vary parameters of a digitally represented physical instrument model. Such physical modeling parameters will naturally also control the tone pitch, volume, and timbre of a resulting musical note, since these are perceptual attributes independent of the method used to generate a tone. However, the parameters of a physical model may not be specified to control these attributes in the same way as they are traditionally represented in an electronic music synthesizer.
In addition to natural sounding strums, data generated by such a gesture synthesizer may be used to control the tempo of an arpeggiator or note sequencer, thus imparting an overarching time modification that has a natural sounding effect, like rubato. Such modifications may be performed in real time to fit the musical context. Likewise overall volume effects may be imparted to sequences and arpeggios by performing them with the gesture synthesizer.
Combinations of gestures may also be performed. For example, a violin bow is used to excite the string into vibration and also to control its volume and timbre. Similarly, the gesture synthesizer may be used to both trigger a note and vary its volume and/or timbre, after it is first selected using a keyboard input device.
Other features and advantages of the invention will appear from the following description in which the preferred embodiments have been set forth in detail, in conjunction with the accompanying drawings.