The graphical user interface (“GUI”) is a computer interface that uses graphic icons and controls, in addition to text, to provide interaction between a user and a computer. A user of the computer utilizes a keyboard or a pointing device, e.g., a mouse, to manipulate the icons and controls. Through the GUI, the user interacts with the hardware and software of the computer system to cause the computer to perform actions, e.g., to create, manipulate, or modify various signals. With the increasing use of multimedia as part of the GUI, sound, voice, motion video, and virtual reality interfaces have become a part of the GUI for many applications. One such multimedia activity relates to audio signals. The audio signals may be produced and modified as desired to create audio performances, soundtracks, special effects, and the like. For example, GarageBand (Trademark), produced by Apple Inc., uses sampled real musical instruments and synthesized instruments to create or modify a piece of music.
The audio signals, or sound, may be in a digital or an analog data format. The analog data format is normally electrical, wherein a voltage level represents the air pressure waveform of the sound. A digital data format expresses the air pressure waveform as a sequence of symbols, usually binary numbers. The audio signals presented in analog or in digital format may be processed for various purposes, for example, to correct the timing of the audio signals. Present methods to correct timing, however, require knowledge of the exact original location in time of the audio signal, meaning that the present methods operate on discrete audio events whose original position in time is already defined.
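The digital data format described above can be illustrated with a minimal sketch. It samples a sine wave, standing in for the air pressure waveform, and rounds each sample to a signed 8-bit integer. The function name `sample_and_quantize`, the 8000 Hz sample rate, the 1000 Hz tone, and the 8-bit depth are assumptions chosen for illustration only; they are not taken from any particular product or format.

```python
import math

def sample_and_quantize(freq_hz, n_samples, sample_rate_hz=8000, bits=8):
    """Sample a sine 'pressure waveform' and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1  # largest positive code, 127 for 8 bits
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                           # time of sample n, in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # idealized analog value in [-1, 1]
        samples.append(round(amplitude * max_level))     # nearest integer code
    return samples

# One full period of a 1000 Hz tone at 8000 samples per second (hypothetical values).
samples = sample_and_quantize(freq_hz=1000, n_samples=8)

# Each integer can be written as a binary symbol (two's complement, 8 bits).
binary_words = [format(s & 0xFF, "08b") for s in samples]
```

The resulting list of integers is the "sequence of symbols" form of the waveform; the analog form would instead be the continuous voltage that the sine function models here.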
In current graphical user interfaces, to correct the timing of a discrete audio event, a user manually moves the discrete audio event from its original time to a designated time on a grid, as is done in the Musical Instrument Digital Interface (“MIDI”) protocol. FIG. 1 illustrates a typical prior art method of aligning a discrete audio event to a designated time on a grid. As shown in FIG. 1, the audio signal produced by a user is graphically represented on a display 100 as a sequence of discrete peaks (“audio events”) 104, 105, and 106. The audio signal is not stored in a notational format; rather, it is stored as a waveform. The discrete audio events produced by the user may also be graphically represented on a screen as notes on a staff. As shown in FIG. 1, in the original audio signal the event 104 and the event 106 are aligned to respective designated times 101 and 108 on the grid 102, whereas the event 105 is shifted from the designated time 107. The event 105 on an audio recording may represent a musician playing a note too soon, and it may be desired to correct this when playing back the recording. The user has to manually align each of the shifted events in the original audio signal to the respective designated time on the grid 102. First, the user has to compare the position of the discrete audio event 105 relative to the grid 102. Next, the user needs to visually determine that the audio event 105 is shifted relative to the designated time 107 on the grid 102. Finally, the user needs to select each of the shifted audio events, for example, by a click of a mouse, and then align each of the shifted audio events to the respective designated time by, for example, dragging the event 105 with a cursor 104 to align it with the designated time 107, as shown in FIG. 1.
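The manual steps described above (compare each event to the grid, determine which events are shifted, move each shifted event to its designated time) can be sketched in a few lines. This is an illustrative sketch only, not an implementation of any MIDI interface: the names `nearest_grid_time` and `align_events`, the millisecond units, and the 500 ms grid spacing are assumptions. It also presupposes exactly what the prior art requires, namely that each event's original time is already known.

```python
def nearest_grid_time(event_ms, grid_spacing_ms):
    """Return the grid time (a multiple of grid_spacing_ms) closest to event_ms."""
    # Integer arithmetic; an event exactly halfway between two grid lines rounds up.
    return (event_ms + grid_spacing_ms // 2) // grid_spacing_ms * grid_spacing_ms

def align_events(event_times_ms, grid_spacing_ms):
    """Return (original, aligned, shift) per event; shift is 0 if already on the grid."""
    result = []
    for original in event_times_ms:
        aligned = nearest_grid_time(original, grid_spacing_ms)
        result.append((original, aligned, aligned - original))
    return result

# Hypothetical onsets in milliseconds: the second event is played 40 ms too early,
# in the manner of event 105 relative to designated time 107.
events_ms = [0, 460, 1000]
alignment = align_events(events_ms, grid_spacing_ms=500)

# Only events with a nonzero shift would need to be dragged by the user.
shifted = [row for row in alignment if row[2] != 0]
```

Here the first and third events already sit on grid lines, so only the second event appears in `shifted`, with its designated time and the distance it must be moved.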
Further, to produce a sound, the sound is triggered at the designated time 107, which differs from the original time, so that the sound is played at a faster or slower speed depending on the original position of the event 105 relative to the designated time 107. Not only is the manual alignment process inconvenient, but merely moving the audio event 105 to the designated time 107 on the grid 102 may also cause undesirable side effects in playback of the recorded audio, for example, pitch variations, clicks, and pops.
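One way to see why a change in playback speed leads to a pitch variation is the following sketch. It rests on an assumption that is not stated in the text: that the audio belonging to the moved event is simply played faster or slower so that it still ends at the same time as before. Under that assumption the speed ratio is the original length divided by the new length, and the pitch shifts by 12 times the base-2 logarithm of that ratio, in semitones. The function names and the millisecond values are hypothetical.

```python
import math

def speed_ratio(original_start_ms, designated_start_ms, end_ms):
    """Playback speed needed for audio moved to designated_start_ms to still end at end_ms."""
    original_len = end_ms - original_start_ms
    new_len = end_ms - designated_start_ms
    if original_len <= 0 or new_len <= 0:
        raise ValueError("the event must start before end_ms")
    return original_len / new_len  # > 1 means faster playback, < 1 means slower

def pitch_shift_semitones(ratio):
    """Pitch change caused by plain speed change (resampling without pitch correction)."""
    return 12 * math.log2(ratio)

# Hypothetical case: an event played at 460 ms is moved to the 500 ms grid line,
# and its audio must still end at 1000 ms.
ratio = speed_ratio(original_start_ms=460, designated_start_ms=500, end_ms=1000)
shift = pitch_shift_semitones(ratio)
```

In this hypothetical case the audio is compressed from 540 ms into 500 ms, so it plays faster and its pitch rises; an event that was originally late would instead be stretched and its pitch would fall.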