Musical Instrument Digital Interface (MIDI) is widely used as an interface for controlling the performance of instruments: control data is supplied to a sound source that stores instrumental tones, and the sound source generates the performed tones of the instruments. At present, MIDI is regarded as the standard interface for externally controlling electronic instruments.
A MIDI signal represents a digitally encoded performance parameter of a corresponding electronic instrument or the like, and a performance can be corrected by correcting a code even after encoding. Recording, editing, and reproduction of MIDI signals are carried out with a sequencer or sequencer software, and MIDI signals are handled in the form of a MIDI file.
The standard MIDI file (SMF) is known as a unified standard for maintaining compatibility between different sequencers or different kinds of sequencer software. An SMF is composed of data units called “chunks”, of which there are two kinds: a header chunk and a track chunk. The header chunk is placed at the top of the SMF and describes basic information concerning the data in the file. The track chunk is composed of time information (Delta-Time) and events. An event represents an action or the like that changes any of the items of the data file. Events of MIDI file data in the SMF format are roughly classified into three types: MIDI events, SysEx events (system exclusive events), and Meta events.
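The chunk layout described above can be illustrated with a minimal sketch. It assumes the commonly documented SMF byte layout, which the text above does not spell out: each chunk starts with a 4-byte ASCII type ("MThd" for the header chunk, "MTrk" for a track chunk) and a 4-byte big-endian length, and the header payload holds three 16-bit fields (format, number of tracks, division). The names `iter_chunks`, `parse_header`, and `SAMPLE` are hypothetical and chosen only for illustration.

```python
import struct


def iter_chunks(data: bytes):
    """Yield (chunk_type, payload) pairs from raw SMF bytes."""
    pos = 0
    while pos + 8 <= len(data):
        chunk_type = data[pos:pos + 4]
        # Assumed layout: 4-byte big-endian payload length after the type tag.
        (length,) = struct.unpack(">I", data[pos + 4:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        if len(payload) < length:
            raise ValueError("truncated chunk")
        yield chunk_type, payload
        pos += 8 + length


def parse_header(payload: bytes) -> dict:
    """Decode the three 16-bit fields assumed to form the header payload."""
    fmt, ntrks, division = struct.unpack(">HHH", payload[:6])
    return {"format": fmt, "tracks": ntrks, "division": division}


# Hand-built example: a header chunk followed by one track chunk whose
# only content is delta-time 0 and an End of Track Meta event (FF 2F 00).
SAMPLE = (
    b"MThd" + struct.pack(">IHHH", 6, 0, 1, 96)
    + b"MTrk" + struct.pack(">I", 4) + bytes([0x00, 0xFF, 0x2F, 0x00])
)

chunks = list(iter_chunks(SAMPLE))
header = parse_header(chunks[0][1])
```

Here `chunks` holds the header chunk first and the track chunk second, matching the requirement that the header chunk sit at the top of the file.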
MIDI events directly express performance data. SysEx events mainly express system exclusive messages for MIDI. System exclusive messages are used to exchange information peculiar to a specific instrument and to transmit special non-musical information, event information, and the like. Meta events express additional information, such as information indicating tempo, time, and the like concerning the entirety of a performance, information used by sequencer software such as the words of a song, or copyright information. Every Meta event begins with 0xFF, followed by a byte representing the event type, then the data length, and finally the data itself. A MIDI performance program is designed to ignore any Meta event that the program itself cannot recognize.
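The Meta event layout and the rule of ignoring unrecognized events can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the length is assumed to be encoded as an SMF variable-length quantity (7 data bits per byte, high bit set on all but the last byte), and the type codes 0x51 (tempo), 0x02 (copyright), and 0x05 (lyric) are taken from the commonly documented SMF assignments rather than from the text above. The names `read_vlq`, `read_meta_event`, `HANDLERS`, and `handle_meta` are hypothetical.

```python
def read_vlq(buf: bytes, pos: int):
    """Read a variable-length quantity; return (value, next_position)."""
    value = 0
    while True:
        byte = buf[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # high bit clear marks the last byte
            return value, pos


def read_meta_event(buf: bytes, pos: int):
    """Parse 0xFF, type byte, length, data; return (type, data, next_position)."""
    if buf[pos] != 0xFF:
        raise ValueError("not a Meta event")
    meta_type = buf[pos + 1]
    length, pos = read_vlq(buf, pos + 2)
    return meta_type, buf[pos:pos + length], pos + length


# Only the types this sketch recognizes; codes are assumed SMF assignments.
HANDLERS = {
    0x51: lambda d: ("tempo_us_per_quarter", int.from_bytes(d, "big")),
    0x02: lambda d: ("copyright", d.decode("latin-1")),
    0x05: lambda d: ("lyric", d.decode("latin-1")),
}


def handle_meta(buf: bytes, pos: int):
    """Dispatch a Meta event; unrecognized types are skipped, not rejected."""
    meta_type, data, pos = read_meta_event(buf, pos)
    handler = HANDLERS.get(meta_type)
    return (handler(data) if handler else None), pos


# A tempo event followed by an event type this sketch does not handle.
buf = bytes([0xFF, 0x51, 0x03, 0x07, 0xA1, 0x20, 0xFF, 0x7F, 0x02, 0xAA, 0xBB])
first, nxt = handle_meta(buf, 0)
second, end = handle_meta(buf, nxt)
```

Because the length always follows the type byte, the reader can step over an unknown event without understanding its contents, which is what makes the ignore rule workable.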
Timing information is attached to each event to indicate when the event is to be executed. The timing information is expressed as the time difference from the execution of the immediately preceding event. For example, if the timing information is “0”, the event carrying this information is executed at the same time as the previous event.
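The relative timing scheme can be made concrete with a short sketch that converts per-event time differences into absolute times by running summation. The time unit (called ticks here), the event labels, and the function name `absolute_ticks` are assumptions made for illustration only.

```python
from itertools import accumulate


def absolute_ticks(events):
    """Convert (delta, label) pairs into (absolute_time, label) pairs."""
    # Each absolute time is the sum of all deltas up to and including the event.
    times = accumulate(delta for delta, _ in events)
    return [(t, label) for t, (_, label) in zip(times, events)]


# Hypothetical track: two notes start together, stop together, then the track ends.
track = [
    (0, "note_on C4"),
    (0, "note_on E4"),    # delta 0: same time as the previous event
    (96, "note_off C4"),
    (0, "note_off E4"),   # delta 0 again: simultaneous with the event before it
    (48, "end"),
]

timeline = absolute_ticks(track)
```

In `timeline`, both note-on events fall at time 0 and both note-off events at time 96, showing how a difference of “0” places an event at the same instant as its predecessor.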
In general, music reproduction using the MIDI standard adopts a system in which various signals and tones peculiar to instruments are modeled, and a sound source which stores the data of the modeling is controlled by various parameters. Therefore, it is difficult to express those sounds that are difficult to model or that have not yet been studied sufficiently, such as human voices and natural sounds.
Consequently, reproduction of music according to the MIDI standard is limited to the performance of musical instruments and the like, and cannot cover singing voices and the like.
Hence, demands have appeared for a synchronous reproduction technique for synchronously reproducing audio signals such as human voices which are not performed tones, and performed tones based on MIDI signals together.
Some systems, such as certain sequencer software, do synchronize performed tones based on MIDI signals with audio signals such as vocals. However, such synchronous reproduction has been so complicated and so poorly extensible that it can be achieved only with that particular sequencer software.
Not only the above-mentioned audio signals which are not performed tones, but also image signals and text signals, may be considered as objects to be synchronized with MIDI signals. Expansion to unified media has thus been expected. In addition, data transmission via networks has been carried out frequently in recent years, and the unified media may naturally be subjected to such data transmission. Therefore, networked systems require a technique that enables synchronous reproduction as described above, easy operation, and high extensibility. Also desirable is a technique capable of easily correcting data, as in the case of MIDI data.