1. Field of the Invention
The present invention relates to a method and system for synchronizing multiple signals received through different transmission mediums.
2. Description of the Prior Art
Synchronization systems are known in the prior art. The following twelve (12) patents and published patent applications are the closest prior art known to the inventor which are relevant to the present invention.
1. U.S. Pat. No. 6,067,566 issued to William A. Moline on May 23, 2000 for “Methods And Apparatus For Distributing Live Performances On Midi Devices Via A Non-Real-Time Network Protocol” (hereafter the “Moline Patent”);
2. U.S. Pat. No. 6,462,264 issued to Carl Elam on Oct. 8, 2002 for “Method And Apparatus For Audio Broadcast Of Enhanced Musical Instrument Digital Interface (Midi) Data Formats For Control Of A Sound Generation To Create Music, Lyrics And Speech” (hereafter the “Elam Patent”);
3. U.S. Pat. No. 6,710,815 issued to James A. Billmaier et al. on Mar. 23, 2004 for “Synchronizing Multiple Signals Received Through Different Transmission Mediums” (hereafter the “Billmaier Patent”);
4. U.S. Pat. No. 6,801,944 issued to Satour Motoyama et al. on Oct. 5, 2004 for “User Dependent Control Of The Transmission Of Image And Sound Data In A Client-Server System” (hereafter the “Motoyama Patent”);
5. U.S. Pat. No. 6,891,822 issued to Ralugopal R. Gubbi et al. on May 10, 2005 for “Method And Apparatus For Transferring Isochronous Data Within A Wireless Computer Network” (hereafter the “Gubbi Patent”);
6. U.S. Pat. No. 6,953,887 issued to Yoichi Nagashima et al. on Oct. 11, 2005 for “Session Apparatus, Control, Method Therefor, And Program For Implementing The Control Method” (hereafter the “Nagashima Patent”);
7. United States Published Patent Application No. 2006/0002681 issued to Michael Spilo et al. on Jan. 5, 2006 for “Method And System For Synchronization Of Digital Media Playback” (hereafter the “Spilo Published Patent Application”);
8. United States Published Patent Application No. 2006/0007943 issued to Ronald D. Fellman on Jan. 12, 2006 for “Method And System For Providing Site Independent Real-Time Multimedia Transport Over Packet-Switched Networks” (hereafter the “Fellman Published Patent Application”);
9. U.S. Pat. No. 7,050,462 issued to Shigeo Tsunoda et al. on May 23, 2006 for “Real Time Communication Of Musical Tone Information” (hereafter the “'462 Tsunoda Patent”);
10. United States Published Patent Application No. 2006/0123976 issued to Christopher Both et al. on Jun. 15, 2006 for “System And Method For Video Assisted Music Instrument Collaboration Over Distance” (hereafter the “Both Published Patent Application”);
11. U.S. Pat. No. 7,072,362 issued to Shigeo Tsunoda et al. on Jul. 4, 2006 for “Real Time Communications Of Musical Tone Information” (hereafter the “'362 Tsunoda Patent”);
12. United States Published Patent Application No. 2007/0140510 issued to William G. Redmann et al. on Jun. 21, 2007 for “Method And Apparatus For Remote Real Time Collaborative Acoustic Performance And Recording Thereof” (hereafter the “Redmann Published Patent Application”).
The Moline Patent discloses a method and apparatus for distributing live performances on MIDI devices via a non-real-time network protocol. It describes techniques for distributing MIDI tracks across a network using non-real-time protocols such as TCP/IP. Included are techniques for producing MIDI tracks from MIDI streams as the MIDI streams are themselves produced and distributing the MIDI tracks across the network, techniques for dealing with the varying delays involved in distributing the tracks using non-real-time protocols, and techniques for saving the controller state of a MIDI track so that a user may begin playing the track at any point during its distribution across the network. Network services based on these techniques include distribution of continuous tracks of MIDI music for applications such as background music, distribution of live recitals via the network, and participatory music making on the network ranging from permitting the user to “play along” through network jam sessions to using the network as a distributed recording studio.
The detailed description of a preferred embodiment of the invention begins with an overview of the invention and then provides more detailed disclosure of the components of the preferred embodiment.
What is termed herein live MIDI is the distribution of a MIDI track from a server to one or more clients using a non-real-time protocol and the playing of the MIDI track by the clients as the track is being distributed. One use of live MIDI is to “broadcast” recitals given on MIDI devices as they occur. In this use, the MIDI stream produced during the recital is transformed into a MIDI track as it is being produced and the MIDI track is distributed to clients, again as it is produced, so that the clients are able to play the MIDI track as the MIDI stream is produced during the recital. The techniques used to implement live MIDI are related to techniques disclosed in the parent of the present patent application for reading a MIDI track 105 as it is received. These techniques, and related techniques for generating a MIDI track from a MIDI stream as the MIDI stream is received in a MIDI sequencer, are employed to receive the MIDI stream, produce a MIDI track from it, distribute the track using the non-real-time protocol, and play the track as it is received to produce a MIDI stream. The varying delays characteristic of transmissions employing non-real-time protocols are dealt with by waiting to begin playing the track in the client until enough of the track has been received that the time required to play the received track will be longer than the greatest delay anticipated in the transmission. Other aspects of the techniques permit a listener to begin listening to the track at points other than the beginning of the track, and permit use of the non-real-time protocol for real-time collaboration among musicians playing MIDI devices.
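The start-of-playback rule described for the Moline Patent (wait until the received portion of the track takes longer to play than the greatest anticipated transmission delay) may be illustrated by the following minimal Python sketch. The function names, the representation of the track as a list of chunk durations in seconds, and the sample values are assumptions made for illustration only and are not taken from the Moline Patent.

```python
def ready_to_play(buffered_play_seconds, max_expected_delay_seconds):
    """Return True once the buffered portion of the track would take longer
    to play than the greatest anticipated transmission delay."""
    return buffered_play_seconds > max_expected_delay_seconds


def first_playable_index(chunk_durations, max_expected_delay_seconds):
    """Return the index of the received chunk after which playback may begin,
    or None if not enough of the track has been received yet.

    chunk_durations is a hypothetical list of play times (in seconds) of the
    track chunks, in the order in which the client received them.
    """
    buffered = 0.0
    for index, duration in enumerate(chunk_durations):
        buffered += duration
        if ready_to_play(buffered, max_expected_delay_seconds):
            return index
    return None


if __name__ == "__main__":
    # Four half-second chunks and an assumed worst-case delay of 1.2 seconds.
    print(first_playable_index([0.5, 0.5, 0.5, 0.5], 1.2))
```

In this sketch playback is deferred until the third chunk has arrived, at which point 1.5 seconds of material is on hand against an assumed worst-case delay of 1.2 seconds.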
The Elam Patent is a method and apparatus for audio broadcast of enhanced musical instrument digital interface (MIDI) data formats for control of a sound generator to create music, lyrics and speech. It specifically involves a method and apparatus for the transmission and reception of broadcasted instrumental music, vocal music, and speech using digital techniques. The data is structured in a manner similar to the current standards for MIDI data.
The Billmaier Patent which issued in 2004 is for synchronizing multiple signals received through different transmission mediums. Multiple signals received through different transmission mediums are synchronized within a set top box (STB) for subsequent mixing and presentation. Specifically, “FIG. 5 is a block diagram of various logical components of a system 500 for synchronizing a primary signal 402 with a secondary signal 404. The depicted logical components may be implemented using one or more of the physical components shown in FIG. 3. Additionally, or in the alternative, various logical components may be implemented as software modules stored in the memory 306 and/or storage device 310 and executed by the CPU 312.
In the depicted embodiment, a primary signal interception component 502 intercepts a primary signal 402 as it is received from the head-end 108. The primary signal interception component 502 may utilize, for example, the network interface 302 of FIG. 3 to receive the primary signal 402 from the head-end 108. The primary signal 402 may include encoded television signals, streaming audio, streaming video, flash animation, graphics, text, or other forms of content.
Concurrently, a secondary signal interception component 508 intercepts the secondary signal 404 as it is received from the head-end 108. As with the primary signal 402, the secondary signal 404 may include encoded television signals, streaming audio, streaming video, flash animation, graphics, text, or other forms of content. In one embodiment, the signal interception components 502, 508 are logical sub-components of a single physical component or software program.
Due to the factors noted above, reception of the secondary signal 404 may be delayed by several seconds with respect to the primary signal 402. Thus, if the secondary signal 404 were simply mixed with the unsynchronized primary signal 402, the results would be undesirable because the two are not synchronized.
Accordingly, a synchronization component 512 is provided to synchronize the primary signal 402 with the secondary signal 404. As illustrated, the synchronization component 512 may include or make use of a buffering component 514 to buffer the primary signal 402 for a period of time approximately equal to the relative transmission delay between the two signals 402, 404. As explained in greater detail below, the buffering period may be preselected, user-adjustable, and/or calculated.”
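The buffering approach quoted above from the Billmaier Patent, in which the earlier-arriving primary signal is held for a period approximately equal to the relative transmission delay, may be illustrated by the following minimal Python sketch. The class name, the function name, and the modeling of time as discrete ticks are assumptions made for illustration and do not appear in the Billmaier Patent.

```python
from collections import deque


class DelayBuffer:
    """Hold items of the earlier-arriving signal for a fixed number of ticks."""

    def __init__(self, delay_ticks):
        if delay_ticks < 0:
            raise ValueError("delay_ticks must be non-negative")
        self._delay = delay_ticks
        self._queue = deque()

    def push(self, item):
        """Store the newest item and return the item received delay_ticks
        earlier, or None while the buffer is still filling."""
        self._queue.append(item)
        if len(self._queue) > self._delay:
            return self._queue.popleft()
        return None


def mix_synchronized(primary, secondary, delay_ticks):
    """Pair each delayed primary item with the secondary item that arrives
    on the same tick. Both inputs are lists indexed by arrival tick."""
    buffer = DelayBuffer(delay_ticks)
    pairs = []
    for primary_item, secondary_item in zip(primary, secondary):
        delayed = buffer.push(primary_item)
        if delayed is not None:
            pairs.append((delayed, secondary_item))
    return pairs


if __name__ == "__main__":
    # The secondary signal is assumed to arrive two ticks late.
    primary = ["p0", "p1", "p2", "p3"]
    secondary = [None, None, "s0", "s1"]
    print(mix_synchronized(primary, secondary, delay_ticks=2))
```

The sketch handles exactly two signals with a single fixed delay, which matches the scope of the quoted passage.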
Therefore, the Billmaier Patent discloses the concept of synchronizing signals, although this particular disclosure does not address more than two signals.
The Motoyama Patent discloses user dependent control of the transmission of image and sound data in a client-server system. Specifically, this patent discloses:
“Each user can select the rank in accordance with the performance of the client of the user, the degree of services to receive, an available amount of money paid to data reception, and the like. The rank is assigned to each user ID. The proxy server checks the rank from the user ID so that data matching the user rank can be supplied.
Each proxy server can detect its own load and line conditions. The main proxy server assigns each client a proxy server in accordance with the load and line conditions of each proxy server. A user can receive data from a proxy server having a light load and good line conditions so that a congested traffic of communications can be avoided and a communications delay can be reduced.
The main proxy server may detect a problem such as a failure to each proxy server in addition to the load and line conditions to change the connection of clients in accordance with the detected results. Even if some proxy server has a problem, this problem can be remedied by another proxy server.
When accessed by a client, the main proxy server 12 may assign the client any one of plurality of mirror servers 13. In this case, one of the mirror servers 13 transmits data to the client and the main proxy server 12 is not necessary to transmit data.
In the network shown in FIG. 1, the main server 7 is not always necessary. If the main server 7 is not used, the proxy server 12 or 13 becomes a server and which is not necessarily required to have a proxy function. In this case, the proxy servers 12 and 13 are not different from a general main server.”
The Gubbi Patent discloses a method and apparatus for transferring isochronous data within a wireless computer network. It discloses:
“Also shown in FIG. 3 is an audio information buffer 74, which may also be a portion of memory 62 or one or more registers of processor 60. The audio information buffer 60 has several configurable thresholds, including an acute underflow threshold 76, a low threshold 78, a normal threshold 80, a high threshold 82 and an acute overflow threshold 84. The audio information buffer 74 is used in connection with the transfer of audio information from server 12 to the client unit 26 as follows.
In general, NIC 14 receives an audio stream from the host microprocessor 16 and, using the audio compression block 36, encodes and compresses that audio stream prior to transmission to the client unit 26. In one example, ADPCM coding may be used to provide a 4:1 compression ratio. After transmission, client unit 26 may decompress and decode the audio information (e.g., using audio decompression unit 66) prior to playing out the audio stream to television 32. So, in order to ensure that these streams are synchronized, the audio information is time stamped at NIC 14 with respect to the corresponding video frame. This time stamp is meant to indicate the time at which the audio should be played out relative to the video. Then, at the client unit 26, the audio information is played out according to the time stamp so as to maintain synchronization (at least within a specified tolerance, say 3 frames).
Because, however, the host microprocessor 16 is unaware of this time stamping and synchronization scheme, a flow control mechanism must be established to ensure that sufficient audio information buffer 74, the client unit 26 can report back to the server 12 the status of available audio information. For example, ideally, the client unit 26 will want to maintain sufficient audio packets on hand to stay at or near the normal threshold 80 (which may represent the number of packets needed to ensure that proper synchronization can be achieved given the current channel conditions). As the number of audio packets deviates from this level, the client unit 26 can transmit rate control information to server 12 to cause the server to transmit more or fewer audio packets as required.”
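The threshold-based flow control quoted above from the Gubbi Patent may be illustrated by the following minimal Python sketch. The five threshold names follow the quoted passage; the numeric threshold values, the names of the rate control messages, and the exact comparison rules are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Configurable buffer levels, expressed as a number of audio packets."""
    acute_underflow: int
    low: int
    normal: int  # target level; kept for completeness, not used in the rule
    high: int
    acute_overflow: int


def rate_request(packets_buffered, thresholds):
    """Return the (hypothetical) rate control message the client would send
    to the server for the current buffer occupancy."""
    if packets_buffered <= thresholds.acute_underflow:
        return "send_much_more"
    if packets_buffered < thresholds.low:
        return "send_more"
    if packets_buffered >= thresholds.acute_overflow:
        return "send_much_fewer"
    if packets_buffered > thresholds.high:
        return "send_fewer"
    # Between the low and high thresholds the client is near the normal level.
    return "hold"


if __name__ == "__main__":
    levels = Thresholds(acute_underflow=2, low=8, normal=16, high=24,
                        acute_overflow=30)
    for occupancy in (1, 5, 16, 26, 31):
        print(occupancy, rate_request(occupancy, levels))
```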
The Nagashima Patent which is assigned to Yamaha Corporation discloses a session apparatus, control method therefor, and program for implementing the control method. Specifically, the patent provides “there is provided a session apparatus that enables the user to freely start and enjoy a music session with another session apparatus without being restricted by a time the session should be started. A session apparatus is connected to at least one other session apparatus via a communication network in order to perform a music session with the other session apparatus. Reproduction data to be reproduced simultaneously with reproduction data received from the other session apparatuses is generated and transmitted to the other session apparatus. The reproduction data received from the other session apparatus is delayed by a period of time required for the received reproduction data to be reproduced in synchronism with the generated reproduction data, for simultaneous reproduction of the delayed reproduction data and the generated reproduction data.”
The Spilo Published Patent Application discloses a method and system for synchronization of digital media playback. Specifically, synchronization is accomplished by a process which approximates the arrival time of a packet containing audio and/or video digital content across the network and instructs the playback devices as to when playback is to begin, and at what point in the streaming media content signal to begin playback. One method uses a timestamp packet on the network to synchronize all players.
The Spilo Published Patent Application references TCP/IP, which cannot be used for real-time communications. The Spilo Published Patent Application claims to provide synchronized audio, but this is not possible even using multicast (UDP). The Transmission Control Protocol over the Internet Protocol (TCP/IP) is by its very nature error-correcting, so that if a packet in the stream is lost, TCP stops the stream and requests a retransmission of the missing packet.
Spilo only mentions using time-of-day for the main timing mechanism. NTP is not used for the predictive successive approximations of the time-of-day. Due to the fluctuating nature of Internet bandwidth, this would cause greater packet loss. Spilo only uses a master time stamp and passes that time stamp to all clients. Spilo implies that all traffic goes to a centralized server. Spilo only mentions the use of MPEG encoder/decoders. No other information is available on this matter.
The Fellman Published Patent Application is for a method and system for providing site independent real-time multimedia transport over packet-switched networks. The application discloses that site independence is achieved by measuring and accounting for the jitter and delay between a transmitter and receiver based on the particular path between the transmitter and receiver, independent of site location. The transmitter inserts timestamps and sequence numbers into packets and then transmits them. A receiver uses these timestamps to recover the transmitter's clock. The receiver stores the packets in a buffer that orders them by sequence number. The packets stay in the buffer for a fixed latency to compensate for possible network jitter and/or packet reordering. The combination of timestamp packet-processing, remote clock recovery and synchronization, fixed-latency receiver buffering, and error correction mechanisms helps to preserve the quality of the received video, despite the significant network impairments generally encountered throughout the Internet and wireless networks.
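The receiver-side buffering described for the Fellman Published Patent Application (ordering packets by sequence number and holding them for a fixed latency) may be illustrated by the following minimal Python sketch. The class and method names are hypothetical, times are whole milliseconds, and the receiver clock is assumed to be already aligned with the transmitter clock; the clock recovery and error correction mechanisms are not modeled.

```python
import heapq


class JitterBuffer:
    """Reorder packets by sequence number and release each one a fixed
    latency after its transmitter timestamp."""

    def __init__(self, latency_ms):
        self._latency_ms = latency_ms
        self._heap = []  # entries are (sequence, timestamp_ms, payload)

    def push(self, sequence, timestamp_ms, payload):
        """Store a packet; arrival order does not matter."""
        heapq.heappush(self._heap, (sequence, timestamp_ms, payload))

    def pop_ready(self, now_ms):
        """Return, in sequence order, the packets whose fixed latency has
        elapsed at time now_ms."""
        ready = []
        while self._heap and now_ms >= self._heap[0][1] + self._latency_ms:
            sequence, _, payload = heapq.heappop(self._heap)
            ready.append((sequence, payload))
        return ready


if __name__ == "__main__":
    buffer = JitterBuffer(latency_ms=100)
    # Packets arrive out of order.
    buffer.push(2, 20, "b")
    buffer.push(1, 0, "a")
    buffer.push(3, 40, "c")
    print(buffer.pop_ready(50))
    print(buffer.pop_ready(125))
```

A fixed latency trades added delay for tolerance of jitter and reordering: a packet that arrives later than its timestamp plus the latency cannot be played on time.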
The '462 Tsunoda Patent discloses real time communications of musical tone information. Specifically, Column 2 of the patent beginning on Line 23 states:
“According to further aspect of the present invention, there is provided a communication system having a plurality of communications apparatuses each having receiving means and transmitting means, wherein: the receiving means of the plurality of communications apparatuses receive the same data; the transmitting means of the plurality of communications apparatuses can reduce the amount of data received by the receiving means and can transmit the reduced data; and the data reduced by one of the communications apparatuses is different from the data reduced by another of the communications apparatuses.
Since the data reduced by one and another of communications apparatuses is different, the quality of data transmitted from each communication apparatus is different. For example, the type or reduction factor of the reduced data may be made different at each communication apparatus. Therefore, a user can obtain data of a desired quality by accessing a proper communication apparatus.
According to still another aspect of the invention, there is provided a musical tone data communications method comprising the steps of: (a) transmitting MIDI data over a communications network; and (b) receiving the transmitted, the recovery data indicating a continuation of transmission of the MIDI data.”
The Both Published Patent Application was published in June 2006. It discloses a system and method for video assisted music instrument collaboration over distance. Claim 1 reads as follows:
“A system for enabling a musician at one location to play a music instrument and have the played music recreated by a music instrument at another location, comprising:
at least first and second end points, the first end point being connectable to the second end point through a data network, each end point comprising:
a music instrument capable of transmitting music data representing music played on the instrument and capable of receiving music data representing music to be played on the instrument;
a video conferencing system capable of exchanging video and audio information with the video conferencing system of another end point through the data network; and a music processing engine connected to the data network and the music instrument and having a user interface, the music processing engine being operable to receive music data from the instrument at the end point and to timestamp the receipt of the music data with a clock synchronized with end points in the system, to transmit the received music data with the timestamp to another end point in the system via the data network, to receive from the data network music data including timestamps from another end point and to buffer the received music data for a selected delay period and in the order indicated by the timestamps in the received music data and to forward the ordered music data, after the selected delay period to the music instrument connected to the end point to play the music represented by the music data.”
The '362 Tsunoda Patent was issued in July 2006 and is assigned to Yamaha Corporation. For purposes of relevance, the same information quoted in the previous Tsunoda Patent is relevant to this Tsunoda Patent.
For the Redmann Published Patent Application, UDP/IP in 28 byte packets is used. This also implies that the sampling rate is 28 k, which is unusable in studio situations. In addition, packets must be at a multiple of the sample rate in order not to cause a network saturation effect. The Encoder/Decoder will lose synchronization at that rate. Redmann uses “RTT” or Round Trip Time. This refers to an ICMP ping, which is good for getting a general time; however, each time an ICMP ping is sent it can use a completely different path to and from each client. Therefore, at best, ping will give an average with a variance that will cause an unacceptable amount of lag/lead.
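The concern stated above, that round trip times measured over varying paths yield only an average with some variance, may be illustrated by the following minimal Python sketch. The function name and the sample values are hypothetical and are not measurements from any system; the one-way estimate assumes that the outbound and return paths take equal time, which is the assumption criticized above.

```python
import statistics


def rtt_summary(rtt_samples_ms):
    """Summarize round trip time samples given in milliseconds."""
    mean_ms = statistics.mean(rtt_samples_ms)
    return {
        "mean_ms": mean_ms,
        # Difference between the slowest and fastest observed round trip.
        "spread_ms": max(rtt_samples_ms) - min(rtt_samples_ms),
        # Population standard deviation of the samples.
        "stdev_ms": statistics.pstdev(rtt_samples_ms),
        # Assumes symmetric paths: half of the mean round trip.
        "one_way_estimate_ms": mean_ms / 2,
    }


if __name__ == "__main__":
    # Four hypothetical ping results, each possibly over a different route.
    print(rtt_summary([40, 44, 60, 36]))
```

With these assumed samples the individual round trips differ by as much as 24 milliseconds, so a one-way delay derived from the mean can be off by several milliseconds for any single packet.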
In Redmann, the initiator playback device transmits to the target playback device a message containing either a prediction of the target's internal time-of-day clock at receipt of the packet or the initiator's time-of-day. NTP from a single source is used. Redmann implies that all traffic goes to a centralized server. Redmann fades out the last packet. UDP packets have a beginning but do not need an end. Redmann states that the end of one packet is extended to overlap or “scale” with the next, then using the lost packet material as the basis for the “synthesized” replacement. This would render the application useless to studios. A “predictor-corrector algorithm” is used to create a synthesized representation for late packets, and Redmann states that it does not accurately reproduce late packets.