Remote monitoring and remote detection are needed in various consumer and industrial fields. Currently, there are primarily two methods of remote monitoring. The first method uses specialized audio and video capture devices, such as a voice recognition device and a video camera, to capture audio and video signals, transmits the captured signals through a proprietary link to a monitoring device, such as a personal computer, and uses the monitoring device to further process the received signals. The second method uses a conventional mobile device (excluding mobile devices with a Web operating system) for remote monitoring. It relies on the conventional mobile device's video and audio input capability to capture audio and video signals, separately transmits the captured signals over a network to a monitoring device, such as a personal computer, and then processes the received signals on the monitoring device.
However, the above-described methods have shortcomings. Remote monitoring using specialized audio and video equipment requires the user to purchase and configure that equipment, which is resource-consuming and costly and therefore not conducive to universal remote monitoring applications. Similarly, using a conventional mobile device for remote monitoring requires developing applications that are compatible with the unique technologies of that device in order to implement the business logic of remote monitoring, and also requires special monitoring equipment (a receiver) that is compatible with the conventional mobile device. As a result, this method is applicable only to special applications developed for monitoring or detecting specific scenes. The prior art methods involve a high level of technical difficulty in product development and offer no simple way to extend the scope of application scenarios.
Therefore, there is a need for a method with a lower application threshold, particularly one that can use the existing audio and video capture capabilities of available devices, in order to avoid the high costs, high degree of technical difficulty, and other issues of the prior art.