At present, conventional video surveillance systems on the market mainly provide a simple motion detection function based on the principle of image subtraction. This method presumes that both the camera and the image background are motionless, and it subtracts the image intensities of corresponding pixels in two adjacent frames captured at a given time. If the result is greater than a user-defined threshold, it is taken as an indication of a moving object. However, one disadvantage of this method is that the final result is easily affected by camera noise or by illumination changes in the environment. In addition, when the object is moving, the moving object (generally called the foreground) causes a change in image intensity, and the background behind the moving object also causes a partial change in image intensity. As a result, the subtraction result contains both the foreground and the background, so the actual position of the foreground cannot be obtained. Further, when the object stops moving, motion detection can no longer locate it, and the system loses the object it was detecting. Moreover, since the camera must be fixed in a definite position, the viewing angle is rigidly constrained.
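The subtraction principle described above can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists of integer intensities; the function name `motion_mask` and the sample values are hypothetical and are not taken from any cited system.

```python
def motion_mask(prev_frame, curr_frame, threshold):
    """Return a binary mask: 1 where |curr - prev| > threshold, else 0.

    Frames are assumed to be equally sized lists of rows of intensities.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# A bright one-pixel object (200) moves one pixel to the right
# on a dark background (10).
prev_frame = [[10, 200, 10, 10]]
curr_frame = [[10, 10, 200, 10]]

# Both the vacated background pixel and the new object pixel are flagged.
moving_mask = motion_mask(prev_frame, curr_frame, threshold=50)

# Once the object stops, consecutive frames are identical and nothing is flagged.
stopped_mask = motion_mask(curr_frame, curr_frame, threshold=50)

print(moving_mask)
print(stopped_mask)
```

In this toy case the mask marks two pixels although the object occupies only one, and the mask becomes empty as soon as the object stops, which mirrors the two drawbacks noted above.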
In applications of the conventional surveillance system, real-time performance is an important issue. A standard camera normally captures 30 frames per second, so the tracking algorithm is required not only to track the target position accurately but also to process each captured image immediately. In other words, the time required to process one image should not exceed about 33 milliseconds (ms). Otherwise, the computational burden of the tracking algorithm becomes so heavy that the number of captured images has to be reduced, and the data available to the system becomes insufficient to resolve the foregoing problems. As a result, tracking of the moving object fails whenever the object moves too fast.
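The 33 ms figure follows directly from the frame rate (1000 ms / 30 frames). A minimal sketch of this arithmetic is given below; the function names and the 50 ms processing time are illustrative assumptions only.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given capture rate."""
    return 1000.0 / fps


def effective_fps(capture_fps, processing_ms):
    """Frames per second that can actually be processed when each frame
    takes processing_ms milliseconds, capped at the capture rate."""
    return min(capture_fps, 1000.0 / processing_ms)


budget = frame_budget_ms(30)          # about 33.3 ms per frame
slow_rate = effective_fps(30, 50.0)   # a hypothetical 50 ms tracker
fast_rate = effective_fps(30, 10.0)   # a hypothetical 10 ms tracker

print(round(budget, 1), slow_rate, fast_rate)
```

Under these assumptions, a tracker that needs 50 ms per image can handle only 20 of the 30 captured frames each second, so a fast-moving object travels farther between processed frames.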
In related prior art disclosed in R.O.C. Pat. Nos. I233061 and M290286, the conventional surveillance systems teach creating a target model either from the image intensity or from the result of applying different transformations to the image intensity, and then applying a template matching method to perform the tracking process. However, image intensity information is easily affected by camera noise and by changes of illumination under various environmental conditions, which can drastically decrease the tracking accuracy.
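The sensitivity of intensity-based template matching to illumination can be illustrated with a one-dimensional sketch. The sum of absolute differences (SAD) used here is an assumed matching score chosen for simplicity; the cited patents are not stated to use it, and all names and values are hypothetical.

```python
def sad(template, window):
    """Sum of absolute intensity differences between template and window."""
    return sum(abs(t - w) for t, w in zip(template, window))


def best_match(signal, template):
    """Slide the template over the signal; return (position, score) of the
    window with the lowest SAD score."""
    n = len(template)
    scores = [sad(template, signal[i:i + n]) for i in range(len(signal) - n + 1)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


template = [50, 200, 50]            # target model built from raw intensity
row = [10, 10, 50, 200, 50, 10]     # the target sits at position 2

match_normal = best_match(row, template)

# The same scene after a uniform illumination increase of 40 levels.
brighter = [v + 40 for v in row]
match_bright = best_match(brighter, template)

print(match_normal, match_bright)
```

In this example the best position is unchanged, but the matching error at the true position rises from 0 to 120 purely because of the brightness change, so a tracker that accepts matches only below a fixed error threshold could reject the correct target.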
In addition, another prior art disclosed in R.O.C. Pat. Publication No. 200744370 simply teaches image edge matching under the assumption that the target contour has a fixed elliptical shape, and uses fuzzy theory to control a platform so that the platform moves smoothly. The conventional surveillance system thus utilizes fuzzy theory only for the purpose of control, and it cannot effectively improve the accuracy of visually tracking the target position. Further, the probabilistic data association filter of the prior art indicates that conventional systems mostly use the trajectory of the object to predict and calculate the possible area of the target. However, any similar target falling within the possible area is included in the calculation of a weighted average, so that a candidate with a low score but close to the predicted position will significantly affect the final tracking result.
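The weighted-average effect described above can be sketched in one dimension. This is a deliberately simplified illustration, assuming Gaussian weights on the distance to the predicted position and ignoring the clutter and missed-detection terms of a full probabilistic data association filter; the function name and all numbers are hypothetical.

```python
import math


def pda_estimate(predicted, measurements, sigma):
    """Weighted average of candidate positions, where each weight decays
    with the squared distance from the predicted position (Gaussian)."""
    weights = [
        math.exp(-0.5 * ((z - predicted) / sigma) ** 2) for z in measurements
    ]
    total = sum(weights)
    return sum(w * z for w, z in zip(weights, measurements)) / total


predicted = 100.0      # position predicted from the object's trajectory
true_target = 110.0    # where the real target was actually observed
distractor = 101.0     # a similar-looking candidate near the prediction

alone = pda_estimate(predicted, [true_target], sigma=5.0)
mixed = pda_estimate(predicted, [true_target, distractor], sigma=5.0)

print(alone, round(mixed, 2))
```

With only the true target in the possible area, the estimate stays at the target position. Once the distractor is included, the estimate is pulled to roughly 102.1, far from the true position of 110, because the candidate nearest the prediction dominates the average.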
Therefore, there is an urgent need for an improved tracking system and method that can overcome the above-mentioned obstacles, such as complicated backgrounds, the tracking difficulty caused by changes of illumination, and the need to track a target promptly and accurately, so that if an object intrudes into an area monitored by a video security system, the maximum amount of information about the object can be recorded.